LLMs | May 23, 2026

The /goal Command: A Complete Guide to Autonomous Agents on Codex, Hermes & Claude Code

Tags: /goal, AI code, LLM, Vibe coding

Summary: In the span of six weeks in early 2026, three major AI coding platforms — OpenAI Codex, Nous Research Hermes, and Anthropic Claude Code — each shipped a /goal command. This wasn't coincidence. It was the industry converging on a shared interface for the next generation of autonomous agents. This guide covers what /goal is, how each platform implements it differently, and how to hand off work across platforms effectively.


1. What Is /goal?

1.1 From Prompt-Response to Goal-Driven Execution

Traditional AI coding assistants work in single turns: you give an instruction, the assistant responds, you send the next instruction. You act as the supervisor, approving every step and manually continuing the loop.

/goal ends that pattern. It introduces the concept of a Persistent Goal: you define a completion condition, and the agent autonomously works across multiple turns until that condition is met — without you intervening at every step.

Core idea: Turn "keep going" into a contract. You give the agent an outcome, a definition of done, and a way to verify progress. The agent keeps working until it reaches that outcome, runs out of budget, pauses, or hits a genuine blocker it can't resolve alone.

1.2 The Architecture: Evaluator Loop

The key technical innovation behind /goal is the separation of executor and judge:

  • Worker model: Does the actual coding, testing, refactoring
  • Evaluator model (Judge): After each turn, a separate lightweight model asks: "Has the goal been met?"
    • ✅ Yes → return control to the user
    • ❌ No → automatically start the next turn

This separation matters because, as one industry observer put it: "The model doing the work is the worst judge of whether it's done." Splitting the two roles is what makes a reliable autonomous loop possible.

    User sets /goal
           │
           ▼
    ┌─────────────┐
    │   Worker    │ ← writes code, runs tests, fixes bugs
    │   Model     │
    └──────┬──────┘
           │ turn complete
           ▼
    ┌─────────────┐
    │  Evaluator  │ ← checks completion condition
    │   (Judge)   │
    └──────┬──────┘
           │
       ┌───┴───┐
      Yes      No
       │       │
       ▼       ▼
     Done      Next turn
    (return    (continue
    to user)   automatically)
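The worker/judge split can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any platform's actual implementation: `run_goal`, `GoalResult`, `toy_worker` and `toy_judge` are hypothetical names, and the worker and judge are plain callables standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoalResult:
    status: str  # "complete" or "budget_exhausted"
    turns: int


def run_goal(
    worker: Callable[[str], str],
    judge: Callable[[str, str], bool],
    goal: str,
    max_turns: int = 20,  # assumed default, borrowed from the Hermes turn budget
) -> GoalResult:
    """Alternate worker turns and judge checks until done or out of budget."""
    for turn in range(1, max_turns + 1):
        transcript = worker(goal)      # the worker does one turn of work
        if judge(goal, transcript):    # a separate judge decides whether it is done
            return GoalResult("complete", turn)
    return GoalResult("budget_exhausted", max_turns)


# Toy stand-ins: the "worker" fixes one failing test per turn.
failing = ["test_a", "test_b", "test_c"]


def toy_worker(goal: str) -> str:
    if failing:
        failing.pop()
    return f"{len(failing)} failing"


def toy_judge(goal: str, transcript: str) -> bool:
    return transcript == "0 failing"


result = run_goal(toy_worker, toy_judge, "all tests pass")
print(result)  # GoalResult(status='complete', turns=3)
```

The point of the design is that the worker never decides when to stop: the loop ends only when the judge says so or the budget runs out.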

1.3 Best-Fit Task Types

Good fit:

  • Multi-step engineering tasks: module migrations, test suite repair, full-directory refactors
  • Tasks with measurable end states: "all tests pass", "build exit code is 0", "file count reaches N"
  • Long-running background tasks: database optimization, performance tuning, documentation generation

Poor fit:

  • Highly subjective tasks: design decisions, tasks requiring aesthetic judgment or human taste
  • High-risk irreversible operations: these warrant human confirmation checkpoints

2. Platform Implementations

2.1 OpenAI Codex — The Execution Engine


OpenAI Codex CLI. Source: @hqmank

Released: April 2026 (experimental), later promoted to stable

Core positioning: Codex is an implementation-focused coding agent — give it a clear spec, it builds. /goal is the mechanism for delivering that spec in a durable, persistent way.

How It Works

Under the hood, Codex maintains a thread_goals database table that tracks each goal's status, token budget, and elapsed time. Goals have a formal lifecycle: active → paused / budget_limited → complete.

A deliberate design asymmetry: the model can start a goal and declare it complete, but pausing, resuming, and budget management are controlled by the user or the system runtime. The tool spec explicitly states: "Create a goal only when explicitly requested by the user; do not infer goals from ordinary tasks."
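One way to picture this asymmetry is a small state machine in which each transition is tied to the actor allowed to make it. The sketch below is illustrative only: the status names come from the lifecycle described above, but the `ALLOWED` table, the actor labels and the `transition` helper are assumptions, not Codex's actual schema or code.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    BUDGET_LIMITED = "budget_limited"
    COMPLETE = "complete"


# (actor, from, to) transitions permitted in this sketch (assumed, not from Codex).
ALLOWED = {
    ("model", Status.ACTIVE, Status.COMPLETE),          # model may declare completion
    ("user", Status.ACTIVE, Status.PAUSED),             # only the user pauses
    ("user", Status.PAUSED, Status.ACTIVE),             # only the user resumes
    ("runtime", Status.ACTIVE, Status.BUDGET_LIMITED),  # runtime enforces the budget
}


def transition(actor: str, current: Status, target: Status) -> Status:
    """Return the new status, or raise if this actor may not make the move."""
    if (actor, current, target) not in ALLOWED:
        raise PermissionError(
            f"{actor} may not move {current.value} -> {target.value}"
        )
    return target


state = transition("user", Status.ACTIVE, Status.PAUSED)

try:
    transition("model", Status.PAUSED, Status.ACTIVE)
    model_can_resume = True
except PermissionError:
    model_can_resume = False

print(state.value, model_can_resume)  # paused False
```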

Usage

Shell

    # Launch interactive Codex session
    codex

    # Set a goal
    /goal Optimize database queries in db.ts
    Constraints:
    - Keep schema unchanged
    - Cover all execution paths with tests
    - Target execution time below 50ms

    # Subcommands
    /goal pause    # pause the active goal
    /goal resume   # resume a paused goal
    /goal clear    # clear the current goal

Codex-Specific Strengths

  • Cross-session persistence: goal state stored in a database; closing the terminal doesn't lose progress
  • Multi-environment support: per-turn environment switching (dev / staging / remote)
  • AWS Bedrock integration: native SigV4 signing for AWS-native teams
  • Remote Computer Use: continues working even when your Mac screen locks; pairs with Codex Mobile for remote monitoring
  • External agent import: migrate sessions from other agent harnesses into Codex

Sources: OpenAI Codex Changelog: https://developers.openai.com/codex/changelog | Kingy AI analysis: https://kingy.ai/ai/openai-codex-goal-the-new-long-horizon-mode-for-agentic-coding/ | GitHub implementation gist: https://gist.github.com/patleeman/b1b5768393f9bf2f60865b1defeeb819


2.2 Nous Research Hermes — The Multi-Agent Orchestrator


Hermes Agent orchestrating tasks across a Kanban board. Source: The End of the “Human Heartbeat”: How the /goal Command is Redefining AI Agents

Released: v0.13.0 — May 7, 2026 (Tenacity Release)

Core positioning: Hermes is not a coding worker — it's a multi-agent orchestrator. It doesn't write code itself; it coordinates Codex, Claude Code, and other tools to write code, managing every handoff in between.

How It Works: The Ralph Loop

Hermes calls its /goal implementation its take on the "Ralph Loop" — a stateful autonomous loop with a Judge Model, a configurable turn budget (default: 20 turns), and cross-session persistence:

  1. User sends a goal via any platform (Telegram, Discord, Slack, CLI…)
  2. Hermes creates task cards on an internal Kanban board
  3. It selects the right tool for each card (Codex to build, Claude Code to review…)
  4. A Judge Model checks after each turn whether the goal is complete
  5. If not done, auto-continues; if done, sends a summary report to the user
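The steps above can be sketched as a toy board that routes each card to a tool and stops when everything is done or the turn budget runs out. This is a rough illustration, assuming a simple one-card-per-turn loop: `Card`, `ROUTES` and `run_board` are hypothetical names, and marking a card done stands in for actually invoking Codex or Claude Code.

```python
from dataclasses import dataclass


@dataclass
class Card:
    title: str
    kind: str  # "build" or "review" in this sketch
    done: bool = False


# Hypothetical routing table: which tool handles which kind of card.
ROUTES = {"build": "codex", "review": "claude-code"}


def run_board(cards: list[Card], max_turns: int = 20) -> list[str]:
    """Work through cards one per turn and return a log of handoffs."""
    log: list[str] = []
    turns = 0
    while any(not c.done for c in cards):      # judge step: is anything left?
        if turns == max_turns:
            log.append("paused: turn budget exhausted")
            return log
        card = next(c for c in cards if not c.done)
        log.append(f"{ROUTES[card.kind]} <- {card.title}")
        card.done = True                       # stand-in for the real tool run
        turns += 1
    log.append("summary: all cards done")
    return log


board = [
    Card("implement OAuth2 login", "build"),
    Card("security review", "review"),
]
log = run_board(board)
for line in log:
    print(line)
```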

Usage

Shell

    # Works from Telegram, Discord, Slack, Matrix, Signal, CLI (same syntax everywhere)
    /goal Fix all failing tests in this repo
    Requirement: run test commands, identify failures, patch changes one at a time, until all tests pass

    # Subcommands
    /goal status   # view active goal state
    /goal pause    # pause execution
    /goal resume   # resume
    /goal clear    # clear the goal

Hermes-Specific Strengths

  • Cross-platform messaging interface: Telegram, Discord, Slack, Matrix, Signal — no terminal needed
  • Built-in Judge Model: independent evaluator with a configurable turn budget
  • Kanban board integration: goals auto-decomposed into tasks; supports multi-agent parallel execution
  • Skills system: installed skills exposed as dynamic slash commands, including /plan (open planning mode)
  • Permission tiers: admin vs. regular user command access control per platform group
  • Codex CLI runtime integration (v0.13.0 beta): Hermes can hand shell execution and file patches directly to the Codex CLI, keeping its own memory, sessions DB, and /goal intact

Sources: Hermes Slash Commands Reference: https://hermes-agent.nousresearch.com/docs/reference/slash-commands | Geeky Gadgets guide: https://www.geeky-gadgets.com/automate-tasks-hermes-ai/ | AlphaSignal analysis: https://alphasignalai.substack.com/p/hermes-just-made-codex-the-engine


2.3 Anthropic Claude Code — Verification-Driven Agent


Claude Code /goal live status overlay showing elapsed time, turns, and tokens. Source: joe.njenga

Released: May 12, 2026 (Claude Code v2.1.139)

Core positioning: Claude Code excels at finding what's wrong with code that looks right — spec compliance violations, security holes, error states, edge cases. /goal lets you point it at a codebase and ask it to keep working until it's genuinely clean.

How It Works

Claude Code's /goal uses the Hooks system to implement the evaluator loop:

  • After each turn, a lightweight, fast evaluator model checks whether the completion condition is satisfied
  • A live overlay panel shows elapsed time, turn count, and token consumption in real time
  • Available in three modes: interactive, headless (-p flag), and Remote Control

Usage

Shell

    # Requires Claude Code v2.1.139 or later
    # Workspace trust dialog must be accepted
    /goal All tests pass and CI pipeline is green
    # Claude keeps working autonomously until the condition is met

    # Pair with auto mode for fully unattended runs:
    claude --auto
    # auto mode approves tool calls within a turn
    # /goal starts the next turn automatically

    # Headless / scheduled execution:
    claude -p "<goal description>"

Claude Code-Specific Strengths

  • Real-time overlay panel: live elapsed/turns/tokens — highest transparency of the three platforms
  • Hooks system integration: evaluator is deeply wired into the existing hooks architecture; highly customizable
  • Three run modes: interactive, -p (headless), and Remote Control for CI/CD or remote servers
  • Agent View (Research Preview): a single list of every Claude Code session — running, blocked, or done
  • MCP integration: natively extensible via Model Context Protocol

Sources: Claude Code v2.1.139 release notes: https://releasebot.io/updates/anthropic/claude-code | MindStudio guide: https://www.mindstudio.ai/blog/claude-code-goal-command-autonomous-tasks | Field guide with examples: https://medium.com/@jason.croucher/claude-code-goal-a-field-guide-with-games-f6f3b617ce5b


3. Platform Comparison at a Glance

| Dimension | Codex /goal | Hermes /goal | Claude Code /goal |
|---|---|---|---|
| Role | Coding execution engine | Multi-agent orchestrator | Verification-driven code agent |
| Released | April 2026 | May 7, 2026 (v0.13.0) | May 12, 2026 (v2.1.139) |
| Evaluator | State machine + model self-eval | Independent Judge Model | Independent lightweight model (Hooks) |
| State persistence | ✅ Database, cross-session | ✅ Persistent, cross-session | ⚠️ In-session; restart needed after close |
| Turn budget | Token budget (configurable) | Default 20 turns (configurable) | No hard cap (set in condition or use Ctrl+C) |
| Interface | CLI / IDE / Desktop App | CLI + messaging platforms (Telegram etc.) | Interactive / -p / Remote Control |
| Best at | Long implementation runs, env switching | Multi-agent coordination, cross-tool handoffs | Code review, test repair, CI integration |
| Key integrations | AWS Bedrock, external agent import | Codex CLI, Claude Code, Kanban board | MCP, CI/CD pipelines, Hooks system |

4. Handoff Recommendations Across Platforms

4.1 The Core Principle: Hermes Directs, Codex Builds, Claude Code Verifies

All three platforms share the same command format — intentionally. This makes them composable. The most powerful workflow chains them:

    You → Hermes (/goal + high-level objective)
            │
            ├──→ Codex (implement the feature, write code)
            │
            ├──→ Claude Code (audit for bugs, verify tests)
            │
            └──→ Hermes (verify + send summary report) → You

You never open a terminal. One message in. One summary out.

4.2 Scenario-Based Handoff Guide

Scenario A: New Feature Development

Recommended: Hermes /goal → Codex builds → Claude Code reviews

Shell

    # Send to Hermes (via Telegram / CLI)
    /goal Add OAuth2 login to the user module
    Constraints:
    - Use existing User database schema
    - Include unit tests and integration tests
    - Done when all tests pass

Hermes auto-decomposes this into Kanban tasks, assigns implementation to Codex, then routes the result to Claude Code for security review.

Scenario B: Fixing a Broken CI Pipeline

Recommended: Claude Code /goal directly

Shell

    /goal All CI tests pass
    # Claude Code will keep diagnosing and patching
    # failures one by one until the suite is clean

Claude Code's strength is finding code that looks right but isn't — ideal for tracking down flaky tests and subtle regressions.

Scenario C: Overnight Long-Running Tasks

Recommended: Codex Goal Mode (macOS desktop app)

Codex supports Remote Computer Use: it keeps running even when your Mac screen locks. Pair with Codex Mobile for remote status monitoring.

Shell

    # Set the goal, then safely lock your screen
    /goal Migrate entire /src/legacy directory to the new module architecture
    Constraints:
    - All public API interfaces remain unchanged
    - Each module must have corresponding tests
    - Done when all original tests still pass

Scenario D: Recurring Maintenance (Weekly Tasks)

Recommended: Hermes (self-hosted + messaging platform trigger)

Hermes is built for recurring engineering workflows that span multiple coding sessions — e.g., a weekly goal to "sync dependency versions" or "clear stale TODOs". Trigger via Telegram message, get a summary in the same chat.

4.3 Universal Principles for Writing Good /goal Prompts

Regardless of platform, a strong /goal has these traits:

1. Completion condition must be verifiable

  • "All tests pass" (measurable)
  • "Build exit code is 0" (measurable)
  • "The code looks cleaner" (not verifiable)
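A condition is verifiable when a program can check it. As a minimal sketch, assuming the project's test or build command signals success with exit code 0, the check reduces to running that command and looking at its return code. `condition_met` is a hypothetical helper, and the two commands below are stand-ins for a real test or build command.

```python
import subprocess
import sys


def condition_met(command: list[str]) -> bool:
    """Return True when the verification command exits with code 0."""
    completed = subprocess.run(command, capture_output=True)
    return completed.returncode == 0


# Stand-ins for a real test or build command: one exits 0, the other exits 1.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(condition_met(passing))  # True
print(condition_met(failing))  # False
```

"The code looks cleaner" has no equivalent command, which is why it cannot serve as a stopping condition.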

2. State constraints explicitly, not just goals

Tell the agent what must not change:

  • Keep schema unchanged
  • Keep Lighthouse score above 90
  • Keep existing UI intact
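If you write goals often, it can help to build them from a small template so the constraints and the completion condition are never forgotten. The sketch below is a hypothetical helper, not part of any platform: `Goal` and `render` are illustrative names, and the output simply mirrors the prompt layout used in the examples in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    objective: str
    done_when: str
    constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the goal as a /goal prompt with explicit constraints."""
        lines = [f"/goal {self.objective}", "Constraints:"]
        lines += [f"- {c}" for c in self.constraints]
        lines.append(f"- Done when {self.done_when}")
        return "\n".join(lines)


goal = Goal(
    objective="Add OAuth2 login to the user module",
    done_when="all tests pass",
    constraints=[
        "Use existing User database schema",
        "Include unit tests and integration tests",
    ],
)
text = goal.render()
print(text)
```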

3. Set a budget for long runs

  • Hermes: configure turn_budget
  • Codex: configure token_budget
  • Claude Code: write a turn limit into the condition itself, or be ready to Ctrl+C
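Conceptually, a budget is a guard that stops the loop at whichever limit is hit first. The following sketch assumes a combined turn and token budget; the `Budget` class and its field names are hypothetical and do not correspond to the platforms' actual configuration keys.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_turns: int
    max_tokens: int
    turns: int = 0
    tokens: int = 0

    def charge(self, tokens_used: int) -> None:
        """Record one finished turn and the tokens it consumed."""
        self.turns += 1
        self.tokens += tokens_used

    def exhausted(self) -> bool:
        """True once either the turn limit or the token limit is reached."""
        return self.turns >= self.max_turns or self.tokens >= self.max_tokens


budget = Budget(max_turns=20, max_tokens=10_000)
while not budget.exhausted():
    budget.charge(tokens_used=3_000)  # pretend each turn costs 3,000 tokens

print(budget.turns, budget.tokens)  # 4 12000
```

Here the token limit trips after four turns, long before the 20-turn cap, which is the behaviour you want from a long unattended run.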

4. Always run inside a Git repo

Make sure the project is a Git repository with a clean commit before starting (run git init and commit first if it is not). Agents can change many files quickly. Being able to run git diff or git checkout is your most important safety net.

5. Never leave an open-ended goal running unattended overnight

Even with a budget, maintain monitoring habits. "Optimize the entire codebase" is not a goal — it has no stopping condition.


5. Industry Significance

Three teams at three different companies converged on the same primitive within six weeks. That convergence is the signal.

As VentureBeat reported, separating the builder from the judge is "sound design — fundamentally, you can't trust a model to judge its own homework." The appearance of independent evaluators in agent loops marks a meaningful shift toward auditable, observable agentic systems.

The deeper pattern, noted by multiple analysts: all three accept the same command format, which means they compose. For the first time, a single message to Hermes can trigger Codex to build, Claude Code to review, and Hermes to verify — with no terminal interaction from the developer. /goal is the shared protocol that makes multi-agent pipelines practical.

The role of the developer is shifting from step-by-step supervisor to outcome definer — and /goal is the interface that makes that shift concrete.


References

1. OpenAI Codex Changelog: https://developers.openai.com/codex/changelog
2. Deep Dive: Master the New /goal Command in OpenAI Codex (Medium / proflead): https://medium.com/@proflead/deep-dive-master-the-new-goal-command-in-openai-codex-65428c307e85
3. How to Use OpenAI Codex's /goal Command (MindStudio): https://www.mindstudio.ai/blog/openai-codex-goal-command-autonomous-tasks
4. How OpenAI Codex Implements /goal, GitHub Gist (patleeman): https://gist.github.com/patleeman/b1b5768393f9bf2f60865b1defeeb819
5. OpenAI Codex /goal: Long-Horizon Mode for Agentic Coding (Kingy AI): https://kingy.ai/ai/openai-codex-goal-the-new-long-horizon-mode-for-agentic-coding/
6. Codex /goal vs Claude Code Agents: 2026 Comparison (devtoolpicks.com): https://devtoolpicks.com/blog/codex-goal-command-vs-claude-code-agents-2026
7. Hermes Slash Commands Reference (Nous Research GitHub): https://github.com/NousResearch/hermes-agent/blob/main/website/docs/reference/slash-commands.md
8. Hermes Slash Commands Reference (Official Docs): https://hermes-agent.nousresearch.com/docs/reference/slash-commands
9. Hermes Agent /goal Feature Guide (Geeky Gadgets): https://www.geeky-gadgets.com/automate-tasks-hermes-ai/
10. Hermes Agent v0.13 Reference, Tenacity Release (blakecrosley.com): https://blakecrosley.com/guides/hermes
11. Hermes /goal: Only Works If You Define "Done" Properly (JQ AI Systems): https://www.ai.joaoqueiros.com/blog/hermes-goal-agent-workflows
12. Hermes Just Made Codex the Engine and Itself the Shell (AlphaSignal): https://alphasignalai.substack.com/p/hermes-just-made-codex-the-engine
13. The /goal Command: Codex and Claude Code as 24/7 Autonomous Agents (APIdog): https://apidog.com/blog/goal-command-codex-claude-code-autonomous-agents/
14. How Hermes, Codex, and Claude Code Use /goal (Rahul Goyal): https://rahulgoyal.co/justdraft/goal-command-coding-agents/
15. The Ultimate Guide to /goal (The Unwind AI): https://www.theunwindai.com/p/the-ultimate-guide-to-goal
16. Claude Code 2.1.139 Adds /goal Command (explainx.ai): https://explainx.ai/blog/claude-code-goal-command-long-running-agents-2026
17. Introduction to Claude Code Goal Mode, 6 Key Points (apiyi.com): https://help.apiyi.com/en/claude-code-goal-mode-keep-working-until-done-guide-en.html
18. Claude Code Updates May 2026 (Releasebot): https://releasebot.io/updates/anthropic/claude-code
19. Claude Code /goal: A Field Guide with Games (Medium / Jason Croucher): https://medium.com/@jason.croucher/claude-code-goal-a-field-guide-with-games-f6f3b617ce5b
20. What Is the /goal Command in Claude Code? (MindStudio): https://www.mindstudio.ai/blog/claude-code-goal-command-autonomous-tasks
21. Claude Code's /goals Separates the Agent That Works from the One That Decides It's Done (VentureBeat): https://venturebeat.com/orchestration/claude-codes-goals-separates-the-agent-that-works-from-the-one-that-decides-its-done
22. Goal Mode for AI Agents: OpenClaw, Hermes, and Codex (explainx.ai): https://explainx.ai/blog/goal-mode-ai-agents-complete-guide-2026
23. /goal Is the Most Underrated AI Feature of 2026 (Medium / Coding Nexus): https://medium.com/coding-nexus/goal-is-the-most-underrated-ai-feature-of-2026-heres-how-to-use-it-right-96f265344530
24. Codex /goal vs Claude Code /goal (knightli.com): https://www.knightli.com/en/2026/05/14/codex-goal-vs-claude-code-goal/
25. Codex /goal and Claude Managed Outcomes: New Control Loops (Developers Digest): https://www.developersdigest.tech/blog/codex-goal-vs-claude-managed-outcomes-practical-differences