Codex Agent
Delegate tasks to independent Codex sessions for execution via Codex CLI.
Prerequisites
- Install Codex CLI:
npm install -g @openai/codex
- Ensure you have completed Codex login authentication (the first run of codex will guide you through login)
- It is recommended to run in the target project directory, or explicitly pass the working directory with -C
Usage Methods
Create a New Session
bash
codex exec --json --sandbox workspace-write --skip-git-repo-check --model gpt-5.4 "Your task description"
The command above suits short prompts only. Once a prompt exceeds roughly 500 characters, spans multiple lines, or contains special characters, stop passing it as a positional parameter; stdin is more reliable.
Output is in JSONL format, with one event per line. Common current events:
jsonl
{"type":"thread.started","thread_id":"019d32fc-..."}
{"type":"turn.started"}
{"type":"item.completed","item":{"id":"item_0","type":"agent_message","text":"Response content"}}
{"type":"turn.completed","usage":{"input_tokens":46879,"cached_input_tokens":2432,"output_tokens":54}}
- Extract thread_id from the thread.started event for subsequent multi-turn conversations
- Extract item.text from the item.completed event (where item.type == "agent_message") as Codex's response
- Extract usage from the turn.completed event to record token consumption
Parse only JSON lines in automation scripts. If your runtime environment merges stderr warnings into stdout, either keep only the lines that start with { before parsing, or redirect stderr elsewhere.
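The extraction rules above can be sketched in Python. This is a minimal illustration rather than an official client: the function name parse_codex_events and the shape of the returned dictionary are assumptions, and only the event types shown in the sample output are handled.

```python
import json


def parse_codex_events(lines):
    """Collect thread_id, agent messages, and usage records from JSONL lines."""
    result = {"thread_id": None, "messages": [], "usage": []}
    for raw in lines:
        line = raw.strip()
        # Warnings merged into the stream do not start with "{"; skip them.
        if not line.startswith("{"):
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # a truncated or malformed line is ignored
        kind = event.get("type")
        if kind == "thread.started":
            result["thread_id"] = event.get("thread_id")
        elif kind == "item.completed":
            item = event.get("item", {})
            if item.get("type") == "agent_message":
                result["messages"].append(item.get("text", ""))
        elif kind == "turn.completed":
            result["usage"].append(event.get("usage", {}))
    return result


# Sample lines taken from the event listing above, plus one noise line.
sample = [
    "warning: something printed to stderr",
    '{"type":"thread.started","thread_id":"019d32fc-..."}',
    '{"type":"turn.started"}',
    '{"type":"item.completed","item":{"id":"item_0","type":"agent_message","text":"Response content"}}',
    '{"type":"turn.completed","usage":{"input_tokens":46879,"cached_input_tokens":2432,"output_tokens":54}}',
]
parsed = parse_codex_events(sample)
```

Filtering on the leading { before calling json.loads keeps the parser tolerant of merged warnings without needing a separate stderr redirect.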
Use stdin Pipe for Long Prompts
When the prompt exceeds approximately 500 characters or contains multiple lines or special characters, stop using positional parameters. In practice such inputs may hang at "Reading additional input from stdin...", even if you manually close stdin. Write the prompt to a file first, then have codex exec read it from stdin by passing - as the prompt:
bash
codex exec --json --sandbox workspace-write --full-auto --model gpt-5.4 \
--skip-git-repo-check - < /tmp/task-prompt.txt > /tmp/task-out.jsonl 2>&1
Positional parameters are still suitable for short prompts, such as one-sentence Q&A, very short follow-up questions, or temporary command-line experiments.
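The choice between a positional prompt and stdin can be sketched as a small helper. This is an illustrative sketch only: build_exec_command and its short_limit parameter are hypothetical names, the 500-character threshold is the approximate figure quoted above, and the check covers only length and newlines, not every special character.

```python
def build_exec_command(prompt, model="gpt-5.4", short_limit=500):
    """Return (argv, stdin_text) for a new codex exec session.

    Short single-line prompts are passed positionally; anything longer or
    multi-line is sent via stdin by using "-" as the prompt argument.
    """
    base = [
        "codex", "exec", "--json",
        "--sandbox", "workspace-write",
        "--skip-git-repo-check",
        "--model", model,
    ]
    needs_stdin = len(prompt) > short_limit or "\n" in prompt
    if needs_stdin:
        return base + ["-"], prompt
    return base + [prompt], None
```

The returned pair can be handed to subprocess.run(argv, input=stdin_text, text=True, capture_output=True), which feeds the prompt on stdin only when stdin_text is not None.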
Resume a Session
bash
codex exec resume --json --model gpt-5.4 "thread_id" "Follow-up question"
Resume the Most Recent Session (Shortcut)
bash
codex exec resume --json --model gpt-5.4 --last "Follow-up question"
- --last only looks at the most recent session recorded in the current directory by default
- An additional flag is needed when you want to search across directories
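The two resume forms above differ only in whether a thread ID or --last is passed. A minimal sketch of that choice, assuming the argument order shown in the commands above (build_resume_command is a hypothetical helper name):

```python
def build_resume_command(prompt, thread_id=None, model="gpt-5.4"):
    """Build argv for codex exec resume, by thread ID or with --last."""
    argv = ["codex", "exec", "resume", "--json", "--model", model]
    if thread_id is not None:
        argv.append(thread_id)   # resume a specific session
    else:
        argv.append("--last")    # resume the most recent session
    argv.append(prompt)
    return argv
```

Passing an explicit thread_id is the safer default in scripts, since --last depends on which session was most recently recorded in the current directory.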
⚠️ --ephemeral: Non-recoverable Session
After adding --ephemeral, the session will not be written to disk (no session record is persisted), so it cannot be recovered by codex exec resume or --last afterwards. Use it only in the following scenarios:
- One-time quick Q&A where you are certain no follow-up will be needed
- Temporary calls in CI/scripts to avoid polluting the session list
- Sensitive tasks where you do not want local records to be left behind
Do not add --ephemeral if you may need to follow up later. It is semantically equivalent to claude -p --no-session-persistence.
codex exec Parameters
Output and Results
| Flag | Description |
|---|---|
| --json | Output in JSONL format for easy event stream parsing |
| --output-schema | Use JSON Schema to constrain the structure of the last message |
| -o, --output-last-message FILE | Write the last message directly to a file |
| | Color output mode |
Execution Environment
| Flag | Description |
|---|---|
| --sandbox | Sandbox mode, such as read-only or workspace-write |
| --full-auto | Currently a shortcut equivalent to --sandbox workspace-write |
| --dangerously-bypass-approvals-and-sandbox | Skip all confirmations and sandbox protections, extremely dangerous |
| -C | Specify the working directory |
| --skip-git-repo-check | Allow running in non-git directories |
| | Additional writable directory (can be repeated) |
Model and Configuration
| Flag | Description |
|---|---|
| -m, --model | Specify the model, explicit passing is recommended |
| | Use a named configuration profile |
| | Override configuration items |
| | Enable feature flags (can be repeated) |
| | Disable feature flags (can be repeated) |
| | Use local open-source model providers |
| --local-provider PROVIDER | Specify the local provider |
Input
| Flag / Method | Description |
|---|---|
| -i | Attach images (can be repeated) |
| PROMPT (positional) | Pass short tasks directly as command-line parameters; only recommended for short prompts |
| - or stdin | Do not pass a prompt, or set the prompt to - to read from stdin; prefer this method for long prompts, Markdown, or multi-line structures |
codex exec resume Parameters
| Flag | Description |
|---|---|
| --json | Output in JSONL format |
| -m, --model | Specify the model |
| --full-auto | Shortcut equivalent to the workspace-write sandbox |
| --dangerously-bypass-approvals-and-sandbox | Skip confirmations and sandbox protections |
| --skip-git-repo-check | Allow running in non-git directories |
| --ephemeral | Do not persist the session |
| -i | Attach images |
| -o, --output-last-message FILE | Write the last message to a file |
| | Override configuration items |
| | Enable features (can be repeated) |
| | Disable features (can be repeated) |
| --last | Resume the most recent session (no need to specify ID) |
| | Search all sessions (not limited to current directory) |
codex exec review Parameters
Built-in code review subcommand for reviewing the current repository:
bash
codex exec review [OPTIONS] [PROMPT]
| Flag | Description |
|---|---|
| --uncommitted | Review staged, unstaged, and untracked changes |
| --base | Compare against the specified base branch |
| | Review changes introduced by the specified commit |
| | Title displayed in the review summary |
| -m, --model | Specify the model |
| --json | Output in JSONL format |
| --full-auto | Shortcut equivalent to the workspace-write sandbox |
| --ephemeral | Do not persist the session |
| -o, --output-last-message FILE | Write the last message to a file |
Multi-turn Conversations
- Run codex exec for the first time to get thread_id
- Use codex exec resume --json "thread_id" "prompt" for follow-up questions
- thread_id is tracked automatically, no need for users to manage it
- Create new sessions for different tasks; multiple thread_ids do not interfere with each other
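One way to keep thread IDs separate per task is a small in-memory registry. The sketch below is an assumption about how a caller might organise this, not part of Codex CLI: the class ThreadRegistry, its method names, and the task labels are all hypothetical.

```python
class ThreadRegistry:
    """Map a caller-chosen task name to the thread_id of its Codex session."""

    def __init__(self):
        self._threads = {}

    def remember(self, task, thread_id):
        """Store the thread_id taken from the thread.started event."""
        self._threads[task] = thread_id

    def command_for(self, task, prompt, model="gpt-5.4"):
        """Return argv: a new session for unknown tasks, a resume otherwise."""
        thread_id = self._threads.get(task)
        if thread_id is None:
            return ["codex", "exec", "--json", "--model", model, prompt]
        return ["codex", "exec", "resume", "--json", "--model", model,
                thread_id, prompt]


registry = ThreadRegistry()
first = registry.command_for("todo-api", "Implement the API")
registry.remember("todo-api", "xxx")
follow_up = registry.command_for("todo-api", "Add unit tests")
other_task = registry.command_for("docs", "Summarise the README")
```

Because each task name maps to its own thread ID, a follow-up on one task never resumes the session of another.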
Model Selection
Explicitly specify --model based on task complexity:
| Task Complexity | Model | Applicable Scenarios |
|---|---|---|
| High | | Architecture design, complex refactoring, multi-file coding |
| Medium | | Single-file feature implementation, bug fixes |
| Low | | Simple Q&A, code explanation |
Recommended Parameter Combinations
| Scenario | Model | Sandbox | Other Flags |
|---|---|---|---|
| Complex Coding | | | |
| General Coding | | | |
| Read-only Q&A / Analysis | | | --skip-git-repo-check (when in non-git directories) |
| Browser Research / Computer Use | | | -C "$PWD" -o /tmp/result.txt, add --json if event streams are needed |
| Code Review | | | codex exec --json ... "Review ..." |
| Repository Review | | n/a | codex exec review --base main |
| Quick Q&A | | | --skip-git-repo-check --ephemeral (⚠️ Non-recoverable) |
| Structured Output | | | --output-schema schema.json -o result.json |
Usage Rules
- Always add --json for automated calls: Ensure output is parsable to extract thread_id and response content
- Always explicitly pass --model: Avoid default model drift
- Always run in the target project directory: Prefer cd into the project or -C
- Use a writable sandbox for coding tasks: Usually directly use --sandbox workspace-write or --full-auto
- Use --sandbox read-only or codex exec review for review tasks: Prevent accidental modifications
- Maintain conversation continuity: Reuse thread_id for follow-ups on the same task; do not add --ephemeral if follow-up may be needed
- Use --output-schema + -o for stable downstream parsing: Constrain the final result into a machine-consumable structure
- Use stdin pipe for long prompts: Do not use positional parameters when prompts exceed ~500 characters or contain multiple lines or special characters; write to a file first, then pass it via - to avoid getting stuck at "Reading additional input from stdin..."
- Report results to users: After each call, extract the final response from JSONL and summarize it briefly for users
- Distinguish between -o and --json responsibilities: -o saves the last message to a file; --json prints the entire event stream to stdout. Scripts often use both together.
- Prefer --sandbox read-only for non-coding native browser tasks: If you only want Codex to use Computer Use to open Chrome, browse web pages, and summarize content, workspace-write is not needed; add "Do not modify local files." to the prompt as a double safeguard.
Prompt References
Load corresponding references based on task types; do not load all default prompts into the main context at once:
- Coding / Diagnosis / Planning / Narrow Fixes: Read references/task-prompt-recipes.md
- Code Review / Challenging Review / Test Gap Check: Read references/review-prompt-recipes.md
- Native Browser Research / Reddit or Community Sampling / Evidence-based Summary: Read references/browser-research-prompt-recipes.md
These references provide reusable or slightly modifiable default prompt templates; prioritize copying the closest template, then remove unnecessary blocks.
Examples
Coding Task
User: Use Codex to implement a TODO API in the current project
Step 1 - Create a new session:
cd /path/to/project && codex exec --json --full-auto --model gpt-5.4 "Implement a REST API for TODO items with CRUD endpoints. Use Express.js."
→ Parse output to get thread_id: "xxx", response: "Implemented server.js ..."
User: Add unit tests
Step 2 - Resume the session:
cd /path/to/project && codex exec resume --json --model gpt-5.4-mini "xxx" "Add unit tests for all the TODO API endpoints using vitest."
Resume the Most Recent Session
bash
cd /path/to/project && codex exec resume --json --model gpt-5.4-mini --last "Continue the refactor and remove the dead helper functions."
Suitable for scenarios like "continue the previous task" where you don't want to manually save thread_id.
Code Review
bash
# General read-only review
cd /path/to/project && codex exec --json --sandbox read-only --model gpt-5.4-mini "Review the changes in git diff HEAD~1. Focus on correctness, security, and missing tests."
# Built-in review: Compare with main
cd /path/to/project && codex exec review --json --model gpt-5.4-mini --base main
# Review uncommitted changes
cd /path/to/project && codex exec review --json --model gpt-5.4-mini --uncommitted
Structured Output and Write to File
bash
cd /path/to/project && codex exec --json --sandbox read-only --model gpt-5.4-mini \
--output-schema ./review-schema.json \
-o /tmp/review-result.json \
"Review src/todo.ts and output summary, risks, and suggested tests."
Suitable for scenarios where results need to be fed to scripts, CI, or other agents.
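A schema file for the review example above might be produced and consumed as sketched below. Everything specific here is an assumption for illustration: the field names summary, risks, and suggested_tests, the use of additionalProperties, and the expectation that the -o file holds the schema-conforming JSON as plain text. The check is a minimal required-keys test, not full JSON Schema validation.

```python
import json

# Hypothetical schema matching "summary, risks, and suggested tests".
REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "risks": {"type": "array", "items": {"type": "string"}},
        "suggested_tests": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary", "risks", "suggested_tests"],
    "additionalProperties": False,
}


def write_schema(path):
    """Write the schema to the file that --output-schema will point at."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(REVIEW_SCHEMA, handle, indent=2)


def load_review_result(text):
    """Parse the last-message JSON and verify the required keys exist."""
    data = json.loads(text)
    missing = [key for key in REVIEW_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


example = load_review_result(
    '{"summary": "ok", "risks": [], "suggested_tests": ["test_create"]}'
)
```

A downstream script would call write_schema("./review-schema.json") before running the command and load_review_result on the contents of /tmp/review-result.json afterwards.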
Native Browser Research (Only Final Answer Needed)
bash
codex exec \
-m gpt-5.4 \
--sandbox read-only \
--skip-git-repo-check \
-C "$PWD" \
-o /tmp/codex-last.txt \
"Use Computer Use on my Mac. Open Google Chrome, go to Reddit, search for 'Duolingo review', open 3 representative posts (one positive, one negative, one long-term review), then summarize the findings in Chinese. Do not modify local files."
Suitable for manually viewing the final conclusion without caring about intermediate event streams.
Native Browser Research (Both Event Stream and Final Answer Persistence)
bash
codex exec \
-m gpt-5.4 \
--sandbox read-only \
--skip-git-repo-check \
-C "$PWD" \
--json \
-o /tmp/codex-last.txt \
"Use Computer Use on my Mac. Open Google Chrome, search Reddit for Duolingo reviews, open a few representative posts, and then summarize them in Chinese. Do not modify local files."
Suitable for scripts or upper-level agents: read the JSONL event stream from stdout, and read the final natural language conclusion from the file passed to -o (/tmp/codex-last.txt here).
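A caller that uses both --json and -o might combine the two sources as sketched below. The function name final_answer and the fallback policy (prefer the -o file, otherwise take the last agent_message from the event stream) are assumptions, not documented Codex behaviour.

```python
import json
import os
import tempfile
from pathlib import Path


def final_answer(stdout_text, last_message_path):
    """Prefer the -o file; fall back to the last agent_message event."""
    path = Path(last_message_path)
    if path.is_file():
        text = path.read_text(encoding="utf-8").strip()
        if text:
            return text
    last = None
    for line in stdout_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON noise
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        item = event.get("item") or {}
        if event.get("type") == "item.completed" and item.get("type") == "agent_message":
            last = item.get("text")
    return last


# Demonstration with a temporary directory standing in for /tmp.
stream = '{"type":"item.completed","item":{"id":"item_0","type":"agent_message","text":"From stream"}}'
with tempfile.TemporaryDirectory() as tmp:
    out_file = os.path.join(tmp, "codex-last.txt")
    from_stream = final_answer(stream, out_file)      # file does not exist yet
    Path(out_file).write_text("From file\n", encoding="utf-8")
    from_file = final_answer(stream, out_file)        # file now takes priority
```

Keeping the fallback means a script still gets an answer if the -o file was not written, for example when the run was interrupted.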
Image Input
bash
cd /path/to/project && codex exec --json --sandbox read-only --model gpt-5.4-mini \
-i ./screenshots/login-bug.png \
"Describe the UI issue in this screenshot and propose a minimal fix plan."
Suitable for visual regression, error screenshot diagnosis, and design draft difference analysis.
Long Task Prompt (Recommended)
bash
# 1. Write complex prompt to a file
cat > /tmp/task-prompt.txt <<'PROMPT_EOF'
Please complete the following tasks in the current repository:
1. First read the README and tests
2. Only modify files directly related to the issue
3. First add tests, then modify the implementation, and finally run verification
PROMPT_EOF
# 2. Read from stdin using - so a long positional parameter cannot get stuck
codex exec --json --sandbox workspace-write --full-auto --model gpt-5.4 \
--skip-git-repo-check - < /tmp/task-prompt.txt > /tmp/task-out.jsonl 2>&1
Suitable for market research, multi-paragraph Markdown constraints, script-assembled prompts, or any task description that exceeds one screen.
Pass Long Prompt via stdin
bash
cat ./prompt.md | codex exec --json --sandbox workspace-write --model gpt-5.4 -
Suitable for long prompts, templated prompts, or dynamically assembled instructions fed directly to Codex via pipe. For long-term reuse in shell scripts, prefer the file redirection approach (- < file) from the previous section for better readability and easier troubleshooting.