auto-claude-optimization
Auto-Claude Optimization
Performance tuning, cost reduction, and efficiency improvements.
Performance Overview
Key Metrics
| Metric | Impact | Optimization |
|---|---|---|
| API latency | Build speed | Model selection, caching |
| Token usage | Cost | Prompt efficiency, context limits |
| Memory queries | Speed | Embedding model, index tuning |
| Build iterations | Time | Spec quality, QA settings |
Model Optimization
Model Selection
| Model | Speed | Cost | Quality | Use Case |
|---|---|---|---|---|
| claude-opus-4-5-20251101 | Slow | High | Best | Complex features |
| claude-sonnet-4-5-20250929 | Fast | Medium | Good | Standard features |
```bash
# Override model in .env
AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929
```
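The table's trade-off can be turned into a small selection rule. The sketch below is only an illustration: `choose_model`, `build_env`, and the complexity labels are hypothetical names, not part of Auto-Claude. Only the model IDs and the `AUTO_BUILD_MODEL` variable come from the text above.

```python
import os

# Model IDs copied from the selection table.
OPUS = "claude-opus-4-5-20251101"
SONNET = "claude-sonnet-4-5-20250929"


def choose_model(complexity: str) -> str:
    """Pick a model ID; the 'complexity' labels are an assumption."""
    # Reserve the slower, costlier model for complex features only.
    return OPUS if complexity == "complex" else SONNET


def build_env(complexity: str) -> dict:
    """Copy the current environment with AUTO_BUILD_MODEL overridden."""
    env = dict(os.environ)
    env["AUTO_BUILD_MODEL"] = choose_model(complexity)
    return env
```

A dictionary built this way could be passed as `env=` to `subprocess.run` when launching a build, which leaves the `.env` file untouched.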
Extended Thinking Tokens
Configure thinking budget per agent:
| Agent | Default | Recommended |
|---|---|---|
| Spec creation | 16000 | Keep default for quality |
| Planning | 5000 | Reduce to 3000 for speed |
| Coding | 0 | Keep disabled |
| QA Review | 10000 | Reduce to 5000 for speed |
```python
# In agent configuration
max_thinking_tokens=5000  # or None to disable
```
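One way to keep these budgets in a single place is a lookup table. This is a minimal sketch: the agent keys and the `thinking_budget` helper are hypothetical, and the numbers are the defaults and speed-oriented recommendations from the table above.

```python
# Values copied from the table; the dictionary keys are illustrative names.
DEFAULT_BUDGETS = {"spec": 16000, "planning": 5000, "coding": 0, "qa": 10000}
SPEED_BUDGETS = {"spec": 16000, "planning": 3000, "coding": 0, "qa": 5000}


def thinking_budget(agent: str, prefer_speed: bool = False):
    """Return max_thinking_tokens for an agent, or None when disabled."""
    table = SPEED_BUDGETS if prefer_speed else DEFAULT_BUDGETS
    budget = table[agent]
    # A budget of 0 means thinking is off, expressed here as None.
    return budget if budget > 0 else None
```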
Token Optimization
Reduce Context Size
- Smaller spec files
  ```bash
  # Keep specs concise
  # Bad: 5000 word spec
  # Good: 500 word spec with clear criteria
  ```
- Limit codebase scanning
  ```python
  # In context/builder.py
  MAX_CONTEXT_FILES = 50  # Reduce from 100
  ```
- Use targeted searches
  ```bash
  # Instead of full codebase scan
  # Focus on relevant directories
  ```
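The last two points, a file cap and a directory-scoped search, combine naturally. The following is a rough sketch, not the contents of `context/builder.py`: `collect_context_files` and its parameters are hypothetical, and it assumes only Python files are of interest.

```python
from pathlib import Path

MAX_CONTEXT_FILES = 50  # the reduced cap suggested above


def collect_context_files(root, relevant_dirs, limit=MAX_CONTEXT_FILES):
    """Gather at most `limit` Python files from the chosen subdirectories."""
    files = []
    for name in relevant_dirs:
        directory = Path(root) / name
        if not directory.is_dir():
            continue  # skip directories that do not exist
        for path in sorted(directory.rglob("*.py")):
            if len(files) >= limit:
                return files  # stop scanning once the cap is reached
            files.append(path)
    return files
```

Listing the most relevant directory first matters here, because the cap is filled in the order the directories are given.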
Efficient Prompts
Optimize system prompts in `apps/backend/prompts/`:
```markdown
<!-- Bad: Verbose -->
You are an expert software developer who specializes in building
high-quality, production-ready applications. You have extensive
experience with many programming languages and frameworks...

<!-- Good: Concise -->
Expert full-stack developer. Build production-quality code.
Follow existing patterns. Test thoroughly.
```
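To see how much the concise version saves, a word count gives a crude proxy for prompt size. This sketch uses the two prompts quoted above. Word counts only approximate token counts, and the helper names are illustrative.

```python
VERBOSE = (
    "You are an expert software developer who specializes in building "
    "high-quality, production-ready applications. You have extensive "
    "experience with many programming languages and frameworks..."
)
CONCISE = (
    "Expert full-stack developer. Build production-quality code. "
    "Follow existing patterns. Test thoroughly."
)


def word_count(text: str) -> int:
    """Count whitespace-separated words (a rough stand-in for tokens)."""
    return len(text.split())


def reduction(before: str, after: str) -> float:
    """Fraction of words removed when going from `before` to `after`."""
    return 1 - word_count(after) / word_count(before)
```

On these two prompts the concise version uses fewer than half the words, and that saving recurs on every request that includes the system prompt.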
Memory Optimization
```bash
# Use efficient embedding model
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

# Or offline with smaller model
OLLAMA_EMBEDDING_MODEL=all-minilm
OLLAMA_EMBEDDING_DIM=384
```
Speed Optimization
Parallel Execution
```bash
# Enable more parallel agents (default: 4)
MAX_PARALLEL_AGENTS=8
```
Reduce QA Iterations
```bash
# Limit QA loop iterations
MAX_QA_ITERATIONS=10  # Default: 50

# Skip QA for quick iterations
python run.py --spec 001 --skip-qa
```
Faster Spec Creation
```bash
# Force simple complexity for quick tasks
python spec_runner.py --task "Fix typo" --complexity simple

# Skip research phase
SKIP_RESEARCH_PHASE=true python spec_runner.py --task "..."
```
API Timeout Tuning
```bash
# Reduce timeout for faster failure detection
API_TIMEOUT_MS=120000  # 2 minutes (default: 10 minutes)
```
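The speed settings in this section all follow the same pattern: an integer environment variable with a documented default. Here is a small sketch of reading them in one place. `load_speed_settings` is a hypothetical helper and not Auto-Claude's actual loader. The defaults are the ones stated above (4 agents, 50 iterations, 10 minutes).

```python
import os

# Defaults as documented in this section; 10 minutes = 600000 ms.
DEFAULTS = {
    "MAX_PARALLEL_AGENTS": 4,
    "MAX_QA_ITERATIONS": 50,
    "API_TIMEOUT_MS": 600_000,
}


def load_speed_settings(env=None) -> dict:
    """Read the integer settings from `env`, falling back to DEFAULTS."""
    env = os.environ if env is None else env
    settings = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name, "").strip()
        settings[name] = int(raw) if raw else default
    # Convenience value for APIs that expect seconds instead of milliseconds.
    settings["timeout_seconds"] = settings["API_TIMEOUT_MS"] / 1000
    return settings
```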
Cost Management
Monitor Token Usage
```bash
# Enable cost tracking
ENABLE_COST_TRACKING=true

# View usage report
python usage_report.py --spec 001
```
Cost Reduction Strategies
- Use cheaper models for simple tasks
  ```bash
  # For simple specs
  AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929 python spec_runner.py --task "..."
  ```
- Limit context window
  ```bash
  MAX_CONTEXT_TOKENS=50000  # Reduce from 100000
  ```
- Batch similar tasks
  ```bash
  # Create specs together, run together
  python spec_runner.py --task "Add feature A"
  python spec_runner.py --task "Add feature B"
  python run.py --spec 001
  python run.py --spec 002
  ```
- Use local models for memory
  ```bash
  # Ollama for memory (free)
  GRAPHITI_LLM_PROVIDER=ollama
  GRAPHITI_EMBEDDER_PROVIDER=ollama
  ```
Cost Estimation
| Operation | Estimated Tokens | Cost (Opus) | Cost (Sonnet) |
|---|---|---|---|
| Simple spec | 10k | ~$0.30 | ~$0.06 |
| Standard spec | 50k | ~$1.50 | ~$0.30 |
| Complex spec | 200k | ~$6.00 | ~$1.20 |
| Build (simple) | 50k | ~$1.50 | ~$0.30 |
| Build (standard) | 200k | ~$6.00 | ~$1.20 |
| Build (complex) | 500k | ~$15.00 | ~$3.00 |
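The rows above scale linearly with token count, so they can be reproduced with a flat per-token rate. The rates in this sketch are the ones implied by the table itself ($0.30 per 10k tokens for Opus, $0.06 per 10k for Sonnet). They are estimates for planning, not published prices, and `estimate_cost` is an illustrative helper.

```python
# Blended rates implied by the table above, in USD per million tokens.
USD_PER_MILLION_TOKENS = {"opus": 30.0, "sonnet": 6.0}


def estimate_cost(tokens: int, model: str) -> float:
    """Estimate cost in USD, rounded to cents, for a given token count."""
    rate = USD_PER_MILLION_TOKENS[model]
    return round(tokens / 1_000_000 * rate, 2)
```

Summing rows works the same way: a simple spec plus a simple build is 60k tokens, or roughly $0.36 on Sonnet at these rates.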
Memory System Optimization
Embedding Performance
```bash
# Faster embeddings
OPENAI_EMBEDDING_MODEL=text-embedding-3-small  # 1536 dim, fast

# Higher quality (slower)
OPENAI_EMBEDDING_MODEL=text-embedding-3-large  # 3072 dim

# Offline (fastest, free)
OLLAMA_EMBEDDING_MODEL=all-minilm
OLLAMA_EMBEDDING_DIM=384
```
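Embedding dimension also drives storage and comparison cost. As a back-of-the-envelope sketch, the function below estimates raw vector storage from the dimensions listed above. It assumes 4-byte float32 values and ignores any index overhead, both of which are assumptions and not facts about the memory backend.

```python
# Dimensions as listed in the configuration comments above.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "all-minilm": 384,
}
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as float32


def raw_vector_bytes(model: str, num_vectors: int) -> int:
    """Raw storage for `num_vectors` embeddings, excluding index overhead."""
    return EMBEDDING_DIMS[model] * BYTES_PER_FLOAT32 * num_vectors
```

Under these assumptions, `all-minilm` vectors take a quarter of the space of `text-embedding-3-small` vectors.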
Query Optimization
```python
# Limit search results
memory.search("query", limit=10)  # Instead of 100

# Use semantic caching (environment setting, not Python code)
# ENABLE_MEMORY_CACHE=true
```
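The effect of caching repeated queries can be illustrated with the standard library. Note that this sketch uses exact-match caching via `functools.lru_cache`, a simpler technique than the semantic caching named above. `backend_search` is a hypothetical stand-in for the real memory backend.

```python
from functools import lru_cache

calls = []  # records backend hits so the effect of caching is visible


def backend_search(query: str, limit: int) -> tuple:
    """Hypothetical stand-in for an expensive memory search."""
    calls.append((query, limit))
    return tuple(f"{query}-result-{i}" for i in range(limit))


@lru_cache(maxsize=256)
def cached_search(query: str, limit: int = 10) -> tuple:
    """Return cached results when the same query and limit repeat."""
    return backend_search(query, limit)
```

Results are returned as tuples so that cached values cannot be mutated by callers.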
Database Maintenance
```bash
# Compact database periodically
python -c "from integrations.graphiti.memory import compact_database; compact_database()"

# Clear old episodes
python query_memory.py --cleanup --older-than 30d
```
Build Efficiency
Spec Quality = Build Speed
High-quality specs reduce iterations:
```markdown
# Good spec (fewer iterations)
## Acceptance Criteria
- User can log in with email/password
- Invalid credentials show error message
- Successful login redirects to /dashboard
- Session persists for 24 hours

# Bad spec (more iterations)
## Acceptance Criteria
- Login works
```
Subtask Granularity
Optimal subtask size:
- Too large: Agent gets stuck, needs recovery
- Too small: Overhead per subtask
- Optimal: 30-60 minutes of work each
Parallel Work
Let agents spawn subagents for parallel execution:
```
Main Coder
├── Subagent 1: Frontend (parallel)
├── Subagent 2: Backend (parallel)
└── Subagent 3: Tests (parallel)
```
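The fan-out in the diagram can be sketched with a thread pool. This is only an illustration of the pattern and not how Auto-Claude spawns subagents: `run_subagent` is a hypothetical placeholder for real agent work.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(name: str) -> str:
    """Hypothetical placeholder for the work one subagent would do."""
    return f"{name}: done"


def run_parallel(names, max_workers: int = 4) -> list:
    """Run one subagent per name concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order even though tasks overlap.
        return list(pool.map(run_subagent, names))
```

Calling `run_parallel(["Frontend", "Backend", "Tests"])` mirrors the three branches shown in the tree.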
Environment Tuning
Optimal .env Configuration
```bash
# Performance-focused configuration
AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929
API_TIMEOUT_MS=180000
MAX_PARALLEL_AGENTS=6

# Memory optimization
GRAPHITI_LLM_PROVIDER=ollama
GRAPHITI_EMBEDDER_PROVIDER=ollama
OLLAMA_LLM_MODEL=llama3.2:3b
OLLAMA_EMBEDDING_MODEL=all-minilm
OLLAMA_EMBEDDING_DIM=384

# Reduce verbosity
DEBUG=false
ENABLE_FANCY_UI=false
```
Resource Limits
```bash
# Use the system allocator instead of pymalloc (changes allocation behavior; does not cap memory)
export PYTHONMALLOC=malloc

# Set max file descriptors
ulimit -n 4096
```
Benchmarking
Measure Build Time
```bash
# Time a build
time python run.py --spec 001

# Compare models
time AUTO_BUILD_MODEL=claude-opus-4-5-20251101 python run.py --spec 001
time AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929 python run.py --spec 001
```
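The same comparison can be scripted when several runs need to be recorded. This is a minimal sketch using `time.perf_counter`. `time_call` and `fastest` are illustrative helpers, and the callable being timed is left to the reader (for example, a function that launches `run.py` through `subprocess.run`).

```python
import time


def time_call(func, *args, **kwargs):
    """Run `func` once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fastest(timings: dict) -> str:
    """Return the label with the smallest recorded time."""
    return min(timings, key=timings.get)
```

Because build times vary with API latency, averaging several runs per model gives a fairer comparison than a single measurement.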
Profile Memory Usage
```bash
# Monitor memory
watch -n 1 'ps aux | grep python | head -5'

# Profile script
python -m cProfile -o profile.stats run.py --spec 001
python -c "import pstats; p = pstats.Stats('profile.stats'); p.sort_stats('cumulative').print_stats(20)"
```
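The `cProfile` and `pstats` steps above can also be run in-process against a single function, which avoids writing a stats file. This sketch uses only the standard library. `profile_call` is an illustrative helper name.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top: int = 20):
    """Profile one call and return (result, text report of top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Same ordering as the command-line example: cumulative time first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

The returned report is plain text, so it can be logged or compared between runs.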
Quick Wins
Immediate Optimizations
- Switch to Sonnet for most tasks
  ```bash
  AUTO_BUILD_MODEL=claude-sonnet-4-5-20250929
  ```
- Use Ollama for memory
  ```bash
  GRAPHITI_LLM_PROVIDER=ollama
  GRAPHITI_EMBEDDER_PROVIDER=ollama
  ```
- Skip QA for prototypes
  ```bash
  python run.py --spec 001 --skip-qa
  ```
- Force simple complexity for small tasks
  ```bash
  python spec_runner.py --task "..." --complexity simple
  ```
Medium-Term Improvements
- Optimize prompts in `apps/backend/prompts/`
- Configure project-specific security allowlist
- Set up memory caching
- Tune parallel agent count
Long-Term Strategies
- Self-hosted LLM for memory (Ollama)
- Caching layer for common operations
- Incremental context building
- Project-specific prompt optimization
Related Skills
- auto-claude-memory: Memory configuration
- auto-claude-build: Build process
- auto-claude-troubleshooting: Debugging