worker-benchmarks

Worker Benchmarks Skill

Run comprehensive performance benchmarks for the agentic-flow worker system.

Quick Start

Run full benchmark suite

npx agentic-flow workers benchmark

Run specific benchmark

npx agentic-flow workers benchmark --type trigger-detection
npx agentic-flow workers benchmark --type registry
npx agentic-flow workers benchmark --type agent-selection
npx agentic-flow workers benchmark --type concurrent

Benchmark Types

1. Trigger Detection (trigger-detection)

Tests keyword detection speed across 12 worker triggers.
  • Target: p95 < 5ms
  • Iterations: 1000
  • Metrics: latency, throughput, histogram
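
A p95 target is checked against the 95th percentile of the recorded latencies, not their mean. The sketch below shows one common way to compute it (nearest-rank) and compare it with a target. `percentile` and `summarize` are hypothetical helpers for illustration, not part of agentic-flow, and the real benchmark may use a different percentile method.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
// `percentile` and `summarize` are hypothetical helpers, not agentic-flow APIs.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Nearest-rank: smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(samplesMs: number[], targetP95Ms: number) {
  const avg = samplesMs.reduce((sum, x) => sum + x, 0) / samplesMs.length;
  const p95 = percentile(samplesMs, 95);
  // The target reads "p95 < N ms", so equality counts as a miss here.
  return { avg, p95, passed: p95 < targetP95Ms };
}

// 20 synthetic samples: nineteen fast calls and one slow outlier.
const samples: number[] = [...new Array<number>(19).fill(0.1), 8];
const summary = summarize(samples, 5);
console.log(summary.p95, summary.passed); // 0.1 true
```

With only one outlier in 20 samples, the p95 stays at 0.1 ms even though the outlier pulls the average up to roughly 0.5 ms, which is why the targets are stated in terms of p95.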

2. Worker Registry (registry)

Tests CRUD operations on worker entries.
  • Target: p95 < 10ms
  • Iterations: 500 each of creates, gets, and updates (1,500 operations total)
  • Metrics: per-operation latency breakdown

3. Agent Selection (agent-selection)

Tests performance-based agent selection.
  • Target: p95 < 1ms
  • Iterations: 1000
  • Metrics: selection confidence, agent scores

4. Model Cache (cache)

Tests model caching performance.
  • Target: p95 < 0.5ms
  • Metrics: hit rate, cache size, eviction stats
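
To make the three cache metrics concrete, here is a minimal sketch of a cache that counts hits, misses, and evictions. The eviction policy of the real model cache is not stated here; this example assumes least-recently-used eviction bounded by entry count, and `StatsCache` is a hypothetical class, not an agentic-flow export.

```typescript
// Hypothetical LRU cache that tracks hits, misses, and evictions.
// Assumption: LRU eviction by entry count; the real cache may differ.
class StatsCache<V> {
  private entries = new Map<string, V>();
  private maxEntries: number;
  hits = 0;
  misses = 0;
  evictions = 0;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) {
      this.misses++;
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    this.hits++;
    return value;
  }

  set(key: string, value: V): void {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
      this.evictions++;
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }

  get hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const cache = new StatsCache<string>(2);
cache.set("model-a", "A");
cache.set("model-b", "B");
cache.get("model-a"); // hit; model-a becomes most recently used
cache.set("model-c", "C"); // cache is full, so model-b is evicted
cache.get("model-b"); // miss
console.log(cache.hitRate, cache.size, cache.evictions); // 0.5 2 1
```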

5. Concurrent Workers (concurrent)

Tests parallel worker creation and updates.
  • Target: < 1000ms for 10 workers
  • Metrics: per-worker latency, memory usage
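
The concurrent target is a wall-clock total rather than a percentile. A rough sketch of how such a measurement could be taken is shown below: all workers are started before any is awaited, and both per-worker and total elapsed times are recorded. `createAndUpdateWorker` and `benchmarkConcurrent` are hypothetical stand-ins; the real benchmark's internals are not shown here.

```typescript
// Hypothetical stand-in for creating and then updating one worker.
async function createAndUpdateWorker(id: number): Promise<{ id: number; latencyMs: number }> {
  const start = Date.now();
  await Promise.resolve(); // placeholder for the real async create/update calls
  return { id, latencyMs: Date.now() - start };
}

async function benchmarkConcurrent(workerCount: number, targetTotalMs: number) {
  const start = Date.now();
  const ids = Array.from({ length: workerCount }, (_, i) => i);
  // Start every worker before awaiting any of them so they overlap.
  const perWorker = await Promise.all(ids.map((id) => createAndUpdateWorker(id)));
  const totalMs = Date.now() - start;
  return { perWorker, totalMs, passed: totalMs < targetTotalMs };
}

benchmarkConcurrent(10, 1000).then((result) => {
  console.log(`workers: ${result.perWorker.length}, passed: ${result.passed}`);
});
```

`Date.now()` has millisecond resolution, which is adequate for a 1000 ms budget; sub-millisecond targets such as the p95 ones would need a higher-resolution clock.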

6. Memory Key Generation (memory-keys)

Tests memory pattern key generation.
  • Target: p95 < 0.1ms
  • Iterations: 5000
  • Metrics: unique patterns, throughput

Output Format

═══════════════════════════════════════════════════════════
📈 BENCHMARK RESULTS
═══════════════════════════════════════════════════════════

✅ Trigger Detection
   Operation: detect
   Count: 1,000
   Avg: 0.045ms | p95: 0.120ms (target: 5ms)
   Throughput: 22,222 ops/s
   Memory Δ: 0.12MB

✅ Worker Registry
   Operation: crud
   Count: 1,500
   Avg: 1.234ms | p95: 3.456ms (target: 10ms)
   Throughput: 810 ops/s
   Memory Δ: 2.34MB

───────────────────────────────────────────────────────────
📊 SUMMARY
───────────────────────────────────────────────────────────
Total Tests: 6
Passed: 6 | Failed: 0
Avg Latency: 0.567ms
Total Duration: 2345ms
Peak Memory: 8.90MB
═══════════════════════════════════════════════════════════
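
The throughput figures in the sample output are consistent with throughput being derived from average latency (operations per second = 1000 ms divided by the average latency in ms). That relationship is inferred from the sample numbers, not confirmed by the tool; `throughputOpsPerSec` is a hypothetical helper.

```typescript
// Assumed relationship: ops/s = 1000 ms / average latency in ms.
function throughputOpsPerSec(avgLatencyMs: number): number {
  return 1000 / avgLatencyMs;
}

// Average latencies taken from the sample output above.
const triggerOps = Math.round(throughputOpsPerSec(0.045)); // 22222
const registryOps = Math.round(throughputOpsPerSec(1.234)); // 810
console.log(triggerOps, registryOps);
```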

Integration with Settings

Benchmark thresholds are configured in .claude/settings.json:
{
  "performance": {
    "benchmarkThresholds": {
      "triggerDetection": { "p95Ms": 5 },
      "workerRegistry": { "p95Ms": 10 },
      "agentSelection": { "p95Ms": 1 },
      "memoryKeyGeneration": { "p95Ms": 0.1 },
      "concurrentWorkers": { "totalMs": 1000 }
    }
  }
}
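
As an illustration of how these thresholds might be applied, the sketch below compares measured values against the configured limits and lists the benchmarks that miss them. The threshold object mirrors the settings fragment above; `findViolations`, the `Measurement` shape, and the measured numbers are assumptions made for the example, not the tool's actual API or results.

```typescript
// Threshold shape mirrors the settings fragment; the checking logic is a hypothetical sketch.
type Threshold = { p95Ms?: number; totalMs?: number };
type Measurement = { p95Ms: number; totalMs: number };

const benchmarkThresholds: Record<string, Threshold> = {
  triggerDetection: { p95Ms: 5 },
  workerRegistry: { p95Ms: 10 },
  agentSelection: { p95Ms: 1 },
  memoryKeyGeneration: { p95Ms: 0.1 },
  concurrentWorkers: { totalMs: 1000 },
};

// Returns the names of benchmarks that miss a configured limit.
function findViolations(
  measured: Record<string, Measurement>,
  thresholds: Record<string, Threshold>,
): string[] {
  return Object.entries(thresholds)
    .filter(([name, limit]) => {
      const m = measured[name];
      if (m === undefined) return false; // not measured, so nothing to flag
      // Targets are stated as strict "<", so reaching the limit counts as a miss.
      const p95Bad = limit.p95Ms !== undefined && m.p95Ms >= limit.p95Ms;
      const totalBad = limit.totalMs !== undefined && m.totalMs >= limit.totalMs;
      return p95Bad || totalBad;
    })
    .map(([name]) => name);
}

// Made-up measurements: the registry value is deliberately over its 10 ms limit.
const violations = findViolations(
  {
    triggerDetection: { p95Ms: 0.12, totalMs: 45 },
    workerRegistry: { p95Ms: 12.5, totalMs: 1851 },
  },
  benchmarkThresholds,
);
console.log(violations); // ["workerRegistry"]
```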

Programmatic Usage

import { workerBenchmarks, runBenchmarks } from 'agentic-flow/workers/worker-benchmarks';

// Run full suite
const suite = await runBenchmarks();
console.log(suite.summary);

// Run individual benchmarks
const triggerResult = await workerBenchmarks.benchmarkTriggerDetection(1000);
const registryResult = await workerBenchmarks.benchmarkRegistryOperations(500);

Performance Optimization Tips

  1. Model Cache: Enable with CLAUDE_FLOW_MODEL_CACHE_MB=512
  2. Parallel Workers: Enable with CLAUDE_FLOW_WORKER_PARALLEL=true
  3. Warning Suppression: Enable with CLAUDE_FLOW_SUPPRESS_WARNINGS=true
  4. SQLite WAL Mode: Enabled automatically for better concurrent performance