accelerate

Original：🇺🇸 English

Translated

Use when the workflow is too slow, too expensive, or both and needs latency, cost, or token usage optimization.

20installs

Sourcesharpdeveye/maestro

Added on2026-04-11

NPX Install

npx skill4agent add sharpdeveye/maestro accelerate

SKILL.md Content

View Translation Comparison →

MANDATORY PREPARATION

Invoke {{command_prefix}}agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run {{command_prefix}}teach-maestro first. Consult the context-management reference in the agent-workflow skill for window optimization and budget strategies.

Make the workflow faster and cheaper without sacrificing quality. Measure before and after.

Performance Audit

Measure current performance:

text

Current metrics:
  Latency (p50): ___ms
  Latency (p95): ___ms
  Cost per request: $___
  Token usage (avg): ___ input / ___ output
  Error rate: ___%

Acceleration Strategies

Reduce Token Usage

Shorten system prompts (remove redundant instructions)
Compress few-shot examples to minimum viable length
Use structured output schemas instead of verbose text
Summarize context instead of passing raw documents
Reduce output length requirements

Model Cascading

Route simple tasks to cheaper/faster models
Escalate only complex tasks to capable models
Use classification to determine complexity

Caching

Cache responses for identical or near-identical inputs
Cache tool results with appropriate TTL
Cache embeddings for frequently-queried documents
Use semantic caching for similar (not identical) queries

Parallelization

Run independent tool calls in parallel
Run independent agent steps in parallel
Use streaming to start processing before full response

Context Optimization

Retrieve less, retrieve better (improve retrieval precision)
Use context compression techniques
Implement sliding window for long conversations

Acceleration Report

For each optimization:

What changed: Specific modification
Before: Latency/cost/tokens before
After: Latency/cost/tokens after
Quality impact: Any quality change (verify with golden tests)
Trade-off: What was sacrificed for the improvement

Acceleration Checklist

Baseline metrics recorded before any changes
Each optimization measured with before/after comparison
Quality impact verified (golden tests still pass)
Trade-offs documented for each change
Cost/latency improvements quantified

Recommended Next Step

After optimization, run

{{command_prefix}}evaluate

to verify quality didn't degrade, or

{{command_prefix}}iterate

to set up continuous monitoring.

NEVER:

Optimize without measuring first (you need a baseline)
Sacrifice quality for speed without explicit user approval
Cache outputs that depend on real-time data
Skip the quality check after optimization
Optimize prematurely (make it correct first, then make it fast)