Found 8 Skills
Implements the NOWAIT technique for efficient reasoning in R1-style LLMs. Use when optimizing inference of reasoning models (QwQ, DeepSeek-R1, Phi4-Reasoning, Qwen3, Kimi-VL, QvQ) or when reducing chain-of-thought token usage by 27-51% while preserving accuracy. Triggers on "optimize reasoning", "reduce thinking tokens", "efficient inference", "suppress reflection tokens", or when working with verbose CoT outputs.
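The core idea of suppressing reflection tokens can be sketched as a logit mask applied at each decoding step. This is a minimal illustration, not the skill's actual implementation: the keyword set and function names here are assumptions, and a real integration would hook into the model's sampling loop (e.g. a logits processor).

```python
import math

# Illustrative reflection markers; the actual NOWAIT keyword set may differ.
REFLECTION_WORDS = {"wait", "hmm", "alternatively", "recheck"}

def build_banned_ids(vocab):
    """Map reflection words to token ids in a {token: id} vocabulary."""
    return {tid for tok, tid in vocab.items()
            if tok.strip().lower() in REFLECTION_WORDS}

def suppress_reflection(logits, banned_ids):
    """Set banned token logits to -inf so decoding never emits them."""
    return [(-math.inf if i in banned_ids else score)
            for i, score in enumerate(logits)]
```

For example, with a toy vocabulary `{"Wait": 0, "the": 1, "Hmm": 2}`, `suppress_reflection([5.0, 1.0, 3.0], build_banned_ids(vocab))` masks positions 0 and 2 while leaving position 1 untouched, so the decoder skips the reflection tokens instead of starting another self-check pass.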
Use when auditing, trimming, or restructuring AI skill files to reduce context-window consumption. Trigger whenever a SKILL.md exceeds 120 lines, skills share duplicated content, AGENTS.md has large inline blocks, or the user asks to optimize, slim down, or reduce token usage of their skills.
Compress documentation, prompts, and context into minimal tokens for AGENTS.md and CLAUDE.md. Achieves 80%+ token reduction while preserving agent accuracy.
Optimizes Claude Code memory files in 4 interactive steps: removes duplicates, migrates rules to CLAUDE.md/rules files, compresses remaining entries, validates with cleanup. Typical reduction: 30-50% on token count.
Optimize command outputs with RTK (Rust Token Killer) for 70% token reduction
Activate when the user asks Claude to talk like a caveman, use caveman mode, say "less tokens please", or invoke "/elastic-caveman". Also activate when the user wants faster, terser responses while still working with Elasticsearch, Kibana, Elastic Security, Elastic Observability, or any part of the Elastic stack. In caveman mode all Elasticsearch-specific technical terms, API names, field names, index patterns, query DSL structures, ESQL syntax, and error messages are preserved verbatim — only filler words and pleasantries are removed. Stop caveman mode when the user says "stop caveman" or "normal mode".
Use when optimizing agent context, reducing token costs, implementing KV-cache optimization, or asking about "context optimization", "token reduction", "context limits", "observation masking", "context budgeting", "context partitioning"
Compress natural language memory files (CLAUDE.md, todos, settings) into "primitive human" format to reduce input tokens. Fully retain technical content, code, URLs, and structure. The compressed version overwrites the original file, and the human-readable version is saved as FILE.original.md. Trigger with `/genshijin-compress <filepath>` or requests like "memory file compression".
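The "fully retain technical content, code, URLs" constraint above can be sketched as a line-level compressor that passes fenced code blocks and URL-bearing lines through verbatim and strips filler words everywhere else. This is a simplified sketch under stated assumptions: the filler list is illustrative, and the actual skill's compression rules are unknown.

```python
# Illustrative filler list; the real skill's rules are not specified here.
FILLERS = {"please", "basically", "just", "really", "very"}

def compress_line(line):
    """Drop filler words from a prose line; keep URL lines verbatim."""
    if "http" in line:
        return line
    words = [w for w in line.split()
             if w.lower().strip(",.") not in FILLERS]
    return " ".join(words)

def compress(text):
    """Compress prose lines while preserving fenced code blocks exactly."""
    out, in_code = [], False
    for line in text.splitlines():
        if line.startswith("```"):
            in_code = not in_code
            out.append(line)          # keep the fence itself
        elif in_code:
            out.append(line)          # code passes through untouched
        else:
            out.append(compress_line(line))
    return "\n".join(out)
```

A real implementation would also write the compressed result over the original file and save the readable copy as FILE.original.md, which this sketch omits.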