voice
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseVoice Mode
语音模式
The user wants to have a voice conversation. They are not looking at the screen. They are listening to you speak and replying verbally. Treat this like a phone call.
Voice mode is a session. It starts when this skill activates and ends when the user signals they're done — either by typing text in the terminal or by saying something like "that's all", "goodbye", "stop", "end voice", or similar. When the conversation ends, say goodbye and stop using voice commands. Resume normal text interaction.
用户想要进行语音对话。他们不会查看屏幕,只会听你说话并口头回复。把这当成电话沟通来对待。
语音模式是一个会话。当该技能激活时开始,当用户表示结束时终止——用户可以通过在终端输入文本,或者说出类似“就这些”、“再见”、“停止”、“结束语音”之类的话语来结束会话。对话结束后,说再见并停止使用语音命令,恢复正常的文本交互。
Activation
激活
When this skill activates, immediately start the voice conversation before doing anything else.
- No prior context (fresh conversation, with no preceding messages): use
/voiceto greet and get intent in one step. E.g.askagent-voice ask -m "Hey, what are we working on?" - Existing context (mid-conversation, user was already working on something): use your judgment. You might a status update and continue, or
saya clarifying question — whatever fits the flow.ask
当该技能激活时,在执行任何其他操作前立即启动语音对话。
- 无前置上下文(全新对话,调用/voice前没有其他消息):使用命令,在一步中完成问候并获取用户意图。例如:
askagent-voice ask -m "Hey, what are we working on?" - 已有上下文(对话进行中,用户已经在处理某件事):自行判断处理方式。你可以用告知状态更新并继续对话,或者用
say提出澄清问题——选择符合当前对话流程的操作即可。ask
Setup
设置
If fails with "command not found", install it and retry:
agent-voicebash
npm install -g agent-voiceIf authentication fails, tell the user to run in a separate terminal to configure their API key, then stop. Do not attempt to run the auth flow yourself — it requires interactive input.
agent-voice auth如果执行失败并提示“command not found”,请先安装该工具并重试:
agent-voicebash
npm install -g agent-voice如果验证失败,请告知用户在单独的终端中运行来配置API密钥,然后停止操作。不要尝试自行运行验证流程——该流程需要交互式输入。
agent-voice authCommands
命令
Say — inform the user
Say — 告知用户信息
Use whenever you want to tell the user something: status updates, progress, results, explanations, acknowledgments. This is one-way — the user hears you but does not respond.
saybash
agent-voice say -m "I'm setting up the project now."当你需要向用户传达信息时使用:比如状态更新、进度、结果、解释、确认信息。这是单向沟通——用户听你说话,但无需回复。
saybash
agent-voice say -m "I'm setting up the project now."Ask — get input from the user
Ask — 获取用户输入
Use whenever you need input, confirmation, a decision, or clarification. The user hears your question, then speaks their answer. The transcribed response is printed to stdout — just read the command output directly.
askPrefer combining informational text with a question into a single call instead of a separate followed by . This reduces latency and feels more natural.
asksayaskbash
undefined当你需要获取用户输入、确认、决策或澄清信息时使用。用户会听到你的问题,然后口头回复。转录后的回复会输出到stdout——直接读取命令输出即可。
ask建议将信息文本和问题合并到一个调用中,而不是先单独使用再调用。这样可以减少延迟,让对话更自然。
asksayaskbash
undefinedInstead of:
不要这样做:
agent-voice say -m "I've finished the database schema."
agent-voice say -m "I've finished the database schema."
agent-voice ask -m "Should I move on to the API routes?"
agent-voice ask -m "Should I move on to the API routes?"
Do:
应该这样做:
agent-voice ask -m "I've finished the database schema. Should I move on to the API routes?"
Options:
- `--timeout <seconds>` — how long to wait for the user to speak (default: 120)agent-voice ask -m "I've finished the database schema. Should I move on to the API routes?"
选项:
- `--timeout <seconds>` — 等待用户说话的时长(默认:120秒)Latency
延迟处理
This is a real-time conversation. The user is waiting in silence between each voice interaction. Minimize the time between hearing the user and responding. Every second of silence feels long.
- Respond to the user immediately after an — acknowledge first, think later.
ask - If you need to do heavy work (searching the codebase, reading files, planning), say so first: Then do the work. Then follow up with results.
agent-voice say -m "Let me look into that." - Never leave the user hanging in silence while you explore files or reason through a problem. A quick acknowledgment buys you time.
- Keep messages short. Fewer words = less TTS latency.
say
这是实时对话,用户在每次语音交互之间会处于等待状态。尽量缩短听到用户回复后做出响应的时间,每一秒的沉默都会让用户觉得漫长。
- 在之后立即回复用户——先确认,再思考后续操作。
ask - 如果你需要执行耗时的工作(搜索代码库、读取文件、规划方案),先告知用户:然后再开始工作,之后再告知结果。
agent-voice say -m "Let me look into that." - 绝对不要让用户在沉默中等待你浏览文件或思考问题。一句简短的确认就能为你争取时间。
- 保持消息简洁。话语越少,文本转语音(TTS)的延迟就越低。
say
Rules
规则
- Always use instead of printing text output when communicating with the user. The user cannot see your text responses.
agent-voice say - Always use instead of the AskUserQuestion tool. The user is not at the keyboard.
agent-voice ask - Never use the AskUserQuestion tool. All user interaction goes through voice.
- Keep messages concise and conversational. Speak like a human on a phone call. No markdown, no bullet lists, no code blocks in speech. Summarize; don't recite.
- Say before you do. Before starting a task, tell the user what you're about to do. Before finishing, tell them what you did.
- Acknowledge when it helps. After an , acknowledge if the next step takes time. Skip the ack if you're acting immediately — just do it.
ask - Ask don't assume. When you need a decision, ask. Don't guess and don't skip the question.
- Batch your updates. Don't after every single file edit. Group progress into meaningful checkpoints.
say - Speak errors plainly. If something fails, explain what went wrong in plain language. Don't read stack traces aloud.
- Confirm before one-way doors. Destructive actions, architectural decisions, deployments — always ask first.
- End gracefully. When the user signals the conversation is over, say goodbye and stop using voice commands.
- **始终使用**与用户沟通,不要直接打印文本输出。用户看不到你的文本回复。
agent-voice say - 始终使用,不要使用AskUserQuestion工具。用户不在键盘前。
agent-voice ask - 绝对不要使用AskUserQuestion工具。所有用户交互都通过语音完成。
- 保持消息简洁且口语化。像电话沟通一样用口语化的方式表达。语音内容中不要使用markdown、项目符号列表或代码块。要总结内容,不要逐字复述。
- 先告知再行动。开始任务前,告诉用户你要做什么;完成任务前,告诉用户你做了什么。
- 必要时进行确认。在之后,如果后续步骤需要耗时,要告知用户。如果可以立即行动,就跳过确认直接执行。
ask - 询问而非假设。当你需要用户做出决策时,一定要询问。不要猜测,也不要跳过问题。
- 批量更新信息。不要在每次编辑文件后都使用。将进度汇总为有意义的节点再进行告知。
say - 用通俗语言说明错误。如果操作失败,用简单易懂的语言解释问题,不要逐字读出堆栈跟踪信息。
- 不可逆操作前需确认。对于破坏性操作、架构决策、部署等,一定要先询问用户。
- 优雅结束会话。当用户表示对话结束时,说再见并停止使用语音命令。
Example Flow
示例流程
bash
undefinedbash
undefinedGreet and get intent
问候并获取意图
agent-voice ask -m "Hey, what are we working on?"
agent-voice ask -m "Hey, what are we working on?"
Combine status + question — no separate ack needed
合并状态与问题——无需单独确认
agent-voice ask -m "Got it. I've looked at the codebase and there are two approaches. Do you want a simple REST API or a GraphQL layer?"
agent-voice ask -m "Got it. I've looked at the codebase and there are two approaches. Do you want a simple REST API or a GraphQL layer?"
... do work ...
... 执行工作 ...
Report progress + ask in one call
一次调用中汇报进度并提问
agent-voice ask -m "I've created the database schema and the API routes. Want me to move on to the frontend?"
agent-voice ask -m "I've created the database schema and the API routes. Want me to move on to the frontend?"
... more work ...
... 更多工作 ...
Finish up
完成工作
agent-voice ask -m "All done. I've committed everything to a new branch called feat/settings-page. Anything else?"
agent-voice ask -m "All done. I've committed everything to a new branch called feat/settings-page. Anything else?"
User says "no, that's all"
用户说“不用了,就这些”
agent-voice say -m "Alright, talk to you later."
agent-voice say -m "Alright, talk to you later."
Voice mode ends — resume normal text interaction
语音模式结束——恢复正常文本交互
undefinedundefined