talktomepy-tts

TalkToMePy TTS (Deprecated)

Deprecation Status

  • Status: Deprecated
  • Scope: Legacy-only, backward compatibility
  • Successor: Use the speech workflow successor in gaelic-ghost/a11y-skills for new automation and active maintenance.
  • Guidance: Do not choose this skill for new setups unless you explicitly need this older TalkToMePy-specific behavior.
Use this skill when the user asks to hear text spoken aloud from the local machine.

Post-Invocation Resolution Rules

Apply these rules before synthesis to make speech-source selection deterministic in ambiguous contexts.

Invocation detection

  • Treat the skill as invoked when the user clearly calls it, including skill chip/link invocation, $talktomepy-tts, or equivalent direct imperative phrasing.
  • Ignore incidental mention in unrelated prose.

Source text precedence

  1. Suffix invocation pattern:
    • If invocation appears at the end of the user message, speak the user text before the invocation token.
  2. Standalone invocation pattern:
    • If the user message is only the invocation, speak the immediately previous assistant message.
  3. Prefix invocation pattern:
    • If invocation appears at the beginning of a longer user message, speak the immediately previous assistant message.
    • After speaking, continue responding to the remaining user message normally.
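
The three placement patterns can be told apart mechanically. The following is a minimal sketch, not part of the skill itself: `classify_invocation` is a hypothetical helper, it only recognizes the literal `$talktomepy-tts` token (not skill chips or imperative phrasing), and it assumes a token in the middle of a message counts as no invocation, which the rules above do not specify.

```shell
# Hypothetical helper: classify where the invocation token sits in a message.
# Prints one of: suffix, standalone, prefix, none.
classify_invocation() {
  local msg="$1" token='$talktomepy-tts'
  # Trim leading and trailing whitespace.
  msg="${msg#"${msg%%[![:space:]]*}"}"
  msg="${msg%"${msg##*[![:space:]]}"}"
  if [ "$msg" = "$token" ]; then
    echo standalone            # message is only the invocation
  elif [ "${msg%"$token"}" != "$msg" ]; then
    echo suffix                # token ends the message
  elif [ "${msg#"$token"}" != "$msg" ]; then
    echo prefix                # token starts a longer message
  else
    echo none                  # assumption: mid-message tokens are ignored
  fi
}
```

For example, `classify_invocation 'Read me this paragraph. $talktomepy-tts'` prints `suffix`, so the text before the token is what gets spoken.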

No-prior-assistant fallback

  • Standalone invocation with no previous assistant message:
    • Explain there is no prior assistant message to read.
    • Ask whether the user wants to provide text, or wants current text spoken.
  • Prefix invocation with no previous assistant message:
    • Explain the chat has no earlier assistant message.
    • Ask whether to speak the current user text.
    • If the user is upset or confused, explain invocation-placement rules and how to trigger the behavior they want.
  • Suffix invocation:
    • Speak the preceding user text even when no prior assistant message exists.
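
The fallback rules reduce to a small decision table. The sketch below is illustrative only: `resolve_action` and the action names it prints are hypothetical, and the placement value is assumed to have been determined already (suffix, standalone, or prefix).

```shell
# Hypothetical helper: choose an action from the invocation placement and
# the previous assistant message (an empty string means there is none).
resolve_action() {
  local placement="$1" previous="$2"
  case "$placement" in
    suffix)
      # Suffix invocation never depends on a prior assistant message.
      echo speak_user_text ;;
    standalone|prefix)
      if [ -n "$previous" ]; then
        echo speak_previous_assistant
      else
        # Nothing to read: explain and ask the user what to speak.
        echo ask_user
      fi ;;
    *)
      echo no_invocation ;;
  esac
}
```

With this shape, `resolve_action standalone ""` prints `ask_user`, while `resolve_action suffix ""` still prints `speak_user_text`.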

Long-content handling

  • Estimate length using approximate whitespace-based word count.
  • If selected text is longer than about 250 words, ask before synthesis with choices:
    • Speak full
    • Summarize then speak (recommended)
    • Cancel
  • If the user chooses summary, generate a concise summary first, then synthesize the summary.
  • If the user chooses cancel, do not synthesize.
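
A whitespace-based estimate is easy to approximate with `wc -w`. This is a sketch under assumptions: `needs_confirmation` is a hypothetical name, and the 250-word threshold is treated as a strict "greater than" cutoff even though the rule above says "about 250".

```shell
# Hypothetical helper: succeed (exit 0) when the text is longer than the limit.
# $1: text to measure, $2: word limit (defaults to 250).
needs_confirmation() {
  local text="$1" limit="${2:-250}" count
  # wc -w counts whitespace-separated words; strip the padding some
  # platforms (including macOS) add around the number.
  count=$(printf '%s' "$text" | wc -w | tr -d '[:space:]')
  [ "$count" -gt "$limit" ]
}
```

A caller could then branch with `if needs_confirmation "$text"; then ...` and offer the three choices before synthesis.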

User dissatisfaction fallback

Execution order

  1. Resolve source text using the rules above.
  2. Apply long-content confirmation behavior if needed.
  3. Run the existing synthesis flow.
  4. Preserve existing load/retry/playback behavior.

What this skill does

  • Calls the local TalkToMePy v0.5+ service (/health, /model/load, /model/status, /synthesize/voice-design)
  • Handles async model loading behavior (/model/load may return 202)
  • Retries synthesis on 503 using Retry-After
  • Saves generated WAV output to ./tts_outputs in the current working directory by default
  • Plays audio via afplay on macOS
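
To illustrate the 503 retry behavior, the sketch below extracts a delay from a response header dump such as the one `curl -D` writes. It is not taken from the bundled script: `retry_after_seconds` is a hypothetical helper, the fallback of 2 seconds is an assumed value, and only the integer-seconds form of Retry-After is handled (an HTTP-date value falls back to the default).

```shell
# Hypothetical helper: print the number of seconds to wait before retrying.
# $1: file containing response headers, $2: fallback delay (assumed default: 2).
retry_after_seconds() {
  local headers_file="$1" fallback="${2:-2}" value
  # Take the last Retry-After header, keep its value, drop spaces and the CR.
  value=$(grep -i '^retry-after:' "$headers_file" | tail -n 1 \
    | cut -d: -f2- | tr -d '[:space:]')
  case "$value" in
    ''|*[!0-9]*) echo "$fallback" ;;   # missing or not plain seconds
    *)           echo "$value" ;;
  esac
}
```

A retry loop would call this after each 503 and `sleep` for the printed number of seconds before trying again.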

Preconditions

  • TalkToMePy service is running (default http://127.0.0.1:8000)
  • macOS afplay is available

Default workflow

  1. Resolve which text to speak using post-invocation resolution rules.
  2. Ensure service is healthy:
    • curl -fsS http://127.0.0.1:8000/health
  3. Trigger model load (idempotent):
    • curl -sS -X POST http://127.0.0.1:8000/model/load -H "Content-Type: application/json" -d '{"mode":"voice_design","strict_load":false}'
  4. Wait for ready state via /model/status
  5. Synthesize + save + play using bundled script:
    • scripts/speak_with_talktomepy.sh --text "..."
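
Step 4 amounts to polling with a timeout. The sketch below shows one generic way to do that and is not the bundled script's implementation: `wait_until` is a hypothetical helper, and the readiness check in the trailing comment assumes the /model/status response contains the word "ready", which this document does not confirm.

```shell
# Hypothetical helper: run a check command until it succeeds or time runs out.
# $1: maximum seconds to wait, $2: seconds between attempts, rest: the check.
wait_until() {
  local max_wait="$1" interval="$2"
  shift 2
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$max_wait" ]; then
      return 1                       # gave up: the check never succeeded
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Assumed usage (the response format of /model/status is not documented here):
#   model_ready() { curl -fsS http://127.0.0.1:8000/model/status | grep -qi ready; }
#   wait_until 120 2 model_ready || echo "model did not become ready" >&2
```

Passing the check as a command keeps the loop independent of the exact status payload, so only the one-line check needs to change once the real response shape is known.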

Script usage

scripts/speak_with_talktomepy.sh --text "Read this text aloud"
Defaults:
  • language: English
  • default style: energetic (warm/friendly/brisk feminine-or-androgynous)
  • output path: ./tts_outputs/tts-YYYYMMDD-HHMMSS.wav
Style preset flags:
  • --style-energetic
  • --style-soft
  • --style-neutral
Alternative style syntax:
  • --style energetic|soft|neutral
Optional flags:
  • --instruct "...": fully custom voice/style instruction
  • --language English: sets the language
  • --base-url http://127.0.0.1:8000: sets the service address
  • --save /path/output.wav: custom save path
  • --no-play: generate only, do not play
Optional env var overrides:
  • TALKTOMEPY_BASE_URL
  • TALKTOMEPY_OUTPUT_DIR
  • TALKTOMEPY_MAX_WAIT_SECONDS
  • TALKTOMEPY_MAX_SYNTH_RETRIES
  • TALKTOMEPY_DEFAULT_RETRY_AFTER_SECONDS
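
As a rough illustration of how the env var overrides and the default output path could fit together, here is a sketch under assumptions. The function names are hypothetical, and the bundled script may resolve these values differently (for example, how flags and env vars interact is not stated here). Only the default URL, the default directory, and the filename pattern come from the defaults listed above.

```shell
# Hypothetical helpers: fall back to the documented defaults when the
# corresponding environment variable is unset or empty.
resolve_base_url() {
  echo "${TALKTOMEPY_BASE_URL:-http://127.0.0.1:8000}"
}

resolve_output_dir() {
  echo "${TALKTOMEPY_OUTPUT_DIR:-./tts_outputs}"
}

# Build a timestamped path following the tts-YYYYMMDD-HHMMSS.wav pattern.
default_output_path() {
  printf '%s/tts-%s.wav\n' "$(resolve_output_dir)" "$(date +%Y%m%d-%H%M%S)"
}
```

Setting `TALKTOMEPY_OUTPUT_DIR=/tmp/speech` before calling `default_output_path` would yield a path under `/tmp/speech` instead of `./tts_outputs`.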

Automation Templates

Use $talktomepy-tts inside automation prompts so Codex loads the service checks and synthesis guardrails in this skill.
For ready-to-fill Codex App and Codex CLI (codex exec) templates, including unattended-safe defaults (--no-play) and placeholders, use:
  • references/automation-prompts.md

References

  • Automation prompt templates: references/automation-prompts.md
If synthesis fails, surface HTTP status/body and suggest checking:
  • /model/status
  • launchd logs: ~/Library/Logs/talktomepy.stderr.log