chat-with-anyone

Chat with Anyone

Chat with any real person or fictional character in their own voice by automatically finding their speech online, extracting a clean reference sample, and using it to generate replies.

Triggers

  • 我想跟xxx聊天 (I want to chat with xxx)
  • 你来扮演xxx跟我说话 (Play the role of xxx and talk to me)
  • 让xxx给我讲讲这篇文章 (Let xxx explain this article to me)
  • 用xxx的声音说 (Say this in xxx's voice)
  • Talk to me like xxx
  • Roleplay as xxx
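
The agent normally recognizes these phrases on its own. Purely as an illustration of what "xxx" stands for, the sketch below turns the listed triggers into regular expressions and pulls out the character name. The pattern list and the `extract_character` helper are hypothetical and not part of the skill.

```python
import re
from typing import Optional

# Hypothetical patterns derived from the trigger phrases listed above;
# "xxx" is replaced by a named capture group for the character.
TRIGGER_PATTERNS = [
    re.compile(r"我想跟(?P<name>.+?)聊天"),
    re.compile(r"你来扮演(?P<name>.+?)跟我说话"),
    re.compile(r"让(?P<name>.+?)给我讲讲"),
    re.compile(r"用(?P<name>.+?)的声音说"),
    re.compile(r"talk to me like (?P<name>.+)", re.IGNORECASE),
    re.compile(r"roleplay as (?P<name>.+)", re.IGNORECASE),
]


def extract_character(message: str) -> Optional[str]:
    """Return the character name if the message matches a trigger, else None."""
    for pattern in TRIGGER_PATTERNS:
        match = pattern.search(message)
        if match:
            # Trim surrounding spaces and sentence-ending punctuation.
            return match.group("name").strip(" .!?。!?")
    return None
```

A name extracted this way may still be ambiguous, so it would feed into the disambiguation step of the workflow rather than replace it.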

Workflow

When the user asks you to roleplay or chat as a specific character, follow these steps exactly:

1. Character Disambiguation

If the user's description is ambiguous (e.g., "US President", "Spider-Man actor"), ask for clarification first to determine the exact person or specific portrayal they want.

2. Find a Reference Video

Use your web search capabilities to find a YouTube, Bilibili, or TikTok video of the character speaking clearly.
  • Look for interviews, speeches, or monologues where there is little to no background music.
  • Grab the URL of the best candidate video.

3. Download Video and Subtitles

Use the youtube-downloader skill to download the video and its auto-generated subtitles. Wait for the download to complete before proceeding.

Example using youtube-downloader:

    python skills/youtube-downloader/scripts/download_video.py "VIDEO_URL" -o "tmp/character_audio" --audio-only --subtitles
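
If the download is driven from a script instead of typed by hand, waiting for it to finish can be made explicit. The sketch below is a minimal illustration, assuming the script path and flags shown in the example command. The helper names are hypothetical, and `sys.executable` stands in for `python`. The idea that subtitles land in the output directory as `.vtt` or `.srt` files is also an assumption about the downloader.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional

# Script path and flags are copied from the example command in this step.
DOWNLOAD_SCRIPT = "skills/youtube-downloader/scripts/download_video.py"


def build_download_command(video_url: str, out_dir: str) -> List[str]:
    """Build the argument list for the audio-plus-subtitles download."""
    return [sys.executable, DOWNLOAD_SCRIPT, video_url,
            "-o", out_dir, "--audio-only", "--subtitles"]


def find_subtitle_file(out_dir: str) -> Optional[Path]:
    """Return the first .vtt or .srt file under out_dir, or None if absent."""
    candidates = sorted(
        p for p in Path(out_dir).rglob("*")
        if p.suffix.lower() in {".vtt", ".srt"}
    )
    return candidates[0] if candidates else None


def download_and_locate_subtitles(video_url: str, out_dir: str) -> Optional[Path]:
    """Run the download, block until it exits, then look for subtitles."""
    # subprocess.run waits for the child process to finish; check=True
    # raises CalledProcessError if the script exits with a non-zero code.
    subprocess.run(build_download_command(video_url, out_dir), check=True)
    return find_subtitle_file(out_dir)
```

A `None` result would mean no subtitle file was found, in which case another candidate video is probably the simplest fallback.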

4. Extract Audio Segment

Read the downloaded subtitle file (e.g., .vtt or .srt) to find a continuous 10-30 second segment where the character is speaking clearly without long pauses. Note the start and end timestamps.
Use ffmpeg to extract this specific audio segment as a .wav file to use as the reference audio.

Example: Extracting audio from 00:01:15 to 00:01:30:

    ffmpeg -y -i "tmp/character_audio/VideoTitle.m4a" -ss 00:01:15 -to 00:01:30 -c:a pcm_s16le -ar 24000 -ac 1 "skills/chat-with-anyone/character_name_ref.wav"
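
The segment search described in this step can be sketched as follows. This is one possible approach, not the skill's own implementation: it reads the cue timings from a .vtt or .srt file, then looks for the first run of cues that lasts 10-30 seconds with no pause longer than a threshold. The function names and the 1-second `max_gap` default are assumptions chosen for illustration.

```python
import re
from typing import List, Optional, Tuple

# Matches "HH:MM:SS.mmm" (WebVTT, hours optional) and "HH:MM:SS,mmm" (SRT).
_TS = r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})"
_CUE = re.compile(_TS + r"\s*-->\s*" + _TS)


def _seconds(h, m, s, ms) -> float:
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(text: str) -> List[Tuple[float, float]]:
    """Return (start, end) in seconds for every cue timing line."""
    return [(_seconds(*m.groups()[:4]), _seconds(*m.groups()[4:]))
            for m in _CUE.finditer(text)]


def find_segment(cues: List[Tuple[float, float]], min_len: float = 10.0,
                 max_len: float = 30.0,
                 max_gap: float = 1.0) -> Optional[Tuple[float, float]]:
    """Return the first run of cues lasting min_len..max_len seconds
    in which no pause between consecutive cues exceeds max_gap."""
    for i, (start, end) in enumerate(cues):
        for next_start, next_end in cues[i + 1:]:
            if end - start >= min_len:
                break  # already long enough
            if next_start - end > max_gap or next_end - start > max_len:
                break  # long pause, or the run would grow too long
            end = max(end, next_end)
        if min_len <= end - start <= max_len:
            return start, end
    return None


def to_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm for ffmpeg's -ss and -to options."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"
```

Subtitle timing only shows where speech occurs. It says nothing about background music or audio quality, so the chosen clip is still worth a listen before it is used as the reference.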

5. Generate Speech and Roleplay

Respond to the user's prompt while staying in character. Use the tts skill with the extracted audio as --ref-audio to generate the spoken response.

Example using tts skill:

    bash skills/tts/scripts/tts.sh speak -t "Hello there! I am ready to chat with you." --ref-audio "skills/chat-with-anyone/character_name_ref.wav" -o "output.wav"
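
Before passing the clip to --ref-audio, it may help to confirm that the file looks like what step 4 was meant to produce. The sketch below uses Python's standard `wave` module to check duration, sample rate, channel count, and sample width. The expected values mirror the 10-30 second guideline and the ffmpeg flags (`pcm_s16le`, `-ar 24000`, `-ac 1`) from step 4. Whether the tts backend strictly requires them is an assumption, and `check_reference_wav` is a hypothetical helper.

```python
import wave
from typing import List


def check_reference_wav(path: str, min_s: float = 10.0, max_s: float = 30.0,
                        rate: int = 24000, channels: int = 1) -> List[str]:
    """Return a list of problems found in the clip; an empty list means it passed."""
    problems = []
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
        if not min_s <= duration <= max_s:
            problems.append(f"duration {duration:.1f}s outside {min_s}-{max_s}s")
        if wav.getframerate() != rate:
            problems.append(f"sample rate {wav.getframerate()} != {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
        if wav.getsampwidth() != 2:  # pcm_s16le stores 2 bytes per sample
            problems.append(f"sample width {wav.getsampwidth()} bytes, expected 2")
    return problems
```

An empty result only means the container parameters match. It does not guarantee that the speech itself is clean.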

Example Conversation

User: 我想跟特朗普聊天,让他给我讲个睡前故事。 (I want to chat with Trump; have him tell me a bedtime story.)
Agent:
  1. (Internal) Identify character: Donald Trump.
  2. (Internal) Find a good speech video URL:
    https://www.youtube.com/watch?v=xxxxxxxx
  3. (Internal) Download audio & subs:
    python skills/youtube-downloader/scripts/download_video.py "https://www.youtube.com/watch?v=xxxxxxxx" -o tmp/trump --audio-only --subtitles
  4. (Internal) Read subtitle, find good segment (e.g. 00:02:10 to 00:02:30).
  5. (Internal) Extract clean wav:
    ffmpeg -y -i "tmp/trump/audio.m4a" -ss 00:02:10 -to 00:02:30 -c:a pcm_s16le -ar 24000 -ac 1 "skills/chat-with-anyone/trump_ref.wav"
  6. (Internal) Generate TTS:
    bash skills/tts/scripts/tts.sh speak -t "Let me tell you a tremendous story, maybe the best story ever told..." --ref-audio "skills/chat-with-anyone/trump_ref.wav" -o "trump_story.wav"
  7. Agent replies: "Here is the audio of the bedtime story from Donald Trump!" (Presents trump_story.wav and the text).

Dependencies

  • youtube-downloader: For fetching videos and subtitles.
  • ffmpeg: For trimming and converting audio formats.
  • tts: For generating the final speech using --ref-audio (typically requires the Noiz backend for voice cloning).
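
A quick preflight check can catch a missing dependency before the workflow starts. The sketch below is illustrative only. It assumes ffmpeg is expected on the PATH and that the two skill scripts live at the relative paths used in the example commands. `missing_dependencies` is a hypothetical helper, and it does not verify that the Noiz backend is configured.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

# Script paths are taken from the example commands in this document.
REQUIRED_SCRIPTS = (
    "skills/youtube-downloader/scripts/download_video.py",
    "skills/tts/scripts/tts.sh",
)


def missing_dependencies(executables: Iterable[str] = ("ffmpeg",),
                         scripts: Iterable[str] = REQUIRED_SCRIPTS,
                         root: str = ".") -> List[str]:
    """Return the executables not found on PATH and the scripts not found under root."""
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [s for s in scripts if not (Path(root) / s).is_file()]
    return missing
```

Calling `missing_dependencies()` from the project root returns an empty list when everything it checks is in place.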