audio-downloader

Audio Downloader

Batch-download audio resources from websites, including sites that require login, with automatic deduplication and report generation.

📖 Core Process

1. Access Page

python
browser_use(action="open", url=target_url)
browser_use(action="snapshot")  # Check if login is required

2. Determine Intent

  • If the user says "download all" or "batch download" → download every audio file in the playlist
  • If the user says "download this" or "save audio" → download only the current audio file
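
The two rules above can be sketched as a small keyword matcher. This is only an illustration: the function name, the return values, and the substring-matching strategy are assumptions, and only the English trigger phrases come from the list above.

```python
# Trigger phrases taken from the rules above; the matching strategy is an assumption.
BATCH_PHRASES = ("download all", "batch download")
SINGLE_PHRASES = ("download this", "save audio")


def detect_download_mode(request: str) -> str:
    """Return "playlist", "current", or "unknown" for a user request."""
    text = request.lower()
    # Batch phrases are checked first, so a request containing both kinds
    # of phrase is treated as a playlist download.
    if any(phrase in text for phrase in BATCH_PHRASES):
        return "playlist"
    if any(phrase in text for phrase in SINGLE_PHRASES):
        return "current"
    return "unknown"
```

A request that matches neither list returns "unknown", which leaves room to ask the user for clarification rather than guessing.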

3. Collect Audio URLs and Save

Current Audio

javascript
() => {
  // Use the first <audio> element on the page.
  const audio = document.querySelector('audio');
  if (!audio) return JSON.stringify({urls: []});
  // Prefer the resolved source, then src, then a nested <source> element.
  const url = audio.currentSrc || audio.src || audio.querySelector('source')?.src;
  if (!url) return JSON.stringify({urls: []});
  return JSON.stringify({urls: [{index: 1, url, name: document.title || 'audio'}]});
}

Playlist

Select a method based on the page's characteristics:
  1. Network interception: intercept fetch/XHR requests to capture audio URLs
  2. Source extraction: match audio URLs in the page source or in global variables
  3. Click to load: click each playlist item in turn and read the URL from the <audio> tag
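
As an illustration of the source-extraction method, the sketch below scans a page's source text for URLs that end in a common audio extension. The extension list, the pattern, and the function name are assumptions chosen for the example; a real site may need a different pattern, or one of the other two methods.

```python
import re

# Assumed set of audio extensions; adjust for the target site.
AUDIO_URL_PATTERN = re.compile(
    r"""https?://[^\s"'<>\\]+?\.(?:mp3|m4a|aac|ogg|wav|flac)\b(?:\?[^\s"'<>\\]*)?""",
    re.IGNORECASE,
)


def extract_audio_urls(page_source: str) -> list[str]:
    """Return audio URLs found in the page source, in order, without duplicates."""
    seen = set()
    urls = []
    for match in AUDIO_URL_PATTERN.finditer(page_source):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The query string is kept as part of the match because signed or tokenized links usually stop working without it.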

Save URL List

Save the collected URLs as a JSON file:
json
{
  "urls": [
    {"index": 1, "name": "audio name", "url": "https://..."},
    {"index": 2, "name": "audio name", "url": "https://..."}
  ]
}
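
One way to produce a file in this shape is sketched below. The function name and the default file name are assumptions, and dropping repeated URLs at this stage is a choice made for the example; the source does not say where the actual tool deduplicates.

```python
import json
from pathlib import Path


def save_url_list(items: list[dict], path: str = "urls.json") -> dict:
    """Write {"urls": [...]} to path, dropping repeated URLs and renumbering."""
    seen = set()
    unique = []
    for item in items:
        url = item.get("url")
        if not url or url in seen:
            continue  # skip entries without a URL and repeated URLs
        seen.add(url)
        unique.append(
            {"index": len(unique) + 1, "name": item.get("name") or "audio", "url": url}
        )
    payload = {"urls": unique}
    # ensure_ascii=False keeps non-ASCII track names readable in the file.
    Path(path).write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return payload
```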

4. Obtain Authentication Information

javascript
() => JSON.stringify({
  // document.cookie omits HttpOnly cookies, so some session cookies may be missing.
  cookies: document.cookie,
  // The current page URL is later passed as the Referer.
  referer: window.location.href
})
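
The JSON string returned by this snippet could be turned into HTTP request headers roughly as follows. The function name and the default User-Agent value are assumptions made for the sketch; only the `cookies` and `referer` keys come from the snippet above.

```python
import json


def build_headers(auth_json: str, user_agent: str = "Mozilla/5.0") -> dict:
    """Turn the JSON string returned by the page snippet into request headers."""
    auth = json.loads(auth_json)
    headers = {"User-Agent": user_agent}  # assumed default, not from the source
    if auth.get("referer"):
        headers["Referer"] = auth["referer"]
    if auth.get("cookies"):
        headers["Cookie"] = auth["cookies"]
    return headers
```

Empty values are left out, so a page without readable cookies simply produces no Cookie header.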

5. Batch Download

bash
python scripts/audio_downloader.py urls.json -k "keyword" -r "Referer" -c "Cookie"
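
The internals of `scripts/audio_downloader.py` are not shown here, so the following is only a sketch of how a batch download loop with deduplication, a delay between requests, and a simple report might look using the standard library. Every name in it (`fetch`, `download_all`, the `fetch_fn` parameter, the zero-padded `.mp3` file naming, the report layout) is hypothetical and not taken from the actual script.

```python
import time
import urllib.request
from pathlib import Path


def fetch(url: str, headers: dict) -> bytes:
    """Download one URL with the given headers (standard library only)."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def download_all(items, out_dir, headers, delay=1.0, fetch_fn=fetch):
    """Download each unique URL once; return a report of saved and failed entries."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    report = {"saved": [], "failed": []}
    seen = set()
    for item in items:
        url = item["url"]
        if url in seen:  # skip URLs that were already handled
            continue
        seen.add(url)
        # Hypothetical naming scheme: zero-padded index plus a fixed .mp3 suffix.
        target = out / f"{item['index']:03d}.mp3"
        try:
            target.write_bytes(fetch_fn(url, headers))
            report["saved"].append(str(target))
        except OSError as error:  # urllib.error.URLError is a subclass of OSError
            report["failed"].append({"url": url, "error": str(error)})
        time.sleep(delay)  # pause between requests
    return report
```

Passing the fetch function as a parameter keeps the loop testable without network access; in normal use the default `fetch` is enough.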

📊 Command Line Parameters

python scripts/audio_downloader.py url_file -k KEYWORD -r REFERER [-c COOKIES] [-d DELAY]

Parameters:
  url_file    URL JSON file
  -k          Keyword (required)
  -r          Referer URL (required)
  -c          Cookie string
  -d          Download interval in seconds
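
For readers who want to see how an interface like this is typically declared, here is a minimal `argparse` sketch that mirrors the parameters listed above. It is not the actual script: the `dest` names, the empty-string default for `-c`, and the 1.0 second default for `-d` are assumptions, since the source does not state any defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options; defaults are assumptions."""
    parser = argparse.ArgumentParser(prog="audio_downloader.py")
    parser.add_argument("url_file", help="URL JSON file")
    parser.add_argument("-k", dest="keyword", required=True, help="Keyword")
    parser.add_argument("-r", dest="referer", required=True, help="Referer URL")
    parser.add_argument("-c", dest="cookies", default="", help="Cookie string")
    # The 1.0 second default is an assumption; the source does not give one.
    parser.add_argument(
        "-d", dest="delay", type=float, default=1.0,
        help="Download interval in seconds",
    )
    return parser
```

For example, `build_parser().parse_args(["urls.json", "-k", "keyword", "-r", "https://example.com"])` yields a namespace whose `cookies` and `delay` fields fall back to the assumed defaults.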

Version: 0.0.1 Author: pax