audio-downloader

Audio Downloader

Batch-download audio resources from websites, including sites that require login, with automatic deduplication and report generation.

📖 Core Process

1. Access Page

python

browser_use(action="open", url=target_url)
browser_use(action="snapshot")  # Check if login is required

2. Determine Intent

If the user says "download all" or "batch download" → download every audio file in the playlist
If the user says "download this" or "save audio" → download only the current audio
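
The rule above can be sketched as a small phrase matcher. This is only an illustration: `classify_intent` and the `"unknown"` fallback are hypothetical names introduced here, and the phrase lists contain just the trigger phrases quoted in this document.

```python
# Hypothetical helper: the phrase lists hold only the examples quoted in this
# document (English and Chinese) and are not an exhaustive rule set.
BATCH_PHRASES = ("download all", "batch download", "下载全部", "批量下载")
SINGLE_PHRASES = ("download this", "save audio", "下载这个", "保存音频")


def classify_intent(message: str) -> str:
    """Return "playlist", "current", or "unknown" for a user message."""
    text = message.lower()
    # Batch phrases are checked first so "download all of this" counts as batch.
    if any(phrase in text for phrase in BATCH_PHRASES):
        return "playlist"
    if any(phrase in text for phrase in SINGLE_PHRASES):
        return "current"
    # Assumption: an unmatched message should be clarified with the user.
    return "unknown"
```

A caller would branch on the result, for example collecting every playlist URL for `"playlist"` and only the `<audio>` element's URL for `"current"`.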

3. Collect Audio URLs and Save

Current Audio

javascript

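() => {
  // Find the first <audio> element on the page
  const audio = document.querySelector('audio');
  if (!audio) return JSON.stringify({urls: []});
  // Prefer the element's own src, then fall back to a nested <source>
  const url = audio.src || audio.querySelector('source')?.src;
  // No resolvable URL: return an empty list instead of an entry without a url
  if (!url) return JSON.stringify({urls: []});
  return JSON.stringify({urls: [{index: 1, url, name: document.title || 'audio'}]});
}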

Playlist

Select a method based on the page's characteristics:

Network interception: intercept fetch/XHR requests to capture audio URLs
Source code extraction: match audio URLs in the page source or in global variables
Click to load: click each playlist item in turn and read the URL from the `<audio>` tag
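
As an illustration of the source code extraction method, the sketch below scans a page-source string for audio URLs with a regular expression. The function name `extract_audio_urls`, the extension list, and the generated `audio_N` names are assumptions made for this example. URLs stored in escaped form (such as `https:\/\/...` inside JSON) would need unescaping first.

```python
import re

# Assumption: audio files can be recognized by these common extensions.
AUDIO_URL_RE = re.compile(
    r"""https?://[^\s"'<>\\]+?\.(?:mp3|m4a|aac|ogg|wav|flac)(?:\?[^\s"'<>\\]*)?""",
    re.IGNORECASE,
)


def extract_audio_urls(page_source: str) -> list[dict]:
    """Return unique audio URLs in first-seen order, shaped like urls.json entries."""
    seen = set()
    entries = []
    for match in AUDIO_URL_RE.finditer(page_source):
        url = match.group(0)
        if url in seen:
            continue  # the same URL often appears in both markup and scripts
        seen.add(url)
        position = len(entries) + 1
        # Placeholder name; a real page would usually supply a track title.
        entries.append({"index": position, "name": f"audio_{position}", "url": url})
    return entries
```

The page source itself could come from the browser snapshot or from a plain HTTP fetch, depending on whether the page renders its playlist client-side.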

Save URL List

Save the collected URLs as a JSON file:

json

{
  "urls": [
    {"index": 1, "name": "audio name", "url": "https://..."},
    {"index": 2, "name": "audio name", "url": "https://..."}
  ]
}
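
A minimal sketch of writing that file from Python follows. `save_url_list` is a hypothetical helper, and deduplicating at save time is an assumption; the document does not say at which stage the tool removes duplicates.

```python
import json
from pathlib import Path


def save_url_list(entries: list[dict], path: str = "urls.json") -> int:
    """Drop duplicate URLs, renumber from 1, and write {"urls": [...]} to path."""
    unique = []
    seen = set()
    for entry in entries:
        url = entry.get("url")
        if not url or url in seen:
            continue  # skip entries without a URL and repeats of an earlier URL
        seen.add(url)
        unique.append({
            "index": len(unique) + 1,
            "name": entry.get("name") or "audio",
            "url": url,
        })
    # ensure_ascii=False keeps non-ASCII track names readable in the file.
    text = json.dumps({"urls": unique}, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")
    return len(unique)
```

The return value is the number of entries written, which makes it easy to report how many duplicates were dropped.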

4. Obtain Authentication Information

javascript

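() => JSON.stringify({
  // document.cookie only exposes cookies that are not flagged HttpOnly
  cookies: document.cookie,
  // The current page URL, used as the Referer value in the download step
  referer: window.location.href
})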
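
To show how the returned values might be used, the sketch below turns that JSON string into request headers with Python's standard `urllib.request`. `build_request` is a hypothetical helper and is not taken from `scripts/audio_downloader.py`, whose internals are not shown in this document.

```python
import json
import urllib.request


def build_request(audio_url: str, auth_json: str) -> urllib.request.Request:
    """Attach the Referer and, when present, the Cookie header to a download request."""
    auth = json.loads(auth_json)
    headers = {"Referer": auth["referer"]}
    # Cookies are optional: public pages may return an empty cookie string.
    if auth.get("cookies"):
        headers["Cookie"] = auth["cookies"]
    return urllib.request.Request(audio_url, headers=headers)
```

The resulting object can be passed to `urllib.request.urlopen` to fetch the file with the same session context as the browser page.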

5. Batch Download

bash

python scripts/audio_downloader.py urls.json -k "keyword" -r "Referer" -c "Cookie"

📊 Command Line Parameters

python scripts/audio_downloader.py url_file -k KEYWORD -r REFERER [-c COOKIES] [-d DELAY]

Parameters:
  url_file    URL JSON file
  -k          Keyword (required)
  -r          Referer URL (required)
  -c          Cookie string (optional)
  -d          Download interval in seconds (optional)
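
For readers who want to see how such an interface is typically declared, here is a sketch using `argparse` that mirrors the usage line above. It is not the actual source of `scripts/audio_downloader.py`; the `dest` names and the default delay of 1.0 seconds are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the positional url_file plus the -k, -r, -c and -d options."""
    parser = argparse.ArgumentParser(prog="audio_downloader.py")
    parser.add_argument("url_file", help="URL JSON file")
    parser.add_argument("-k", dest="keyword", required=True, help="Keyword")
    parser.add_argument("-r", dest="referer", required=True, help="Referer URL")
    parser.add_argument("-c", dest="cookies", default=None, help="Cookie string")
    # Assumption: the real script's default interval is not documented here.
    parser.add_argument("-d", dest="delay", type=float, default=1.0,
                        help="Download interval in seconds")
    return parser
```

Calling `build_parser().parse_args(["urls.json", "-k", "keyword", "-r", "https://example.com/play"])` yields a namespace with `cookies` set to `None` and `delay` set to the assumed default.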

Version: 0.0.1 Author: pax