x-scraper
X.com Post Scraper
Extracts recent posts from X.com users with full engagement data using authenticated cookies.
Quick Start
Basic command:
```bash
cd .opencode/skills/x-scraper/scripts
python3 scraper.py <username> [count]
```
Example:
```bash
python3 scraper.py example_user 15
```
Output: /tmp/x_{username}_posts.json
Prerequisites
Before first use, verify environment requirements:
- Python 3.11+: Check with
python3 --version - Playwright: Check with
python3 -c "import playwright" - Cookie file: Check with
ls /tmp/x_cookies_pw.json
If any prerequisite is missing, see references/setup.md for a detailed installation and configuration guide.
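The three checks above can also be run in one pass. The following is a minimal sketch, not part of the skill's scripts: the function name `check_prerequisites` is hypothetical, and the Playwright check only confirms that the package is importable, not that browser binaries are installed.

```python
import importlib.util
import sys
from pathlib import Path

# Default cookie path named in this document.
DEFAULT_COOKIE_FILE = "/tmp/x_cookies_pw.json"


def check_prerequisites(cookie_file: str = DEFAULT_COOKIE_FILE) -> dict[str, bool]:
    """Return a pass/fail flag for each prerequisite (hypothetical helper)."""
    return {
        # Python 3.11 or newer.
        "python": sys.version_info >= (3, 11),
        # Playwright package can be located on the import path.
        "playwright": importlib.util.find_spec("playwright") is not None,
        # Cookie file exists on disk.
        "cookie_file": Path(cookie_file).is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Any entry reported as missing points back to references/setup.md.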
Common Workflows
First-time setup
See references/setup.md for complete environment configuration.
Daily scraping
```bash
python3 scraper.py <username> [count]
```
Custom cookie file
```bash
python3 scraper.py <username> [count] --cookie-file /path/to/cookies.json
```
Troubleshooting
If the scraper fails, see references/troubleshooting.md for common issues and solutions.
Output Format
```json
{
  "index": 1,
  "username": "example_user",
  "postId": "1234567890123456789",
  "publishTime": "2025-12-03T18:28:32.000Z",
  "postLink": "https://x.com/example_user/status/1234567890123456789",
  "textContent": "Post text content...",
  "views": "471K",
  "likes": "1.1K",
  "retweets": "153",
  "replies": "44"
}
```
Key fields:
- postLink - Direct URL to post
- publishTime - ISO 8601 timestamp
- views/likes/retweets/replies - Abbreviated metrics (K, M)
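Because the metric fields are abbreviated strings, they need converting before they can be sorted or summed. The sketch below shows one way to do that; `parse_metric` and `load_posts` are hypothetical helpers, and it assumes the output file holds a JSON array of objects shaped like the example above (a single object is also accepted).

```python
import json
from decimal import Decimal

# Multipliers for the abbreviated suffixes described in this document.
SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_metric(text: str) -> int:
    """Convert an abbreviated count such as "1.1K" to an integer."""
    cleaned = text.strip().replace(",", "").upper()
    if cleaned and cleaned[-1] in SUFFIXES:
        # Decimal avoids float rounding on values like "1.1K".
        return int(Decimal(cleaned[:-1]) * SUFFIXES[cleaned[-1]])
    return int(cleaned)


def load_posts(path: str) -> list[dict]:
    """Load scraped posts and add numeric copies of the metric fields.

    Assumption: the file is a JSON array of post objects.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    posts = data if isinstance(data, list) else [data]
    for post in posts:
        for field in ("views", "likes", "retweets", "replies"):
            if post.get(field):  # skip missing or empty metrics
                post[f"{field}_count"] = parse_metric(post[field])
    return posts
```

For example, `load_posts("/tmp/x_example_user_posts.json")` would return each post with extra `views_count`, `likes_count`, `retweets_count`, and `replies_count` keys. A metric string in a format not covered here raises an exception, so nothing is silently misread.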
When to Use This Skill
Trigger when the user requests:
- "整理 @某人 最近的发言" ("Summarize @someone's recent posts")
- "看看某人在X上说了什么" ("See what someone has said on X")
- "Scrape X.com posts from @username"
- "Get latest tweets from user"
- "Analyze X user's recent posts"
Available Scripts
scraper.py - Main scraper
```bash
python3 scraper.py <username> [count] [--cookie-file <path>]
```
- Scrapes user timeline with replies
- Default count: 10 posts
- Default cookie: /tmp/x_cookies_pw.json
convert_cookies.py - Cookie converter
```bash
python3 convert_cookies.py <input-file> [output-file]
```
- Converts Cookie-Editor JSON to Playwright format
- Required before first scraping
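To illustrate what a Cookie-Editor to Playwright conversion involves, here is a rough sketch. It is not the code in convert_cookies.py. It assumes the Cookie-Editor export is a JSON array using `expirationDate` and lowercase `sameSite` values, that the target is the list-of-cookies shape accepted by Playwright's `add_cookies`, and that an unspecified `sameSite` falls back to `"Lax"`.

```python
import json

# Cookie-Editor sameSite values mapped to the values Playwright accepts.
SAME_SITE = {"strict": "Strict", "lax": "Lax", "no_restriction": "None"}


def to_playwright(cookie: dict) -> dict:
    """Convert one Cookie-Editor cookie to a Playwright-style cookie (sketch)."""
    raw_same_site = str(cookie.get("sameSite") or "").lower()
    return {
        "name": cookie["name"],
        "value": cookie["value"],
        "domain": cookie["domain"],
        "path": cookie.get("path", "/"),
        # Playwright uses -1 to mark session cookies without an expiry.
        "expires": cookie.get("expirationDate", -1),
        "httpOnly": bool(cookie.get("httpOnly", False)),
        "secure": bool(cookie.get("secure", False)),
        # Assumption: fall back to "Lax" when the value is missing or unknown.
        "sameSite": SAME_SITE.get(raw_same_site, "Lax"),
    }


def convert_file(input_path: str, output_path: str) -> int:
    """Convert a whole export file and return the number of cookies written."""
    with open(input_path, encoding="utf-8") as source:
        converted = [to_playwright(item) for item in json.load(source)]
    with open(output_path, "w", encoding="utf-8") as target:
        json.dump(converted, target, indent=2)
    return len(converted)
```

Extension-specific keys such as `hostOnly` and `storeId` are dropped because only the keys listed above are carried over.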
Reference Documents
- setup.md - Complete environment setup guide (Python, Playwright, cookies)
- troubleshooting.md - Error diagnosis and solutions
- usage.md - Detailed usage examples and advanced options
Limitations
- Requires X.com authentication cookies
- Cookies expire after roughly 7 days and must then be re-exported
- Rate limits may apply
- Cannot access private/protected accounts