Fetch and parse WeChat Official Account articles. The skill extracts the title, author, official account name, main content, images, and metadata from a WeChat article link. Use it when a user provides a WeChat article link (mp.weixin.qq.com/s/...) and wants to read, extract, download, or convert the article content. Typical scenarios include downloading WeChat articles, extracting text or metadata from them, converting them to Markdown, and saving them locally along with their images. Keywords: WeChat Official Account, article acquisition, article scraping, article download.
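As a rough illustration of what counts as a WeChat article link, the sketch below checks a URL before it is handed to the fetcher. The helper name `looks_like_wechat_article` is hypothetical and not part of the skill. The check assumes two link shapes: the short form `/s/<id>`, and a long form with `/s` plus a `__biz` query parameter.

```python
from urllib.parse import urlparse


def looks_like_wechat_article(url: str) -> bool:
    """Return True if the URL is shaped like a WeChat Official Account article link.

    Hypothetical helper for illustration; the skill's own validation may differ.
    """
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https"):
        return False
    if parts.hostname != "mp.weixin.qq.com":
        return False
    # Short links carry the article ID in the path: /s/<id>
    if parts.path.startswith("/s/"):
        return len(parts.path) > len("/s/")
    # Assumed long form: /s?__biz=...&mid=...
    return parts.path == "/s" and "__biz=" in parts.query
```

A caller could filter user input with this check and pass only the matching links on to `scripts/fetch_wechat_article.py`.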
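Install the skill:
npx skill4agent add wwwzhouhui/skills_collection wechat-article-fetcher

Fetch a single article:
python scripts/fetch_wechat_article.py "https://mp.weixin.qq.com/s/xxxxx"

Fetch several articles, with URLs separated by spaces or by commas:
python scripts/fetch_wechat_article.py "url1" "url2" "url3" --output-dir ./output
python scripts/fetch_wechat_article.py "url1,url2,url3" --output-dir ./output

Fetch metadata only, as JSON:
python scripts/fetch_wechat_article.py "https://mp.weixin.qq.com/s/xxxxx" --json

Install the Python dependencies:
pip install beautifulsoup4 html2text requests

Fetch an article and save it to a directory:
python scripts/fetch_wechat_article.py "<url>" --output-dir ./output

Saved files are laid out as follows:
output/<account name>/<date>_<title>/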
├── index.html # Formatted standalone HTML file
├── article.md # Markdown version
├── meta.json # Article metadata
└── images/ # Downloaded images

Fetch metadata only, as JSON:
python scripts/fetch_wechat_article.py "<url>" --json

Metadata fields: title, author, account_nickname, description, create_time, content_text, content_markdown, cover_image, source_url

Batch fetch, with URLs separated by spaces or by commas:
python scripts/fetch_wechat_article.py "url1" "url2" "url3" --output-dir ./output
python scripts/fetch_wechat_article.py "url1,url2,url3" --output-dir ./output

Set the interval between requests in a batch:
python scripts/fetch_wechat_article.py "url1" "url2" --interval 5

Skip image downloads:
python scripts/fetch_wechat_article.py "<url>" --no-images

Use from Python:
from scripts.fetch_wechat_article import fetch_article, batch_fetch
# Fetch and save single article
result = fetch_article("https://mp.weixin.qq.com/s/xxxxx", output_dir="./output")
print(result['title'], result['path'])
# Only fetch metadata for single article
meta = fetch_article("https://mp.weixin.qq.com/s/xxxxx", json_only=True)
print(meta['title'])
print(meta['content_text'][:200])
# Batch fetch
urls = ["https://mp.weixin.qq.com/s/aaa", "https://mp.weixin.qq.com/s/bbb"]
stats = batch_fetch(urls, output_dir="./output", interval=3.0)
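print(f"Success: {stats['success']} articles, Fail: {stats['fail']} articles")

fetch_article parameters:
url
output_dir (default ./wechat_articles)
download_img (default True)
to_markdown (default True)
json_only

batch_fetch parameters:
urls
interval (default 3.0)

Notes:
Link formats: /s/xxxxx, __biz
Request pacing: --interval
Image attributes: data-src, src
Request header: Referer: https://mp.weixin.qq.com/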
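
The `data-src`, `src`, and `Referer` items above point at how article images are located and requested. The sketch below shows one plausible reading, not the skill's actual code: it assumes an image's real URL sits in `data-src` (with `src` as a fallback) and that image requests send the `Referer` header shown above. The names `ImageURLCollector`, `collect_image_urls`, and `image_request_headers` are hypothetical. It uses only the standard library so that it runs without the skill's dependencies.

```python
from html.parser import HTMLParser

# Header value taken from the notes above; applying it to image requests is an assumption.
WECHAT_REFERER = "https://mp.weixin.qq.com/"


class ImageURLCollector(HTMLParser):
    """Collect image URLs from <img> tags, preferring data-src over src."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Assumed: data-src holds the real URL when the page lazy-loads images.
        url = attr_map.get("data-src") or attr_map.get("src")
        # Skip inline data: URIs, which are placeholders rather than downloadable files.
        if url and not url.startswith("data:"):
            self.urls.append(url)


def collect_image_urls(html: str) -> list[str]:
    """Return image URLs in document order."""
    collector = ImageURLCollector()
    collector.feed(html)
    return collector.urls


def image_request_headers() -> dict[str, str]:
    """Headers to send with each image request (hypothetical helper)."""
    return {"Referer": WECHAT_REFERER}
```

With the `requests` package from the dependency list, each collected URL could then be downloaded with `requests.get(url, headers=image_request_headers(), timeout=10)`.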