Search Results: crawler

Found 47 Skills

site-crawler

Crawl and extract content from websites

Tools & Utilities | leobrival/serum-plugins-o...

website-crawler

High-performance web crawler for discovering and mapping website structure. Use when users ask to crawl a website, map site structure, discover pages, find all URLs on a site, analyze link relationships, or generate site reports. Supports sitemap discovery, checkpoint/resume, rate limiting, and HTML report generation.

11 scripts | Attention

Data Processing | gn00678465/crawler-skill

crawler

Fetches web pages and converts them to clean markdown using a robust 3-tier chain (Firecrawl → Jina Reader → Scrapling stealth browser). Use this skill instead of WebFetch whenever the user provides a URL and needs the page's text content — especially for sites that block direct access: medium.com articles (paywalled/metered), WeChat public accounts (mp.weixin.qq.com, geo-restricted), documentation sites with bot protection, or any page where simple HTTP fetching might return a CAPTCHA or empty page. Triggers for: "read this URL", "summarize this article/page", "grab the content from", "extract text from", "what does this page say", "fetch this link", or any request to access and process a specific web page. Do NOT trigger for: building scrapers, checking HTTP status codes, parsing already-downloaded HTML files, answering conceptual questions about scraping tools, or monitoring page changes.

6 scripts | Checked

Data Processing | alpoxdev/hypercore

crawler

[Hyper] Investigate websites with Playwriter plus CDP to choose a crawl strategy, capture API/auth evidence, document findings under `.hypercore/crawler/[site]/`, and generate crawler code only after discovery is grounded.

Data Processing | vibery-studio/templates

udemy-crawler

Extract Udemy course content to Markdown. Use when the user asks to scrape or crawl Udemy course pages.

Document Processing | garrytan/gbrain

archive-crawler

Universal archivist for personal file archives (Dropbox/B2/Gmail-takeout/local-mount/hard-drive-dump). Filters for high-value content (the user's own writing, ideas, relationships) and surfaces it interactively. REFUSES TO RUN without an explicit gbrain.yml `archive-crawler.scan_paths:` allow-list.

Data Processing | leobrival/serum-plugins-o...

web-crawler

High-performance Rust web crawler with stealth mode, LLM-ready Markdown export, multi-format output, sitemap discovery, and robots.txt support. Optimized for content extraction, site mapping, structure analysis, and LLM/RAG pipelines.

Marketing & Growth | zubair-trabzada/geo-seo-c...

geo-crawlers

AI crawler access analysis. Checks robots.txt, meta tags, and HTTP headers to determine which AI crawlers can access the site. Provides a complete access map and recommendations for maximizing AI visibility while maintaining appropriate control.

Data Processing | starchild-ai-agent/offici...

web-crawler

Use when normal web_fetch cannot read a page, when a site blocks basic fetching, or when the user needs YouTube content in AI-ready form. Guides fallback use of Firecrawl for single-page web scraping and SerpApi for YouTube search/video metadata/transcripts only.

Data Processing | nanmicoder/newscrawler

china-news-crawler

Content extraction for Chinese news sites. Supports WeChat Official Accounts, Toutiao, NetEase News, Sohu News, and Tencent News. Activated when users need to extract Chinese news content, crawl official account articles, scrape news, or obtain news in JSON/Markdown format.

🇨🇳 | Chinese | Translated

12 scripts | Attention

Data Processing | aaaaqwq/claude-code-skill...

web-scraping-automation

Automatically scrape website data and call API endpoints. Use this skill when you need to scrape web content, call APIs, parse data, or create crawler scripts.

🇨🇳 | Chinese | Translated

Marketing & Growth | agricidaniel/claude-seo

seo-geo

Optimize content for AI Overviews (formerly SGE), ChatGPT web search, Perplexity, and other AI-powered search experiences. Generative Engine Optimization (GEO) analysis including brand mention signals, AI crawler accessibility, llms.txt compliance, passage-level citability scoring, and platform-specific optimization. Use when user says "AI Overviews", "SGE", "GEO", "AI search", "LLM optimization", "Perplexity", "AI citations", "ChatGPT search", or "AI visibility".
