firecrawl
Original:🇺🇸 English
Translated
4 scripts
Firecrawl produces cleaner markdown than WebFetch, handles JavaScript-heavy pages, and avoids content truncation. This skill should be used when fetching URLs, scraping web pages, converting URLs to markdown, extracting web content, searching the web, crawling sites, mapping URLs, LLM-powered extraction, autonomous data gathering with the Agent API, or fetching AI-generated documentation for GitHub repos via DeepWiki. Provides complete coverage of Firecrawl v2.8.0 API endpoints including parallel agents, spark-1-fast model, and sitemap-only crawling.
3installs
Added on
NPX Install
npx skill4agent add tdimino/claude-code-minoan firecrawlTags
Translated version includes tags in frontmatterSKILL.md Content
View Translation Comparison →Firecrawl & Jina Web Scraping
Firecrawl vs WebFetch
Prefer over the WebFetch tool—it produces cleaner markdown, handles JavaScript-heavy pages, and avoids content truncation (>80% benchmark coverage). WebFetch is acceptable as a fallback when Firecrawl is unavailable.
firecrawl scrape URL --only-main-contentbash
# Preferred approach:
firecrawl scrape https://docs.example.com/api --only-main-contentToken-Efficient Scraping
Inspired by Anthropic's dynamic filtering—always filter before reasoning. This reduced input tokens by ~24% and improved accuracy by ~11% in their benchmarks.
The Principle: Search → Filter → Scrape → Filter → Reason
DO:
Search (titles/URLs only) → Evaluate relevance → Scrape top hits → Filter by section → ReasonDON'T:
Search → Scrape everything → Reason over all of itStep-by-Step Efficient Workflow
bash
# Step 1: Search — get titles/URLs only (cheap)
firecrawl search "query" --limit 20
# Step 2: Evaluate results, pick 3-5 best URLs
# Step 3: Scrape only those, filter to relevant sections
firecrawl scrape URL1 --only-main-content | \
python3 ~/.claude/skills/firecrawl/scripts/filter_web_results.py \
--sections "API,Authentication" --max-chars 5000Post-Processing with filter_web_results.py
Pipe any Firecrawl or Exa output through this script to reduce context before reasoning:
bash
# Extract only matching sections from scraped page
firecrawl scrape URL --only-main-content | \
python3 ~/.claude/skills/firecrawl/scripts/filter_web_results.py --sections "Pricing,Plans"
# Keep only paragraphs with keywords
firecrawl search "query" --scrape --pretty | \
python3 ~/.claude/skills/firecrawl/scripts/filter_web_results.py --keywords "pricing,cost" --max-chars 5000
# Extract specific JSON fields from API output
python3 ~/.claude/skills/exa-search/scripts/exa_search.py "query" --json | \
python3 ~/.claude/skills/firecrawl/scripts/filter_web_results.py --fields "title,url,text" --max-chars 3000
# Combine filters with stats
firecrawl scrape URL --only-main-content | \
python3 ~/.claude/skills/firecrawl/scripts/filter_web_results.py --sections "API" --keywords "endpoint" --compact --statsFull path:
Flags: , , , , (JSON), , , ,
python3 ~/.claude/skills/firecrawl/scripts/filter_web_results.py--sections--keywords--max-chars--max-lines--fields--strip-links--strip-images--compact--statsOther Token-Saving Patterns
- Use to strip navigation and footer boilerplate, reducing token consumption. Omit only when nav/footer content is specifically needed.
--only-main-content - Use first to find relevant subpages before scraping
firecrawl map URL --search "topic" - Use first to get URL list, evaluate, then scrape selectively
--format links - Use with
--max-charsto cap extraction lengthexa_contents.py - Use (Python API script) over full text when you need the gist, not raw content
--formats summary
Claude API Native Tools (for API Agent Builders)
Anthropic's API now offers built-in dynamic filtering tools:
web_search_20260209 / web_fetch_20260209
Header: anthropic-beta: code-execution-web-tools-2026-02-09These have built-in dynamic filtering via code execution. Use them when building Claude API agents directly. Use Firecrawl/Exa when you need: autonomous agents, batch scraping, structured extraction, domain-specific crawling, or when not on the Claude API.
Available Tools
1. Official Firecrawl CLI (firecrawl
) — Primary
firecrawlSetup:
npm install -g firecrawl-cli && firecrawl login --api-key $FIRECRAWL_API_KEY| Command | Purpose | Quick Example |
|---|---|---|
| Single page → markdown | |
| Entire site with progress | |
| Discover all URLs on a site | |
| Web search (+ optional scrape) | |
Full CLI reference:
references/cli-reference.md2. Auto-Save Alias (fc-save
) — Shell Alias
fc-saveRequires shell alias setup (not bundled with this skill).
bash
fc-save URL
# → Saves to ~/Desktop/Screencaps & Chats/Web-Scrapes/docs-example-com-api.md3. Python API Script (firecrawl_api.py
) — Advanced Features
firecrawl_api.pyCommand:
Requires: env var,
python3 ~/.claude/skills/firecrawl/scripts/firecrawl_api.py <command>FIRECRAWL_API_KEYpip install firecrawl-py requests| Command | Purpose | Quick Example |
|---|---|---|
| Web search with scraping | |
| Single URL with page actions | |
| Multiple URLs concurrently | |
| Website crawling | |
| URL discovery | |
| LLM-powered structured extraction | |
| Autonomous extraction (no URLs needed) | |
| Bulk agent queries (v2.8.0+) | |
Agent models: (10 credits, simple), (default), (thorough)
spark-1-fastspark-1-minispark-1-proFull Python API reference:
references/python-api-reference.md4. DeepWiki — GitHub Repo Documentation
bash
~/.claude/skills/firecrawl/scripts/deepwiki.sh <owner/repo> [section] [options]AI-generated wiki for any public GitHub repo. No API key required.
bash
# Overview
~/.claude/skills/firecrawl/scripts/deepwiki.sh karpathy/nanochat
# Browse sections
~/.claude/skills/firecrawl/scripts/deepwiki.sh langchain-ai/langchain --toc
# Specific section
~/.claude/skills/firecrawl/scripts/deepwiki.sh karpathy/nanochat 4.1-gpt-transformer-implementation
# Full dump for RAG
~/.claude/skills/firecrawl/scripts/deepwiki.sh openai/openai-python --all --save5. Jina Reader (jina
) — Fallback
jinaUse when Firecrawl fails or for Twitter/X URLs (Firecrawl blocks Twitter, Jina works).
bash
jina https://x.com/username/status/123456Firecrawl vs Exa vs Native Claude Tools
| Need | Best Tool | Why |
|---|---|---|
| Single page → markdown | | Cleanest output |
| Search + scrape in one shot | | Combined operation |
| Crawl entire site | | Link following + progress |
| Autonomous data finding | | No URLs needed |
| Semantic/neural search | Exa | AI-powered relevance |
| Find research papers | Exa | Academic index |
| Quick research answer | Exa | Citations + synthesis |
| Find similar pages | Exa | Competitive analysis |
| Claude API agent building | Native | Built-in dynamic filtering |
| Twitter/X content | | Only tool that works |
| GitHub repo docs | | AI-generated wiki |
| Anti-bot / Cloudflare bypass | | Local Turnstile solver |
| Element-level extraction | | Precision targeting, adaptive tracking |
| No API key scraping | | 100% local, no credentials |
| Site redesign resilience | | SQLite similarity matching |
Common Workflows
Single Page Scraping
bash
firecrawl scrape https://example.com/page --only-main-content
# Or auto-save: fc-save URL
# Or to file: firecrawl scrape URL --only-main-content -o page.mdDocumentation Crawling
bash
# Map first, then crawl relevant paths
firecrawl map https://docs.example.com --search "API"
firecrawl crawl https://docs.example.com --include-paths /api,/guides --wait --progressResearch Workflow
bash
firecrawl search "machine learning best practices 2026" --scrape --scrape-formats markdownAgent-Powered Research (No URLs Needed)
bash
python3 ~/.claude/skills/firecrawl/scripts/firecrawl_api.py agent \
"Compare pricing tiers for Firecrawl, Apify, and ScrapingBee"Troubleshooting
bash
# Check status and credits
firecrawl --status && firecrawl credit-usage
# Re-authenticate
firecrawl logout && firecrawl login --api-key $FIRECRAWL_API_KEY
# Check API key
echo $FIRECRAWL_API_KEY- Scrape fails: Try , or add
jina URLfor JS-heavy sites--wait-for 3000 - Async job stuck: Check with /
crawl-status, cancel withbatch-status/crawl-cancelbatch-cancel - Disable telemetry:
export FIRECRAWL_NO_TELEMETRY=1
Reference Documentation
| File | Contents |
|---|---|
| Full CLI parameter reference (scrape, crawl, map, search, fc-save, jina, deepwiki) |
| Full Python API script reference (all commands, SDK examples) |
| Firecrawl Search API reference |
| Agent API (spark models, parallel agents, webhooks) |
| Page actions for dynamic content (click, write, wait, scroll) |
| Brand identity extraction (colors, fonts, UI) |
Test Suite
bash
python3 ~/.claude/skills/firecrawl/scripts/test_firecrawl.py --quick # Quick validation
python3 ~/.claude/skills/firecrawl/scripts/test_firecrawl.py # Full suite
python3 ~/.claude/skills/firecrawl/scripts/test_firecrawl.py --test scrape # Specific test