# Firecrawl CLI

Web scraping, search, and browser automation CLI. Returns clean markdown optimized for LLM context windows.

Run `firecrawl --help` or `firecrawl <command> --help` for full option details.

## Prerequisites

Must be installed and authenticated. Check with `firecrawl --status`:

```
🔥 firecrawl cli v1.8.0

● Authenticated via FIRECRAWL_API_KEY
Concurrency: 0/100 jobs (parallel scrape limit)
Credits: 500,000 remaining
```

  • Concurrency: Max parallel jobs. Run parallel operations up to this limit.
  • Credits: Remaining API credits. Each scrape/crawl consumes credits.

If not ready, see rules/install.md. For output handling guidelines, see rules/security.md.
```bash
firecrawl search "query" --scrape --limit 3
```
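For scripting, a readiness gate can be sketched as below; grepping for "Authenticated" is an assumption based on the status output shown above, not a stable interface:

```shell
# Sketch: proceed only when firecrawl is installed and authenticated.
# The grep pattern assumes the --status output format shown above.
if command -v firecrawl >/dev/null 2>&1 \
    && firecrawl --status 2>/dev/null | grep -q "Authenticated"; then
  echo "ready"
else
  echo "not ready - see rules/install.md"
fi
```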

## Workflow

Follow this escalation pattern:

  1. Search - No specific URL yet. Find pages, answer questions, discover sources.
  2. Scrape - Have a URL. Extract its content directly.
  3. Map + Scrape - Large site or need a specific subpage. Use `map --search` to find the right URL, then scrape it.
  4. Crawl - Need bulk content from an entire site section (e.g., all /docs/).
  5. Browser - Scrape failed because content is behind interaction (pagination, modals, form submissions, multi-step navigation).
| Need | Command | When |
| --- | --- | --- |
| Find pages on a topic | `search` | No specific URL yet |
| Get a page's content | `scrape` | Have a URL, page is static or JS-rendered |
| Find URLs within a site | `map` | Need to locate a specific subpage |
| Bulk extract a site section | `crawl` | Need many pages (e.g., all /docs/) |
| AI-powered data extraction | `agent` | Need structured data from complex sites |
| Interact with a page | `browser` | Content requires clicks, form fills, pagination, or login |
| Download a site to files | `download` | Save an entire site as local files |
For detailed command reference, use the individual skill for each command (e.g., `firecrawl-search`, `firecrawl-browser`) or run `firecrawl <command> --help`.

Scrape vs browser:
  • Use `scrape` first. It handles static pages and JS-rendered SPAs.
  • Use `browser` when you need to interact with a page, such as clicking buttons, filling out forms, navigating through a complex site, infinite scroll, or when scrape fails to grab all the content you need.
  • Never use browser for web searches - use `search` instead.

Avoid redundant fetches:
  • `search --scrape` already fetches full page content. Don't re-scrape those URLs.
  • Check `.firecrawl/` for existing data before fetching again.
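The cache check in the last bullet can be sketched as a small guard; the file name is just an example following the naming conventions in Output & Organization:

```shell
# Sketch: skip a fetch when a previous run already wrote this file.
out=".firecrawl/search-react-hooks.json"
if [ -s "$out" ]; then
  echo "cached: $out"
else
  echo "fetch needed: $out"
fi
```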

## Output & Organization

Unless the user specifies returning results in context, write them to `.firecrawl/` with `-o`. Add `.firecrawl/` to `.gitignore`. Always quote URLs - the shell interprets `?` and `&` as special characters.

```bash
firecrawl search "react hooks" -o .firecrawl/search-react-hooks.json --json
firecrawl scrape "<url>" -o .firecrawl/page.md
```
Naming conventions:
  • `.firecrawl/search-{query}.json`
  • `.firecrawl/search-{query}-scraped.json`
  • `.firecrawl/{site}-{path}.md`
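A `{site}-{path}` name can be derived mechanically from a URL; this is a sketch of one possible scheme, not something the CLI enforces:

```shell
# Sketch: turn a URL into a .firecrawl/{site}-{path}.md file name.
url="https://example.com/docs/getting-started?ref=nav"
# Strip the scheme, drop the query string, and collapse / and . runs into dashes.
name=$(printf '%s' "$url" | sed -E 's#^https?://##; s#\?.*$##; s#[/.]+#-#g; s#-+$##')
echo ".firecrawl/$name.md"
# → .firecrawl/example-com-docs-getting-started.md
```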
Never read entire output files at once. Use `grep`, `head`, or incremental reads:

```bash
wc -l .firecrawl/file.md && head -50 .firecrawl/file.md
grep -n "keyword" .firecrawl/file.md
```

A single format outputs raw content. Multiple formats (e.g., `--format markdown,links`) output JSON.

## Working with Results

These patterns are useful when working with file-based output (the `-o` flag) for complex tasks:

```bash
# Extract URLs from search
jq -r '.data.web[].url' .firecrawl/search.json

# Get titles and URLs
jq -r '.data.web[] | "\(.title): \(.url)"' .firecrawl/search.json
```
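These filters can be exercised offline; the sample record below is made up but follows the same `.data.web[]` shape:

```shell
# Sketch: run the same jq filter against an inline sample (hypothetical data).
cat > /tmp/sample-search.json <<'EOF'
{"data":{"web":[{"title":"React Hooks","url":"https://react.dev/reference/react"}]}}
EOF
jq -r '.data.web[] | "\(.title): \(.url)"' /tmp/sample-search.json
# → React Hooks: https://react.dev/reference/react
```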

## Parallelization

Run independent operations in parallel. Check `firecrawl --status` for the concurrency limit:

```bash
firecrawl scrape "<url-1>" -o .firecrawl/1.md &
firecrawl scrape "<url-2>" -o .firecrawl/2.md &
firecrawl scrape "<url-3>" -o .firecrawl/3.md &
wait
```

For browser, launch separate sessions for independent tasks and operate them in parallel via `--session <id>`.
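When the URL list is longer than the concurrency limit, a bounded fan-out can be sketched with `xargs -P`; `echo` stands in for the real scrape command here:

```shell
# Sketch: cap parallelism at 2 jobs; swap echo for firecrawl scrape in practice.
printf '%s\n' url-1 url-2 url-3 url-4 \
  | xargs -P 2 -I {} echo "scraping {}"
```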

## Credit Usage

Check remaining credits:

```bash
firecrawl credit-usage
firecrawl credit-usage --json --pretty -o .firecrawl/credits.json
```