fuzzing


Fuzzing


Purpose


Guide agents through setting up and running coverage-guided fuzz testing: libFuzzer (in-process) and AFL++ (fork-based), with sanitizer integration and CI pipeline setup.

Triggers


  • "How do I fuzz-test my parser/deserializer?"
  • "What is a fuzz target / how do I write one?"
  • "How do I set up libFuzzer?"
  • "How do I use AFL++ on my program?"
  • "How do I run fuzzing in CI?"
  • "Fuzzer found a crash — how do I reproduce it?"

Workflow


1. Write a fuzz target (libFuzzer)


A fuzz target is a function that accepts arbitrary bytes and exercises the code under test.
```c
// fuzz_parser.c
#include <stdint.h>
#include <stddef.h>
#include "myparser.h"

// Entry point called by libFuzzer with generated (mutated) data
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Must not abort/exit on invalid input (that's expected)
    // Must not read outside [data, data+size)

    MyParser *p = parser_create();
    if (p) {
        parser_feed(p, (const char *)data, size);
        parser_destroy(p);
    }
    // Return 0 for a normal run. -1 asks libFuzzer not to add the input
    // to the corpus; other values are reserved.
    return 0;
}
```
Key rules:
  • Never call `abort()` or `exit()` on invalid input, and avoid global state that persists across calls
  • Handle all inputs gracefully (crash = bug found)
  • Keep the target fast: the fuzzer calls it millions of times
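
Because a crash is how the fuzzer reports a bug, a target can also assert a property of the code under test rather than only feeding it bytes. The sketch below is illustrative only: `hex_value`, `hex_encode`, and `hex_decode` are hypothetical stand-ins written for this example and are not part of `myparser.h`. The property checked is that decoding an encoded buffer gives back the original bytes. The rule against `abort()` is read here as applying to rejected input; an assertion failing on a broken invariant is the intended signal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one lowercase hex digit, or -1. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical code under test: encode n bytes as 2*n hex characters. */
static void hex_encode(const uint8_t *in, size_t n, char *out) {
    static const char digits[] = "0123456789abcdef";
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 0x0f];
    }
}

/* Hypothetical code under test: decode 2*n hex characters into n bytes.
   Returns 0 on success, -1 on a non-hex character. */
static int hex_decode(const char *in, size_t n, uint8_t *out) {
    for (size_t i = 0; i < n; i++) {
        int hi = hex_value(in[2 * i]);
        int lo = hex_value(in[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size == 0) return 0;              /* nothing to round-trip */

    /* Allocate per call so no state leaks between inputs. */
    char *hex = malloc(2 * size);
    uint8_t *back = malloc(size);
    if (hex && back) {
        hex_encode(data, size, hex);
        int rc = hex_decode(hex, size, back);
        assert(rc == 0);                         /* our own output must decode */
        assert(memcmp(back, data, size) == 0);   /* and match the input */
    }
    free(hex);
    free(back);
    return 0;
}
```

Built with `-fsanitize=fuzzer,address`, a violated assertion would be reported like any other crash and the offending input saved for replay.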

2. Build with libFuzzer


```bash
# Clang (libFuzzer is built into Clang)
clang -fsanitize=fuzzer,address -g -O1 \
    fuzz_parser.c myparser.c -o fuzz_parser

# With UBSan too
clang -fsanitize=fuzzer,address,undefined -g -O1 \
    fuzz_parser.c myparser.c -o fuzz_parser
```

`-fsanitize=fuzzer` links libFuzzer and provides `main()`. Do not provide your own `main()` in the fuzz target.

3. Run libFuzzer


```bash
# Create corpus directory
mkdir -p corpus

# Seed with known-good inputs (greatly accelerates coverage)
cp tests/inputs/* corpus/

# Run the fuzzer
./fuzz_parser corpus/ -max_len=65536 -timeout=10

# Run for a time limit
./fuzz_parser corpus/ -max_total_time=3600

# Run with specific number of jobs (parallel)
./fuzz_parser corpus/ -jobs=4 -workers=4

# Minimise a corpus (remove redundant inputs)
# The output directory must already exist
mkdir -p corpus_min
./fuzz_parser -merge=1 corpus_min/ corpus/
```

Common flags:

| Flag | Default | Effect |
|------|---------|--------|
| `-max_len=N` | 0 (auto) | Max input size in bytes; with 0, libFuzzer guesses a limit from the corpus |
| `-timeout=N` | 1200 | Kill if single run takes > N seconds |
| `-max_total_time=N` | 0 (forever) | Total fuzzing time |
| `-runs=N` | -1 (infinite) | Total number of runs |
| `-dict=file` | none | Dictionary of interesting tokens |
| `-jobs=N` | 0 (single process) | Parallel jobs (each writes its own log) |
| `-merge=1` | off | Merge mode: minimise corpus |

4. Reproduce a crash


libFuzzer writes crash inputs to files named `crash-<hash>`, `oom-<hash>`, and `timeout-<hash>`.
```bash
# Reproduce
./fuzz_parser crash-abc123

# Debug with GDB
gdb ./fuzz_parser
(gdb) run crash-abc123
```
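
When a crash has to be replayed in a build that does not link libFuzzer (for example under a different compiler or debugger setup), a small driver can read the saved file and call the target once. The following is a minimal sketch under that assumption: `replay_file` is a hypothetical helper name, and the `LLVMFuzzerTestOneInput` defined here is only a stand-in so the block compiles on its own; in a real build that definition would come from `fuzz_parser.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in target so the sketch compiles alone; replace with the real one. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)data;
    (void)size;
    return 0;
}

/* Read one saved input and pass it to the target exactly once.
   Returns 0 on success, -1 if the file cannot be read. */
int replay_file(const char *path) {
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    /* Find the file size, then go back to the start. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return -1; }
    rewind(f);

    /* Allocate at least one byte so an empty file still gets a buffer. */
    uint8_t *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return -1; }

    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);

    int rc = (got == (size_t)len) ? 0 : -1;
    if (rc == 0) LLVMFuzzerTestOneInput(buf, got);

    free(buf);
    return rc;
}
```

A `main()` that calls `replay_file(argv[1])` would turn this into a standalone reproducer; it is left out here because a libFuzzer build already supplies its own `main()`.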

5. AFL++ setup


AFL++ is a fork-based fuzzer that works on arbitrary programs (not just those with a fuzz entry point).
```bash
# Install
apt install afl++   # or build from source

# Instrument the target
CC=afl-clang-fast CXX=afl-clang-fast++ \
    cmake -S . -B build-afl -DCMAKE_BUILD_TYPE=Debug
cmake --build build-afl

# Or compile directly
afl-clang-fast -g -O1 -o prog_afl main.c myparser.c

# Create input corpus
mkdir -p afl-input afl-output
echo "hello" > afl-input/seed1

# Run
afl-fuzz -i afl-input -o afl-output -- ./prog_afl @@
# @@ is replaced with the input file path

# For stdin-based programs: remove @@
afl-fuzz -i afl-input -o afl-output -- ./prog_afl
```

6. AFL++ with persistent mode (faster)


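Persistent mode avoids a `fork()` per input, which is much faster for library fuzzing:
```c
// In your harness (build with afl-clang-fast, which provides __AFL_LOOP):
#include <unistd.h>
#include "myparser.h"

#define MAX_SIZE 65536  // assumed input cap; adjust to your format

int main(int argc, char **argv) {
    static unsigned char buf[MAX_SIZE];

    // Each iteration handles one input; the process restarts after 1000
    while (__AFL_LOOP(1000)) {
        // Read one input from stdin
        // (AFL++ can also pass inputs via shared memory: __AFL_FUZZ_TESTCASE_BUF)
        ssize_t len = read(0, buf, sizeof(buf));
        if (len < 0) continue;

        // Create and destroy the parser per input so no state carries over
        MyParser *p = parser_create();
        if (p) {
            parser_feed(p, (const char *)buf, (size_t)len);
            parser_destroy(p);
        }
    }
    return 0;
}
```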

7. Corpus management


```bash
# AFL++ corpus minimisation
afl-cmin -i afl-output/default/queue -o corpus_min -- ./prog_afl @@

# Merge libFuzzer corpora from multiple runs
./fuzz_parser -merge=1 merged_corpus/ run1_corpus/ run2_corpus/

# Show coverage (libFuzzer)
./fuzz_parser corpus/ -runs=0 -print_coverage=1
```
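
The idea behind corpus minimisation can be pictured as keeping only inputs that contribute coverage nothing else has contributed yet. The toy sketch below illustrates that idea and nothing more: it is not the algorithm used by `afl-cmin` or by libFuzzer's merge mode, `minimise_corpus` is a hypothetical name, and each input's coverage is assumed to fit in a 64-bit bitmap where one bit stands for one edge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Greedy pass over the corpus in order. keep[i] is set to 1 when input i
   covers at least one edge that no earlier kept input covered, else 0.
   Returns the number of inputs kept. */
size_t minimise_corpus(const uint64_t *coverage, size_t n, int *keep) {
    assert(n == 0 || (coverage != NULL && keep != NULL));

    uint64_t seen = 0;   /* union of edges covered by kept inputs so far */
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t fresh = coverage[i] & ~seen;   /* edges only this input adds */
        keep[i] = (fresh != 0);
        if (keep[i]) {
            seen |= coverage[i];
            kept++;
        }
    }
    return kept;
}
```

The result depends on input order in this sketch; sorting candidates (for instance smallest file first) before the pass is one way to bias it toward compact inputs.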

8. CI integration


```yaml
# GitHub Actions example
- name: Build fuzz targets
  run: |
    clang -fsanitize=fuzzer,address,undefined -g -O1 \
      fuzz_parser.c myparser.c -o fuzz_parser

- name: Short fuzz run (regression check)
  run: |
    ./fuzz_parser corpus/ -max_total_time=60 -error_exitcode=1
    # Also run known crash inputs if any:
    ls known_crashes/ 2>/dev/null | xargs -I{} ./fuzz_parser known_crashes/{}
```

For long-duration fuzzing, use OSS-Fuzz or ClusterFuzz infrastructure.

9. Dictionary files


Dictionaries contain interesting tokens to guide mutation:
```
# parser.dict
kw1="<"
kw2=">"
kw3="</"
kw4="=\""
kw5="\x00"
kw6="\xff\xfe"
```

```bash
./fuzz_parser corpus/ -dict=parser.dict
```
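
Dictionary entries with quotes or non-printable bytes are easy to get wrong by hand, so it can help to generate them. The helper below is a sketch under stated assumptions: `format_dict_entry` is a hypothetical function, not part of libFuzzer or AFL++, and it assumes the usual `name="value"` line format in which `"` and `\` are backslash-escaped and other non-printable bytes are written as `\xNN`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Write one dictionary line, name="value", into out (capacity cap).
   Returns the number of characters written (excluding the terminator),
   or -1 if out is too small. */
int format_dict_entry(char *out, size_t cap, const char *name,
                      const unsigned char *token, size_t len) {
    assert(out != NULL && name != NULL && (token != NULL || len == 0));

    int n = snprintf(out, cap, "%s=\"", name);
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t pos = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = token[i];
        if (c == '"' || c == '\\')
            n = snprintf(out + pos, cap - pos, "\\%c", c);      /* escape */
        else if (isprint(c))
            n = snprintf(out + pos, cap - pos, "%c", c);        /* copy as-is */
        else
            n = snprintf(out + pos, cap - pos, "\\x%02x", (unsigned)c);
        if (n < 0 || (size_t)n >= cap - pos) return -1;
        pos += (size_t)n;
    }

    n = snprintf(out + pos, cap - pos, "\"");                   /* closing quote */
    if (n < 0 || (size_t)n >= cap - pos) return -1;
    return (int)(pos + (size_t)n);
}
```

For example, calling it with name `kw4` and the two-byte token `="` would produce the same line as the hand-written `kw4` entry in `parser.dict`.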

References


For fuzz target templates, corpus seed examples, and OSS-Fuzz integration guidance, see references/targets.md.

Related skills


  • Use `skills/runtimes/sanitizers` to add ASan/UBSan to fuzz builds
  • Use `skills/compilers/clang` for Clang-specific libFuzzer flags
  • Use `skills/debuggers/gdb` to debug crash inputs found by the fuzzer