libFuzzer

libFuzzer is an in-process, coverage-guided fuzzer that is part of the LLVM project. It's the recommended starting point for fuzzing C/C++ projects due to its simplicity and integration with the LLVM toolchain. While libFuzzer has been in maintenance-only mode since late 2022, it is easier to install and use than its alternatives, has wide support, and will be maintained for the foreseeable future.

When to Use

Fuzzer	Best For	Complexity
libFuzzer	Quick setup, single-project fuzzing	Low
AFL++	Multi-core fuzzing, diverse mutations	Medium
LibAFL	Custom fuzzers, research projects	High
Honggfuzz	Hardware-based coverage	Medium

Choose libFuzzer when:

You need a simple, quick setup for C/C++ code
Project uses Clang for compilation
Single-core fuzzing is sufficient initially
Transitioning to AFL++ later is an option (harnesses are compatible)

Note: Fuzzing harnesses written for libFuzzer are compatible with AFL++, making it easy to transition if you need more advanced features like better multi-core support.

Quick Start

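#include <stdint.h>
#include <stddef.h>

// Declaration of the code under test; replace with your own target.
void my_target_function(const uint8_t *data, size_t size);

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Skip inputs the target cannot handle (here: empty input)
    if (size < 1) return 0;

    // Call your target function with fuzzer-provided data
    my_target_function(data, size);

    return 0;
}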

Compile and run:

bash

clang++ -fsanitize=fuzzer,address -g -O2 harness.cc target.cc -o fuzz
mkdir corpus/
./fuzz corpus/

Installation

Prerequisites

LLVM/Clang compiler (includes libFuzzer)
LLVM tools for coverage analysis (optional)

Linux (Ubuntu/Debian)

bash

apt install clang llvm

For the latest LLVM version:

bash

# Add LLVM repository from apt.llvm.org
# Then install specific version, e.g.:
apt install clang-18 llvm-18

macOS

bash

# Using Homebrew
brew install llvm

# Or using Nix
nix-env -i clang

Windows

Install Clang through Visual Studio. Refer to Microsoft's documentation for setup instructions.

Recommendation: If possible, fuzz on a local x86_64 VM or rent one on DigitalOcean, AWS, or Hetzner. Linux provides the best support for libFuzzer.

Verification

bash

clang++ --version
# Should show LLVM version information

Writing a Harness

Harness Structure

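The harness is the entry point for the fuzzer. libFuzzer calls the LLVMFuzzerTestOneInput function repeatedly with different inputs.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Placeholders for the code under test; replace with your own declarations.
#define MIN_REQUIRED_SIZE 1
void my_function(uint32_t a, uint32_t b);
void target_function(const uint8_t *data, size_t size);

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // 1. Optional: Validate input size
    if (size < MIN_REQUIRED_SIZE) {
        return 0;  // Skip inputs that are too small
    }

    // 2. Optional: Convert raw bytes to structured data
    // Example: Parse two integers from the byte array. memcpy avoids the
    // unaligned reads that casting data to uint32_t* could cause.
    if (size >= 2 * sizeof(uint32_t)) {
        uint32_t a, b;
        memcpy(&a, data, sizeof(a));
        memcpy(&b, data + sizeof(a), sizeof(b));
        my_function(a, b);
    }

    // 3. Call target function
    target_function(data, size);

    // 4. Return 0 for normal completion. Recent libFuzzer versions also accept
    //    -1 to keep the input out of the corpus; other values are reserved.
    return 0;
}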

Harness Rules

Do	Don't
Handle all input types (empty, huge, malformed)	Call `exit()` - stops fuzzing process
Join all threads before returning	Leave threads running
Keep harness fast and simple	Add excessive logging or complexity
Maintain determinism	Use random number generators or read `/dev/random`
Reset global state between runs	Rely on state from previous executions
Use narrow, focused targets	Mix unrelated data formats (PNG + TCP) in one harness

Rationale:

Speed matters: Aim for 100s-1000s executions per second per core
Reproducibility: Crashes must be reproducible after fuzzing completes
Isolation: Each execution should be independent
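
The isolation and determinism rules can be sketched as follows. This is a minimal illustration, not a recommended layout: `g_history` and `RecordAndCount` are hypothetical stand-ins for a target that keeps global state. `LLVMFuzzerInitialize` is the optional libFuzzer hook that runs once before the first input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical target with global state: it remembers bytes across calls.
static std::vector<uint8_t> g_history;

static size_t RecordAndCount(const uint8_t *data, size_t size) {
    g_history.insert(g_history.end(), data, data + size);
    return g_history.size();
}

// Optional libFuzzer hook, called once before the first input.
// Expensive, input-independent setup belongs here, not in the harness body.
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    (void)argc;
    (void)argv;
    g_history.reserve(4096);
    return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Reset global state so this run cannot depend on earlier inputs.
    g_history.clear();

    size_t seen = RecordAndCount(data, size);

    // Invariant: with a clean state, the target saw exactly this input.
    // A violated assert aborts, which libFuzzer reports as a crash.
    assert(seen == size);
    return 0;
}
```

Without the `clear()` call, the result of one execution would depend on every input that came before it, and a crash file replayed on its own might no longer reproduce.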

Using FuzzedDataProvider for Complex Inputs

For complex inputs (strings, multiple parameters), use the FuzzedDataProvider helper:

#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <vector>
#include "FuzzedDataProvider.h"  // From LLVM project

// Target under test, declared here so the harness compiles on its own.
// Adjust the declaration to match your real function.
char* concat(const char* a, size_t a_len, const char* b, size_t b_len, size_t allocation_size);

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    // Extract structured data
    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    // Call target with extracted data
    char* result = concat(str1.data(), str1.size(), str2.data(), str2.size(), allocation_size);
    if (result != NULL) {
        free(result);
    }

    return 0;
}

Download FuzzedDataProvider.h from the LLVM repository. Recent Clang releases also ship it as <fuzzer/FuzzedDataProvider.h>.

Interleaved Fuzzing

Use a single harness to test multiple related functions:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

// add, subtract, multiply, and divide are the functions under test,
// declared in your project's headers.

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1 + 2 * sizeof(int32_t)) {
        return 0;
    }

    uint8_t mode = data[0];
    int32_t numbers[2];
    memcpy(numbers, data + 1, 2 * sizeof(int32_t));

    // Select function based on first byte
    switch (mode % 4) {
        case 0: add(numbers[0], numbers[1]); break;
        case 1: subtract(numbers[0], numbers[1]); break;
        case 2: multiply(numbers[0], numbers[1]); break;
        case 3: divide(numbers[0], numbers[1]); break;
    }

    return 0;
}

See Also: For detailed harness writing techniques, patterns for handling complex inputs, structure-aware fuzzing, and protobuf-based fuzzing, see the fuzz-harness-writing technique skill.

Compilation

Basic Compilation

The key flag is -fsanitize=fuzzer, which:

Links the libFuzzer runtime (provides the `main` function)
Enables SanitizerCoverage instrumentation for coverage tracking
Disables built-in functions like `memcmp`

bash

clang++ -fsanitize=fuzzer -g -O2 harness.cc target.cc -o fuzz

Flags explained:

`-fsanitize=fuzzer`: Enable libFuzzer
`-g`: Add debug symbols (helpful for crash analysis)
`-O2`: Production-level optimizations (recommended for fuzzing)
`-DNO_MAIN`: Define the NO_MAIN macro, which your code can check to compile out its own `main` function (libFuzzer supplies `main`)

With Sanitizers

AddressSanitizer (recommended):

bash

clang++ -fsanitize=fuzzer,address -g -O2 -U_FORTIFY_SOURCE harness.cc target.cc -o fuzz

Multiple sanitizers:

bash

clang++ -fsanitize=fuzzer,address,undefined -g -O2 harness.cc target.cc -o fuzz

See Also: For detailed sanitizer configuration, common issues, ASAN_OPTIONS flags, and advanced sanitizer usage, see the address-sanitizer and undefined-behavior-sanitizer technique skills.

Build Flags

Flag	Purpose
`-fsanitize=fuzzer`	Enable libFuzzer runtime and instrumentation
`-fsanitize=address`	Enable AddressSanitizer (memory error detection)
`-fsanitize=undefined`	Enable UndefinedBehaviorSanitizer
`-fsanitize=fuzzer-no-link`	Instrument without linking fuzzer (for libraries)
`-g`	Include debug symbols
`-O2`	Production optimization level
`-U_FORTIFY_SOURCE`	Disable fortification (can interfere with ASan)

Building Static Libraries

For projects that produce static libraries:

Build the library with fuzzing instrumentation:

bash

export CC=clang CFLAGS="-fsanitize=fuzzer-no-link -fsanitize=address"
export CXX=clang++ CXXFLAGS="$CFLAGS"
./configure --enable-shared=no
make

Link the static library with your harness:

bash

clang++ -fsanitize=fuzzer -fsanitize=address harness.cc libmylib.a -o fuzz

CMake Integration

cmake

# cmake_minimum_required must come before project();
# 3.13 is the first release with target_link_options.
cmake_minimum_required(VERSION 3.13)
project(FuzzTarget CXX)

add_executable(fuzz main.cc harness.cc)
target_compile_definitions(fuzz PRIVATE NO_MAIN=1)
target_compile_options(fuzz PRIVATE -g -O2 -fsanitize=fuzzer -fsanitize=address)
target_link_options(fuzz PRIVATE -fsanitize=fuzzer -fsanitize=address)

Build with:

bash

cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ .
cmake --build .

Corpus Management

Creating Initial Corpus

Create a directory for the corpus (can start empty):

bash

mkdir corpus/

Optional but recommended: Provide seed inputs (valid example files):

bash

# For a PNG parser:
cp examples/*.png corpus/

# For a protocol parser:
cp test_packets/*.bin corpus/

**Benefits of seed inputs:**
- Fuzzer doesn't start from scratch
- Reaches valid code paths faster
- Significantly improves effectiveness

Corpus Structure

The corpus directory contains:

Input files that trigger unique code paths
Minimized versions (libFuzzer automatically minimizes)

Files named by content hash (e.g., `a9993e364706816aba3e25717850c26c9cd0d89d`)

Corpus Minimization

libFuzzer automatically minimizes corpus entries during fuzzing. To explicitly minimize:

bash

mkdir minimized_corpus/
./fuzz -merge=1 minimized_corpus/ corpus/

This creates a deduplicated, minimized corpus in minimized_corpus/.

See Also: For corpus creation strategies, seed selection, format-specific corpus building, and corpus maintenance, see the fuzzing-corpus technique skill.
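
The idea behind merging can be sketched as a greedy selection over coverage features. This is only a conceptual model under stated assumptions: `CorpusEntry` and `SelectMinimalCorpus` are hypothetical names, the feature sets are given rather than measured, and libFuzzer's actual merge implementation differs in its details.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One corpus entry: its size in bytes and the coverage features it hits.
// In a real run the fuzzer measures the features; here they are given.
struct CorpusEntry {
    std::string name;
    size_t size;
    std::set<int> features;
};

// Greedy selection: visit entries from smallest to largest and keep an
// entry only if it contributes at least one feature not seen so far.
std::vector<std::string> SelectMinimalCorpus(std::vector<CorpusEntry> entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const CorpusEntry &a, const CorpusEntry &b) {
                         return a.size < b.size;
                     });
    std::set<int> covered;
    std::vector<std::string> kept;
    for (const CorpusEntry &e : entries) {
        bool adds_new = false;
        for (int f : e.features) {
            if (covered.insert(f).second) adds_new = true;
        }
        if (adds_new) kept.push_back(e.name);
    }
    return kept;
}
```

Visiting small inputs first means that a large file is dropped whenever smaller files already cover everything it reaches, which is why a merged corpus tends to be both smaller and faster to execute.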

Running Campaigns

Basic Run

bash

./fuzz corpus/

This runs until a crash is found or you stop it (Ctrl+C).

Recommended: Continue After Crashes

bash

./fuzz -fork=1 -ignore_crashes=1 corpus/

The -fork and -ignore_crashes flags (experimental but widely used) allow fuzzing to continue after finding crashes.

Common Options

Control input size:

bash

./fuzz -max_len=4000 corpus/

Rule of thumb: 2x the size of minimal realistic input.

Set timeout:

bash

./fuzz -timeout=2 corpus/

Abort test cases that run longer than 2 seconds.

Use a dictionary:

bash

./fuzz -dict=./format.dict corpus/

Close stdout/stderr (speed up fuzzing):

bash

./fuzz -close_fd_mask=3 corpus/

See all options:

bash

./fuzz -help=1

Multi-Core Fuzzing

Option 1: Jobs and workers (recommended):

bash

./fuzz -jobs=4 -workers=4 -fork=1 -ignore_crashes=1 corpus/

`-jobs=4`: Run 4 sequential campaigns
`-workers=4`: Process jobs in parallel with 4 processes
Test cases are shared between jobs

Option 2: Fork mode:

bash

./fuzz -fork=4 -ignore_crashes=1 corpus/

Note: For serious multi-core fuzzing, consider switching to AFL++, Honggfuzz, or LibAFL.

Re-executing Test Cases

Re-run a single crash:

bash

./fuzz ./crash-a9993e364706816aba3e25717850c26c9cd0d89d

Test all inputs in a directory without fuzzing:

bash

./fuzz -runs=0 corpus/
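
Because a harness is an ordinary function, saved inputs can also be replayed from a regular test binary that is not linked against libFuzzer. The following is a minimal sketch of that idea: `ReplayFile` is a hypothetical helper name, and the harness body here is an empty stand-in for the one you already wrote.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in harness so the sketch is self-contained; in a real project this
// is the LLVMFuzzerTestOneInput you already wrote for libFuzzer.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)data;
    (void)size;
    return 0;
}

// Reads one file and feeds its bytes to the harness once.
// Returns false if the file could not be opened.
bool ReplayFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
    return true;
}
```

Calling such a helper on every crash file from a unit test is one way to turn past findings into regression tests.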

Interpreting Output

When fuzzing runs, you'll see statistics like:

INFO: Seed: 3517090860
INFO: Loaded 1 modules (9 inline 8-bit counters)
#2      INITED cov: 3 ft: 4 corp: 1/1b exec/s: 0 rss: 26Mb
#57     NEW    cov: 4 ft: 5 corp: 2/4b lim: 4 exec/s: 0 rss: 26Mb

Output	Meaning
`INITED`	Fuzzing initialized
`NEW`	New coverage found, added to corpus
`REDUCE`	Input minimized while keeping coverage
`cov: N`	Number of coverage edges hit
`corp: X/Yb`	Corpus size: X entries, Y total bytes
`exec/s: N`	Executions per second
`rss: NMb`	Resident memory usage
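
For monitoring a long campaign, the status fields in the table above can be extracted programmatically. The sketch below assumes the line layout shown in the sample output. `FuzzStatus` and `ParseStatusLine` are hypothetical names, and the exact set of fields may vary between libFuzzer versions.

```cpp
#include <sstream>
#include <string>

struct FuzzStatus {
    std::string event;        // e.g. INITED, NEW, REDUCE
    long cov = -1;            // value after "cov:"
    long exec_per_sec = -1;   // value after "exec/s:"
    long rss_mb = -1;         // value after "rss:", in megabytes
};

// Parses lines such as:
//   #57     NEW    cov: 4 ft: 5 corp: 2/4b lim: 4 exec/s: 0 rss: 26Mb
// Returns false for lines that do not start with '#'.
bool ParseStatusLine(const std::string &line, FuzzStatus *out) {
    if (line.empty() || line[0] != '#') return false;
    std::istringstream in(line);
    std::string token;
    in >> token;  // execution counter, e.g. "#57"
    if (!(in >> out->event)) return false;
    while (in >> token) {
        if (token == "cov:") {
            in >> out->cov;
        } else if (token == "exec/s:") {
            in >> out->exec_per_sec;
        } else if (token == "rss:") {
            in >> out->rss_mb;  // numeric read stops before "Mb"
        }
    }
    return true;
}
```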

On crash:

==11672== ERROR: libFuzzer: deadly signal
artifact_prefix='./'; Test unit written to ./crash-a9993e364706816aba3e25717850c26c9cd0d89d
0x61,0x62,0x63,
abc
Base64: YWJj

The crash is saved to ./crash-<hash> with the input shown in hex, UTF-8, and Base64.

Reproducibility: Use -seed=<value> to reproduce a fuzzing campaign (single-core only).

Fuzzing Dictionary

Dictionaries help the fuzzer discover interesting inputs faster by providing hints about the input format.

Dictionary Format

Create a text file with quoted strings (one per line):

conf

# Lines starting with '#' are comments

# Magic bytes
magic="\x89PNG"
magic2="IEND"

# Keywords
"GET"
"POST"
"Content-Type"

# Hex sequences
delimiter="\xFF\xD8\xFF"
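
When dictionary entries are generated from binary tokens, each token has to be escaped into this quoted form. The following sketch shows one way to do that. `DictEntry` is a hypothetical helper, and it assumes the escapes `\\`, `\"`, and `\xNN` are the only ones needed.

```cpp
#include <cstdio>
#include <string>

// Formats one dictionary line. Printable ASCII is kept as-is, except that
// '"' and '\' are backslash-escaped; every other byte becomes \xNN.
// An empty name produces an unnamed entry such as "GET".
std::string DictEntry(const std::string &name, const std::string &bytes) {
    std::string line = name.empty() ? "" : name + "=";
    line += '"';
    for (unsigned char c : bytes) {
        if (c == '"' || c == '\\') {
            line += '\\';
            line += static_cast<char>(c);
        } else if (c >= 0x20 && c < 0x7F) {
            line += static_cast<char>(c);
        } else {
            char buf[5];  // "\xNN" plus terminating NUL
            std::snprintf(buf, sizeof(buf), "\\x%02X", static_cast<unsigned>(c));
            line += buf;
        }
    }
    line += '"';
    return line;
}
```

For example, `DictEntry("magic", "\x89PNG")` yields the line `magic="\x89PNG"` from the sample dictionary.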

Using a Dictionary

bash

./fuzz -dict=./format.dict corpus/

Generating a Dictionary

From header files:

bash

grep -o '".*"' header.h > header.dict

From man pages:

bash

man curl | grep -oP '^\s*(--|-)\K\S+' | sed 's/[,.]$//' | sed 's/^/"&/; s/$/&"/' | sort -u > man.dict

From binary strings:

bash

strings ./binary | sed 's/^/"&/; s/$/&"/' > strings.dict

Using LLMs: Ask ChatGPT or similar to generate a dictionary for your format (e.g., "Generate a libFuzzer dictionary for a JSON parser").

See Also: For advanced dictionary generation, format-specific dictionaries, and dictionary optimization strategies, see the fuzzing-dictionaries technique skill.

Coverage Analysis

While libFuzzer shows basic coverage stats (cov: N), detailed coverage analysis requires additional tools.

Source-Based Coverage

1. Recompile with coverage instrumentation:

bash

clang++ -fsanitize=fuzzer -fprofile-instr-generate -fcoverage-mapping harness.cc target.cc -o fuzz

2. Run fuzzer to collect coverage:

bash

LLVM_PROFILE_FILE="coverage-%p.profraw" ./fuzz -runs=10000 corpus/

3. Merge coverage data:

bash

llvm-profdata merge -sparse coverage-*.profraw -o coverage.profdata

4. Generate coverage report:

bash

llvm-cov show ./fuzz -instr-profile=coverage.profdata

5. Generate HTML report:

bash

llvm-cov show ./fuzz -instr-profile=coverage.profdata -format=html > coverage.html

Improving Coverage

Tips:

Provide better seed inputs in corpus
Use dictionaries for format-aware fuzzing
Check if harness properly exercises target
Consider structure-aware fuzzing for complex formats
Run longer campaigns (days/weeks)

See Also: For detailed coverage analysis techniques, identifying coverage gaps, systematic coverage improvement, and comparing coverage across fuzzers, see the coverage-analysis technique skill.

Sanitizer Integration

AddressSanitizer (ASan)

ASan detects memory errors like buffer overflows and use-after-free bugs. Highly recommended for fuzzing.

Enable ASan:

bash

clang++ -fsanitize=fuzzer,address -g -O2 -U_FORTIFY_SOURCE harness.cc target.cc -o fuzz

Example ASan output:

==1276163==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000c4ab1
WRITE of size 1 at 0x6020000c4ab1 thread T0
    #0 0x55555568631a in check_buf(char*, unsigned long) main.cc:13:25
    #1 0x5555556860bf in LLVMFuzzerTestOneInput harness.cc:7:3

Configure ASan with environment variables:

bash

ASAN_OPTIONS=verbosity=1:abort_on_error=1 ./fuzz corpus/

Important flags:

`verbosity=1`: Show ASan is active
`detect_leaks=0`: Disable leak detection (by default, leaks are reported when the process exits)
`abort_on_error=1`: Call `abort()` instead of `_exit()` on errors
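
The same options can be baked into the fuzzer binary so they apply even when the environment variable is not set. The sketch below uses ASan's `__asan_default_options` hook. The option string is only an example built from the flags listed above.

```cpp
// ASan calls this function, if the program defines it, to obtain default
// options. Values in the ASAN_OPTIONS environment variable still override
// these defaults.
extern "C" const char *__asan_default_options() {
    return "abort_on_error=1:detect_leaks=0";
}
```

Place the function in the harness source file. When the binary is built without ASan it is simply an unused function.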

Drawbacks:

2-4x slowdown
Requires ~20TB virtual memory (disable memory limits: `-rss_limit_mb=0`)
Best supported on Linux

See Also: For comprehensive ASan configuration, common pitfalls, symbolization, and combining with other sanitizers, see the address-sanitizer technique skill.

UndefinedBehaviorSanitizer (UBSan)

UBSan detects undefined behavior like integer overflow, null pointer dereference, etc.

Enable UBSan:

bash

clang++ -fsanitize=fuzzer,undefined -g -O2 harness.cc target.cc -o fuzz

Combine with ASan:

bash

clang++ -fsanitize=fuzzer,address,undefined -g -O2 harness.cc target.cc -o fuzz
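
To make the kind of bug UBSan reports concrete, the sketch below contrasts an unchecked signed addition (shown only in a comment) with a checked variant. `checked_add` is a hypothetical helper written for this illustration, not part of any library.

```cpp
#include <cstdint>
#include <limits>

// Unchecked version: a + b overflows for large inputs, which is undefined
// behavior for signed integers and is what UBSan would report.
//   int32_t unchecked_add(int32_t a, int32_t b) { return a + b; }

// Checked version: detects the overflow before it happens.
// Returns true and stores the sum on success.
bool checked_add(int32_t a, int32_t b, int32_t *out) {
    if ((b > 0 && a > std::numeric_limits<int32_t>::max() - b) ||
        (b < 0 && a < std::numeric_limits<int32_t>::min() - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

A harness that feeds two fuzzer-chosen integers to the unchecked version would typically trigger a UBSan report quickly, because extreme values are among the first things a mutator tries.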

MemorySanitizer (MSan)

MSan detects uninitialized memory reads. More complex to use (requires rebuilding all dependencies).

bash

clang++ -fsanitize=fuzzer,memory -g -O2 harness.cc target.cc -o fuzz

Common Sanitizer Issues

Issue	Cause	Solution
ASan slows fuzzing too much	Instrumentation overhead	Fuzz a build without ASan for throughput, then replay its corpus on the ASan build with `-runs=0`
Out of memory	ASan needs a large amount of virtual memory	Pass `-rss_limit_mb=0` to the fuzzer binary (a libFuzzer flag, not an `ASAN_OPTIONS` setting)
Stack exhaustion	Insufficient stack space	Raise the stack limit before running, e.g. `ulimit -s 65536`
False positives with `_FORTIFY_SOURCE`	Fortification conflicts with the sanitizer	Use `-U_FORTIFY_SOURCE` flag
MSan reports in dependencies	Dependencies were not built with MSan	Rebuild all dependencies with `-fsanitize=memory`

Real-World Examples

Example 1: Fuzzing libpng

libpng is a widely-used library for reading/writing PNG images. Bugs can lead to security issues.

1. Get source code:

bash

curl -L -O https://downloads.sourceforge.net/project/libpng/libpng16/1.6.37/libpng-1.6.37.tar.xz
tar xf libpng-1.6.37.tar.xz
cd libpng-1.6.37/

2. Install dependencies:

bash

apt install zlib1g-dev

3. Compile with fuzzing instrumentation:

bash

export CC=clang CFLAGS="-fsanitize=fuzzer-no-link -fsanitize=address"
export CXX=clang++ CXXFLAGS="$CFLAGS"
./configure --enable-shared=no
make

4. Get a harness (or write your own):

bash

curl -O https://raw.githubusercontent.com/glennrp/libpng/f8e5fa92b0e37ab597616f554bee254157998227/contrib/oss-fuzz/libpng_read_fuzzer.cc

5. Prepare corpus and dictionary:

bash

mkdir corpus/
curl -o corpus/input.png https://raw.githubusercontent.com/glennrp/libpng/acfd50ae0ba3198ad734e5d4dec2b05341e50924/contrib/pngsuite/iftp1n3p08.png
curl -O https://raw.githubusercontent.com/glennrp/libpng/2fff013a6935967960a5ae626fc21432807933dd/contrib/oss-fuzz/png.dict

6. Link and compile fuzzer:

bash

clang++ -fsanitize=fuzzer -fsanitize=address libpng_read_fuzzer.cc .libs/libpng16.a -lz -o fuzz

7. Run fuzzing campaign:

bash

./fuzz -close_fd_mask=3 -dict=./png.dict corpus/

Example 2: Simple Division Bug

Harness that finds a division-by-zero bug:

#include <stdint.h>
#include <stddef.h>
#include <string.h>

double divide(uint32_t numerator, uint32_t denominator) {
    // Bug: No check if denominator is zero
    return numerator / denominator;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    // memcpy avoids unaligned reads from the input buffer
    uint32_t numerator, denominator;
    memcpy(&numerator, data, sizeof(numerator));
    memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));

    divide(numerator, denominator);

    return 0;
}

Compile and fuzz:

bash

clang++ -fsanitize=fuzzer harness.cc -o fuzz
./fuzz

The fuzzer will quickly find inputs causing a crash.
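
Once the crash is found, the fix can be verified with the same harness. The following sketch shows one possible fix under stated assumptions: `safe_divide` is a hypothetical replacement for `divide` that reports failure instead of dividing by zero, and the harness checks that property with an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fixed variant: report failure instead of dividing by zero.
// Returns true and stores the quotient on success.
bool safe_divide(uint32_t numerator, uint32_t denominator, double *result) {
    if (denominator == 0) return false;
    *result = static_cast<double>(numerator) / denominator;
    return true;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size != 2 * sizeof(uint32_t)) return 0;

    uint32_t numerator, denominator;
    std::memcpy(&numerator, data, sizeof(numerator));
    std::memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));

    double quotient = 0.0;
    bool ok = safe_divide(numerator, denominator, &quotient);

    // Property check: division fails exactly when the denominator is zero.
    assert(ok == (denominator != 0));
    return 0;
}
```

Re-running the saved crash file against this build should now exit cleanly.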

Advanced Usage

Tips and Tricks

Tip	Why It Helps
Start with single-core, switch to AFL++ for multi-core	libFuzzer harnesses work with AFL++
Use dictionaries for structured formats	10-100x faster bug discovery
Close file descriptors with `-close_fd_mask=3`	Speed boost if SUT writes output
Set reasonable `-max_len`	Prevents wasted time on huge inputs
Run for days/weeks, not minutes	Coverage plateaus take time to break
Use seed corpus from test suites	Starts fuzzing from valid inputs

Structure-Aware Fuzzing

For highly structured inputs (e.g., complex protocols, file formats), use libprotobuf-mutator:

Define input structure using Protocol Buffers
libFuzzer mutates protobuf messages (structure-preserving mutations)
Harness converts protobuf to native format

See structure-aware fuzzing documentation for details.

Custom Mutators

libFuzzer allows custom mutators for specialized fuzzing:

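#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Provided by libFuzzer: applies its built-in mutations to Data in place.
extern "C" size_t LLVMFuzzerMutate(uint8_t *Data, size_t Size, size_t MaxSize);

extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
    // Custom mutation logic goes here; this placeholder defers to the
    // default mutator and returns the new input size.
    return LLVMFuzzerMutate(Data, Size, MaxSize);
}

extern "C" size_t LLVMFuzzerCustomCrossOver(const uint8_t *Data1, size_t Size1,
                                            const uint8_t *Data2, size_t Size2,
                                            uint8_t *Out, size_t MaxOutSize,
                                            unsigned int Seed) {
    // Custom crossover logic goes here; this placeholder copies as much of
    // the first input as fits and returns the number of bytes written.
    size_t new_size = Size1 < MaxOutSize ? Size1 : MaxOutSize;
    memcpy(Out, Data1, new_size);
    return new_size;
}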
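
As an illustration of what a structure-preserving mutator might look like, the sketch below assumes a hypothetical input format in which byte 0 holds the payload length and the payload follows. The format, the small random generator, and the choice of mutations are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical format: byte 0 holds the payload length, bytes 1..N the payload.
// The mutator changes the payload, then rewrites byte 0 so the input stays
// well-formed and can get past the parser's length check.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
    if (MaxSize < 2) return Size;  // no room for header plus payload

    // Small linear congruential generator seeded by libFuzzer, so the
    // mutation is deterministic for a given Seed.
    uint32_t state = Seed;
    auto next = [&state]() {
        state = state * 1664525u + 1013904223u;
        return state >> 16;
    };

    size_t payload = Size > 0 ? Size - 1 : 0;
    if (payload > 255) payload = 255;
    size_t limit = MaxSize - 1 < 255 ? MaxSize - 1 : 255;

    if (payload < limit && (payload == 0 || next() % 2 == 0)) {
        // Grow the payload by one pseudo-random byte.
        Data[1 + payload] = static_cast<uint8_t>(next());
        ++payload;
    } else {
        // Flip one bit in an existing payload byte.
        size_t index = 1 + next() % payload;
        uint8_t mask = static_cast<uint8_t>(1u << (next() % 8));
        Data[index] ^= mask;
    }

    Data[0] = static_cast<uint8_t>(payload);  // repair the length header
    return 1 + payload;
}
```

A plain byte-level mutation would often corrupt the length byte and be rejected early by the parser; repairing the header after each mutation keeps inputs valid enough to reach deeper code.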

Performance Tuning

Setting	Impact
`-close_fd_mask=3`	Closes stdout/stderr, speeds up fuzzing
`-max_len=<reasonable_size>`	Avoids wasting time on huge inputs
`-timeout=<seconds>`	Detects hangs, prevents stuck executions
Disable ASan for baseline	2-4x speed boost (but misses memory bugs)
Use `-jobs` and `-workers`	Limited multi-core support
Run on Linux	Best platform support and performance

Troubleshooting

Problem	Cause	Solution
No crashes found after hours	Poor corpus, low coverage	Add seed inputs, use dictionary, check harness
Very slow executions/sec (<100)	Target too complex, excessive logging	Optimize target, use `-close_fd_mask=3` , reduce logging
Out of memory	ASan's 20TB virtual memory	Set `-rss_limit_mb=0` to disable RSS limit
Fuzzer stops after first crash	Default behavior	Use `-fork=1 -ignore_crashes=1` to continue
Can't reproduce crash	Non-determinism in harness/target	Remove random number generation, global state
Linking errors with `-fsanitize=fuzzer`	Missing libFuzzer runtime	Ensure using Clang, check LLVM installation
GCC project won't compile with Clang	GCC-specific code	Switch to AFL++ with `gcc_plugin` instead
Coverage not improving	Corpus plateau	Run longer, add dictionary, improve seeds, check coverage report
Crashes but ASan doesn't trigger	Memory error not detected without ASan	Recompile with `-fsanitize=address`

Related Skills

Skill	Use Case
fuzz-harness-writing	Detailed guidance on writing effective harnesses, structure-aware fuzzing, and FuzzedDataProvider usage
address-sanitizer	Memory error detection configuration, ASAN_OPTIONS, and troubleshooting
undefined-behavior-sanitizer	Detecting undefined behavior during fuzzing
coverage-analysis	Measuring fuzzing effectiveness and identifying untested code paths
fuzzing-corpus	Building and managing seed corpora, corpus minimization strategies
fuzzing-dictionaries	Creating format-specific dictionaries for faster bug discovery

Skill	When to Consider
aflpp	When you need serious multi-core fuzzing, or when libFuzzer coverage plateaus
honggfuzz	When you want hardware-based coverage feedback on Linux
libafl	When building custom fuzzers or conducting fuzzing research

Official Documentation

LLVM libFuzzer Documentation - Official reference
libFuzzer Tutorial by Google - Step-by-step guide
SanitizerCoverage - Coverage instrumentation details

Advanced Topics

Example Projects

OSS-Fuzz - Continuous fuzzing for open-source projects (many libFuzzer examples)
AFL++ Dictionary Collection - Reusable dictionaries
