harness-writing

Writing Fuzzing Harnesses

编写Fuzzing Harnesses

A fuzzing harness is the entrypoint function that receives random data from the fuzzer and routes it to your system under test (SUT). The quality of your harness directly determines which code paths get exercised and whether critical bugs are found. A poorly written harness can miss entire subsystems or produce non-reproducible crashes.

Fuzzing Harness是从模糊测试器接收随机数据并将其传递给被测系统（SUT）的入口函数。Harness的质量直接决定了哪些代码路径会被执行，以及是否能发现关键漏洞。编写不佳的Harness可能会遗漏整个子系统，或者导致无法复现的崩溃。

Overview

概述

The harness is the bridge between the fuzzer's random byte generation and your application's API. It must parse raw bytes into meaningful inputs, call target functions, and handle edge cases gracefully. The most important part of any fuzzing setup is the harness—if written poorly, critical parts of your application may not be covered.

Harness是模糊测试器的随机字节生成器与应用程序API之间的桥梁。它必须将原始字节解析为有意义的输入，调用目标函数，并优雅地处理边缘情况。任何模糊测试设置中最重要的部分就是Harness——如果编写得不好，应用程序的关键部分可能无法被覆盖。

Key Concepts

核心概念

Concept	Description
Harness	Function that receives fuzzer input and calls target code under test
SUT	System Under Test—the code being fuzzed
Entry point	Function signature required by the fuzzer (e.g., `LLVMFuzzerTestOneInput`)
FuzzedDataProvider	Helper class for structured extraction of typed data from raw bytes
Determinism	Property that ensures same input always produces same behavior
Interleaved fuzzing	Single harness that exercises multiple operations based on input

概念	描述
Harness	接收模糊测试器输入并调用被测目标代码的函数
SUT	被测系统（System Under Test）——即被模糊测试的代码
入口点（Entry point）	模糊测试器要求的函数签名（例如 `LLVMFuzzerTestOneInput`）
FuzzedDataProvider	用于从原始字节中结构化提取类型化数据的辅助类
确定性（Determinism）	确保相同输入始终产生相同行为的特性
交错模糊测试（Interleaved fuzzing）	可根据输入执行多个操作的单一Harness

When to Apply

适用场景

Apply this technique when:

Creating a new fuzz target for the first time
Fuzz campaign has low code coverage or isn't finding bugs
Crashes found during fuzzing are not reproducible
Target API requires complex or structured inputs
Multiple related functions should be tested together

Skip this technique when:

Using existing well-tested harnesses from your project
Tool provides automatic harness generation that meets your needs
Target already has comprehensive fuzzing infrastructure

在以下场景中应用此技术：

首次创建新的模糊测试目标时
模糊测试活动的代码覆盖率低或未发现漏洞时
模糊测试中发现的崩溃无法复现时
目标API需要复杂或结构化的输入时
需要同时测试多个相关函数时

在以下场景中跳过此技术：

使用项目中已有的经过充分测试的Harness时
工具提供的自动Harness生成功能满足需求时
目标已具备全面的模糊测试基础设施时

Quick Reference

快速参考

Task	Pattern
Minimal C++ harness	`extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)`
Minimal Rust harness	`fuzz_target!(|data: &[u8]| { ... });`
Size validation	`if (size < MIN_SIZE) return 0;`
Cast to integers	`uint32_t val = *(uint32_t*)(data);`
Use FuzzedDataProvider	`FuzzedDataProvider fuzzed_data(data, size);`
Extract typed data (C++)	`auto val = fuzzed_data.ConsumeIntegral<uint32_t>();`
Extract string (C++)	`auto str = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);`

任务	实现模式
最简C++ Harness	`extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)`
最简Rust Harness	`fuzz_target!(|data: &[u8]| { ... });`
大小验证	`if (size < MIN_SIZE) return 0;`
转换为整数	`uint32_t val = *(uint32_t*)(data);`
使用FuzzedDataProvider	`FuzzedDataProvider fuzzed_data(data, size);`
提取类型化数据（C++）	`auto val = fuzzed_data.ConsumeIntegral<uint32_t>();`
提取字符串（C++）	`auto str = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);`

Step-by-Step

分步指南

Step 1: Identify Entry Points

步骤1：确定入口点

Find functions in your codebase that:

Accept external input (parsers, validators, protocol handlers)
Parse complex data formats (JSON, XML, binary protocols)
Perform security-critical operations (authentication, cryptography)
Have high cyclomatic complexity or many branches

Good targets are typically:

Protocol parsers
File format parsers
Serialization/deserialization functions
Input validation routines

在代码库中找到以下函数：

接收外部输入的函数（解析器、验证器、协议处理程序）
解析复杂数据格式的函数（JSON、XML、二进制协议）
执行安全关键操作的函数（身份验证、加密）
圈复杂度高或分支较多的函数

理想的测试目标通常是：

协议解析器
文件格式解析器
序列化/反序列化函数
输入验证例程

Step 2: Write Minimal Harness

步骤2：编写最简Harness

Start with the simplest possible harness that calls your target function:

C/C++:

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    target_function(data, size);
    return 0;
}

Rust:

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    target_function(data);
});

从调用目标函数的最简单Harness开始：

C/C++：

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    target_function(data, size);
    return 0;
}

Rust：

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    target_function(data);
});

Step 3: Add Input Validation

步骤3：添加输入验证

Reject inputs that are too small or too large to be meaningful:

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ensure minimum size for meaningful input
    if (size < MIN_INPUT_SIZE || size > MAX_INPUT_SIZE) {
        return 0;
    }
    target_function(data, size);
    return 0;
}

Rationale: The fuzzer generates random inputs of all sizes. Your harness must handle empty, tiny, huge, or malformed inputs without causing unexpected issues in the harness itself (crashes in the SUT are fine—that's what we're looking for).
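One edge case the size check alone does not cover: the fuzzer's buffer is raw bytes and is not NUL-terminated, so passing `data` straight to a C-string API can read past the end inside the harness. The following is a minimal sketch, assuming a hypothetical SUT entry point `parse_config(const char *)` and an illustrative 1024-byte cap; neither comes from a real library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical SUT stand-in: expects a NUL-terminated string and records
// the length it observed.
static size_t g_last_len = 0;
static void parse_config(const char *text) { g_last_len = std::strlen(text); }

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Illustrative bounds: skip empty inputs and cap the size.
    if (size == 0 || size > 1024) {
        return 0;
    }
    // std::string copies the bytes; c_str() is guaranteed to end with '\0'.
    const std::string text(reinterpret_cast<const char *>(data), size);
    parse_config(text.c_str());
    return 0;
}
```

The copy costs a little per iteration, but it keeps out-of-bounds reads in the harness from being reported as bugs in the SUT.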

拒绝过小或过大的无意义输入：

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // 确保输入达到有意义的最小尺寸
    if (size < MIN_INPUT_SIZE || size > MAX_INPUT_SIZE) {
        return 0;
    }
    target_function(data, size);
    return 0;
}

原理： 模糊测试器会生成各种尺寸的随机输入。你的Harness必须能够处理空输入、极小输入、超大输入或格式错误的输入，且自身不会出现意外问题（被测系统中的崩溃是可接受的——这正是我们要寻找的）。

Step 4: Structure the Input

步骤4：结构化输入

For APIs that require typed data (integers, strings, etc.), use casting or helpers like `FuzzedDataProvider`:

Simple casting:

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    uint32_t numerator = *(uint32_t*)(data);
    uint32_t denominator = *(uint32_t*)(data + sizeof(uint32_t));

    divide(numerator, denominator);
    return 0;
}

Using FuzzedDataProvider:

cpp

#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    concat(&str1[0], str1.size(), &str2[0], str2.size(), allocation_size);
    return 0;
}

对于需要类型化数据（整数、字符串等）的API，使用强制类型转换或 `FuzzedDataProvider` 等辅助工具：

简单强制类型转换：

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    uint32_t numerator = *(uint32_t*)(data);
    uint32_t denominator = *(uint32_t*)(data + sizeof(uint32_t));

    divide(numerator, denominator);
    return 0;
}

使用FuzzedDataProvider：

cpp

#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    concat(&str1[0], str1.size(), &str2[0], str2.size(), allocation_size);
    return 0;
}

Step 5: Test and Iterate

步骤5：测试与迭代

Run the fuzzer and monitor:

Code coverage (are all interesting paths reached?)
Executions per second (is it fast enough?)
Crash reproducibility (can you reproduce crashes with saved inputs?)

Iterate on the harness to improve these metrics.
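To check crash reproducibility, a saved input has to be fed back through the same entry point. A libFuzzer binary does this when given a file path instead of a corpus directory. For builds without the fuzzer runtime, a small replay helper can do the same job. The sketch below assumes a hypothetical helper named `ReplayFile` and uses a stand-in harness body that only records the input size.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in harness: a real one would call the SUT here.
static size_t g_seen = 0;
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)data;
    g_seen = size;
    return 0;
}

// Hypothetical helper: reads one saved input and runs it through the
// harness exactly once. Returns false if the file cannot be opened.
bool ReplayFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    const std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
    return true;
}
```

If the crash appears under the fuzzer but not when the same file is replayed in a fresh process, suspect non-determinism or leftover global state.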

运行模糊测试器并监控以下指标：

代码覆盖率（是否覆盖了所有重要路径？）
每秒执行次数（速度是否足够快？）
崩溃可复现性（能否使用保存的输入复现崩溃？）

迭代优化Harness以提升这些指标。

Common Patterns

常见模式

Pattern: Beyond Byte Arrays—Casting to Integers

模式：字节数组之外——转换为整数

Use Case: When target expects primitive types like integers or floats

Implementation:

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ensure exactly 2 4-byte numbers
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    // Split input into two integers
    uint32_t numerator = *(uint32_t*)(data);
    uint32_t denominator = *(uint32_t*)(data + sizeof(uint32_t));

    divide(numerator, denominator);
    return 0;
}

Rust equivalent:

rust

fuzz_target!(|data: &[u8]| {
    if data.len() != 2 * std::mem::size_of::<i32>() {
        return;
    }

    let numerator = i32::from_ne_bytes([data[0], data[1], data[2], data[3]]);
    let denominator = i32::from_ne_bytes([data[4], data[5], data[6], data[7]]);

    divide(numerator, denominator);
});

Why it works: Any 8-byte input is valid. The fuzzer learns that inputs must be exactly 8 bytes, and every bit flip produces a new, potentially interesting input.
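A variant worth considering: the pointer casts above assume the buffer is suitably aligned for `uint32_t`, which the fuzzer does not promise for every offset. Copying the bytes with `memcpy` decodes the same values without that assumption. This is a sketch only; `divide` here is a hypothetical stand-in that records its arguments.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical SUT stand-in: records the operands it received.
static uint32_t g_num = 0;
static uint32_t g_den = 0;
static void divide(uint32_t numerator, uint32_t denominator) {
    g_num = numerator;
    g_den = denominator;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Require exactly two 4-byte integers, as in the cast-based version.
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }
    uint32_t numerator = 0;
    uint32_t denominator = 0;
    // memcpy has no alignment requirement on the source pointer.
    std::memcpy(&numerator, data, sizeof(numerator));
    std::memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));
    divide(numerator, denominator);
    return 0;
}
```

Compilers typically turn these fixed-size copies into plain loads, so the change should not cost measurable throughput.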

适用场景： 目标函数期望整数或浮点数等基本类型时

实现：

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // 确保输入恰好包含2个4字节数字
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    // 将输入拆分为两个整数
    uint32_t numerator = *(uint32_t*)(data);
    uint32_t denominator = *(uint32_t*)(data + sizeof(uint32_t));

    divide(numerator, denominator);
    return 0;
}

Rust等效实现：

rust

fuzz_target!(|data: &[u8]| {
    if data.len() != 2 * std::mem::size_of::<i32>() {
        return;
    }

    let numerator = i32::from_ne_bytes([data[0], data[1], data[2], data[3]]);
    let denominator = i32::from_ne_bytes([data[4], data[5], data[6], data[7]]);

    divide(numerator, denominator);
});

优势： 任何8字节输入都是有效的。模糊测试器会学习到输入必须恰好为8字节，且每一位的翻转都会产生新的、可能有趣的输入。

Pattern: FuzzedDataProvider for Complex Inputs

模式：使用FuzzedDataProvider处理复杂输入

Use Case: When target requires multiple strings, integers, or variable-length data

Implementation:

cpp

#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    // Extract different types of data
    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();

    // Consume variable-length strings with terminator
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    char* result = concat(&str1[0], str1.size(), &str2[0], str2.size(), allocation_size);
    if (result != NULL) {
        free(result);
    }

    return 0;
}

Why it helps: `FuzzedDataProvider` handles the complexity of extracting structured data from a byte stream. It's particularly useful for APIs that need multiple parameters of different types.

适用场景： 目标函数需要多个字符串、整数或可变长度数据时

实现：

cpp

#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    // 提取不同类型的数据
    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();

    // 提取带终止符的可变长度字符串
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    char* result = concat(&str1[0], str1.size(), &str2[0], str2.size(), allocation_size);
    if (result != NULL) {
        free(result);
    }

    return 0;
}

优势： `FuzzedDataProvider` 处理了从字节流中结构化提取数据的复杂性。对于需要多个不同类型参数的API来说特别有用。

Pattern: Interleaved Fuzzing

模式：交错模糊测试

Use Case: When multiple related operations should be tested in a single harness

Implementation:

cpp

#include <cstdint>
#include <cstdio>
#include <cstring>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1 + 2 * sizeof(int32_t)) {
        return 0;
    }

    // First byte selects operation
    uint8_t mode = data[0];

    // Next bytes are operands
    int32_t numbers[2];
    memcpy(numbers, data + 1, 2 * sizeof(int32_t));

    int32_t result = 0;
    switch (mode % 4) {
        case 0:
            result = add(numbers[0], numbers[1]);
            break;
        case 1:
            result = subtract(numbers[0], numbers[1]);
            break;
        case 2:
            result = multiply(numbers[0], numbers[1]);
            break;
        case 3:
            result = divide(numbers[0], numbers[1]);
            break;
    }

    // Prevent compiler from optimizing away the calls
    printf("%d", result);
    return 0;
}

Advantages:

Faster to write one harness than multiple individual harnesses
Single shared corpus means interesting inputs for one operation may be interesting for others
Can discover bugs in interactions between operations

When to use:

Operations share similar input types
Operations are logically related (e.g., arithmetic operations, CRUD operations)
Single corpus makes sense across all operations
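To reach bugs that only show up when operations interact, the same idea can be extended so that one input encodes a whole sequence of operations against a single object. The sketch below assumes a hypothetical `BoundedStack` class as the SUT; the opcode layout (one selector byte, plus one operand byte for push) is an illustrative choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SUT: a stack that refuses to grow past 16 items.
class BoundedStack {
public:
    bool push(uint8_t v) {
        if (items_.size() >= 16) return false;
        items_.push_back(v);
        return true;
    }
    bool pop() {
        if (items_.empty()) return false;
        items_.pop_back();
        return true;
    }
    void clear() { items_.clear(); }
    size_t size() const { return items_.size(); }

private:
    std::vector<uint8_t> items_;
};

// Recorded so the final state can be inspected in this sketch.
static size_t g_final_size = 0;

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    BoundedStack stack;  // fresh object each iteration keeps runs independent
    size_t i = 0;
    while (i < size) {
        const uint8_t op = data[i++] % 3;  // selector byte picks the operation
        if (op == 0) {
            if (i == size) break;          // push needs an operand byte
            stack.push(data[i++]);
        } else if (op == 1) {
            stack.pop();
        } else {
            stack.clear();
        }
    }
    g_final_size = stack.size();
    return 0;
}
```

Because the object is created inside the harness, any crash still depends only on the bytes of the current input.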

适用场景： 需要在单个Harness中测试多个相关操作时

实现：

cpp

#include <cstdint>
#include <cstdio>
#include <cstring>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1 + 2 * sizeof(int32_t)) {
        return 0;
    }

    // 第一个字节选择操作类型
    uint8_t mode = data[0];

    // 后续字节为操作数
    int32_t numbers[2];
    memcpy(numbers, data + 1, 2 * sizeof(int32_t));

    int32_t result = 0;
    switch (mode % 4) {
        case 0:
            result = add(numbers[0], numbers[1]);
            break;
        case 1:
            result = subtract(numbers[0], numbers[1]);
            break;
        case 2:
            result = multiply(numbers[0], numbers[1]);
            break;
        case 3:
            result = divide(numbers[0], numbers[1]);
            break;
    }

    // 防止编译器优化掉函数调用
    printf("%d", result);
    return 0;
}

优势：

编写一个Harness比编写多个独立Harness更快
单一共享语料库意味着针对一个操作的有趣输入可能对其他操作也有用
可以发现操作之间的交互漏洞

适用时机：

操作使用相似的输入类型时
操作在逻辑上相关时（例如算术运算、CRUD操作）
单一语料库对所有操作都有意义时

Pattern: Structure-Aware Fuzzing with Arbitrary (Rust)

模式：使用Arbitrary进行感知结构的模糊测试（Rust）

Use Case: When fuzzing Rust code that uses custom structs

Implementation:

rust

use arbitrary::Arbitrary;
use std::process;

#[derive(Debug, Arbitrary)]
pub struct Name {
    data: String
}

impl Name {
    pub fn check_buf(&self) {
        let data = self.data.as_bytes();
        if data.len() > 0 && data[0] == b'a' {
            if data.len() > 1 && data[1] == b'b' {
                if data.len() > 2 && data[2] == b'c' {
                    process::abort();
                }
            }
        }
    }
}

Harness with arbitrary:

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: your_project::Name| {
    data.check_buf();
});

Add to Cargo.toml:

toml

[dependencies]
arbitrary = { version = "1", features = ["derive"] }

Why it helps: The `arbitrary` crate automatically handles deserialization of raw bytes into your Rust structs, reducing boilerplate and ensuring valid struct construction.

Limitation: The arbitrary crate doesn't offer reverse serialization, so you can't manually construct byte arrays that map to specific structs. This works best when starting from an empty corpus (fine for libFuzzer, problematic for AFL++).

适用场景： 模糊测试使用自定义结构体的Rust代码时

实现：

rust

use arbitrary::Arbitrary;
use std::process;

#[derive(Debug, Arbitrary)]
pub struct Name {
    data: String
}

impl Name {
    pub fn check_buf(&self) {
        let data = self.data.as_bytes();
        if data.len() > 0 && data[0] == b'a' {
            if data.len() > 1 && data[1] == b'b' {
                if data.len() > 2 && data[2] == b'c' {
                    process::abort();
                }
            }
        }
    }
}

使用Arbitrary的Harness：

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: your_project::Name| {
    data.check_buf();
});

添加到Cargo.toml：

toml

[dependencies]
arbitrary = { version = "1", features = ["derive"] }

优势： `arbitrary` crate 自动处理将原始字节反序列化为Rust结构体的过程，减少样板代码并确保结构体的有效构造。

局限性： arbitrary crate 不支持反向序列化，因此无法手动构造映射到特定结构体的字节数组。这在从空语料库开始时效果很好（适用于libFuzzer），但对于AFL++来说可能存在问题。

Advanced Usage

高级用法

Tips and Tricks

技巧与窍门

Tip	Why It Helps
Start with parsers	High bug density, clear entry points, easy to harness
Mock I/O operations	Prevents hangs from blocking I/O, enables determinism
Use FuzzedDataProvider	Simplifies extraction of structured data from raw bytes
Reset global state	Ensures each iteration is independent and reproducible
Free resources in harness	Prevents memory exhaustion during long campaigns
Avoid logging in harness	Logging is slow—fuzzing needs 100s-1000s exec/sec
Test harness manually first	Run harness with known inputs before starting campaign
Check coverage early	Ensure harness reaches expected code paths

技巧	优势
从解析器开始	漏洞密度高，入口点清晰，易于编写Harness
模拟I/O操作	防止阻塞I/O导致的挂起，确保确定性
使用FuzzedDataProvider	简化从原始字节中提取结构化数据的过程
重置全局状态	确保每次迭代独立且可复现
在Harness中释放资源	防止长时间测试活动中的内存耗尽
避免在Harness中记录日志	日志记录速度慢——模糊测试需要每秒执行数百到数千次
先手动测试Harness	在开始测试活动前，使用已知输入运行Harness
尽早检查覆盖率	确保Harness覆盖了预期的代码路径

Structure-Aware Fuzzing with Protocol Buffers

使用Protocol Buffers进行感知结构的模糊测试

For highly structured input formats, consider using Protocol Buffers as an intermediate format with custom mutators:

cpp

// Define your input format in .proto file
// Use libprotobuf-mutator to generate valid mutations
// This ensures fuzzer mutates message contents, not the protobuf encoding itself

This approach takes more setup but prevents the fuzzer from wasting time on unparseable inputs. See structure-aware fuzzing documentation for details.

对于高度结构化的输入格式，可以考虑使用Protocol Buffers作为中间格式，并配合自定义变异器：

cpp

// 在.proto文件中定义输入格式
// 使用libprotobuf-mutator生成有效的变异
// 确保模糊测试器变异消息内容，而非protobuf编码本身

这种方法设置更复杂，但可以防止模糊测试器在无法解析的输入上浪费时间。详情请参阅感知结构的模糊测试文档。

Handling Non-Determinism

处理非确定性

Problem: Random values or timing dependencies cause non-reproducible crashes.

Solutions:

Replace `rand()` with a deterministic PRNG seeded from fuzzer input:

cpp

uint32_t seed = fuzzed_data.ConsumeIntegral<uint32_t>();
srand(seed);

Mock system calls that return time, PIDs, or random data
Avoid reading from `/dev/random` or `/dev/urandom`
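Mocking a time source usually needs a seam in the SUT where the harness can swap in a fake. The sketch below assumes the SUT reads the clock through a hypothetical function pointer `g_clock`, and that `check_token` is a stand-in for time-dependent logic. The harness takes both the "current time" and the expiry from the input, so the same bytes always give the same result.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ctime>

// Hypothetical seam: the SUT asks for the time through this pointer.
using ClockFn = std::time_t (*)();
static std::time_t RealClock() { return std::time(nullptr); }
static ClockFn g_clock = RealClock;

// Hypothetical SUT stand-in: a token is expired once the clock reaches it.
static bool g_last_expired = false;
static void check_token(std::time_t expires_at) {
    g_last_expired = g_clock() >= expires_at;
}

// Deterministic replacement clock controlled by the harness.
static std::time_t g_fake_now = 0;
static std::time_t FakeClock() { return g_fake_now; }

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 2 * sizeof(uint32_t)) {
        return 0;
    }
    uint32_t now = 0;
    uint32_t expires = 0;
    std::memcpy(&now, data, sizeof(now));
    std::memcpy(&expires, data + sizeof(now), sizeof(expires));

    g_fake_now = static_cast<std::time_t>(now);
    g_clock = FakeClock;  // the SUT never sees wall-clock time
    check_token(static_cast<std::time_t>(expires));
    return 0;
}
```

Taking the time from the input also lets the fuzzer explore both the expired and the not-yet-expired branch.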

问题： 随机值或时间依赖导致无法复现的崩溃。

解决方案：

使用从模糊测试器输入中生成的种子替换 `rand()`：

cpp

uint32_t seed = fuzzed_data.ConsumeIntegral<uint32_t>();
srand(seed);

模拟返回时间、PID或随机数据的系统调用
避免读取 `/dev/random` 或 `/dev/urandom`

Resetting Global State

重置全局状态

If your SUT uses global state (singletons, static variables), reset it between iterations:

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Reset global state before each iteration
    global_reset();

    target_function(data, size);

    // Clean up resources
    global_cleanup();
    return 0;
}

Rationale: Global state can cause crashes after N iterations rather than on a specific input, making bugs non-reproducible.
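The effect is easiest to see with a toy example. In the sketch below the hypothetical SUT keeps a global entry count and misbehaves once it exceeds three. Without the reset, the failure would appear on the fourth iteration regardless of input; with it, every iteration starts from the same state. All names here are illustrative stand-ins.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical SUT with hidden global state.
static int g_cache_entries = 0;
static bool g_overflowed = false;  // stands in for a crash

static void global_reset() { g_cache_entries = 0; }

static void target_function(const uint8_t *data, size_t size) {
    (void)data;
    if (size > 0) {
        ++g_cache_entries;  // state that would otherwise leak across iterations
    }
    if (g_cache_entries > 3) {
        g_overflowed = true;
    }
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Start every iteration from the same state.
    global_reset();
    target_function(data, size);
    return 0;
}
```

Running the same input ten times in a row leaves the counter at one and never trips the failure flag.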

如果被测系统使用全局状态（单例、静态变量），请在每次迭代之间重置：

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // 在每次迭代开始前重置全局状态
    global_reset();

    target_function(data, size);

    // 清理资源
    global_cleanup();
    return 0;
}

原理： 全局状态可能导致崩溃发生在N次迭代后，而非特定输入，从而使漏洞无法复现。

Practical Harness Rules

实用Harness规则

Follow these rules to ensure effective fuzzing harnesses:

Rule	Rationale
Handle all input sizes	Fuzzer generates empty, tiny, huge inputs—harness must handle gracefully
Never call `exit()`	Calling `exit()` stops the fuzzer process. Use `abort()` in SUT if needed
Join all threads	Each iteration must run to completion before next iteration starts
Be fast	Aim for 100s-1000s executions/sec. Avoid logging, high complexity, excess memory
Maintain determinism	Same input must always produce same behavior for reproducibility
Avoid global state	Global state reduces reproducibility—reset between iterations if unavoidable
Use narrow targets	Don't fuzz PNG and TCP in same harness—different formats need separate targets
Free resources	Prevent memory leaks that cause resource exhaustion during long campaigns

Note: These guidelines apply not just to harness code, but to the entire SUT. If the SUT violates these rules, consider patching it (see the fuzzing obstacles technique).

遵循以下规则以确保高效的模糊测试Harness：

规则	原理
处理所有输入尺寸	模糊测试器会生成空输入、极小输入、超大输入——Harness必须优雅处理
切勿调用 `exit()`	调用 `exit()` 会终止模糊测试器进程。如果需要，可在被测系统中使用 `abort()`
等待所有线程结束	每次迭代必须在下次迭代开始前完成执行
保持快速	目标为每秒执行数百到数千次。避免日志记录、高复杂度操作和过多内存使用
保持确定性	相同输入必须始终产生相同行为，以确保可复现性
避免全局状态	全局状态会降低可复现性——如果无法避免，请在迭代之间重置
使用窄范围目标	不要在同一个Harness中同时模糊测试PNG和TCP——不同格式需要单独的测试目标
释放资源	防止长时间测试活动中因内存泄漏导致的资源耗尽

注意： 这些准则不仅适用于Harness代码，也适用于整个被测系统。如果被测系统违反这些规则，请考虑对其进行修补（请参阅模糊测试障碍技术）。

Anti-Patterns

反模式

Anti-Pattern	Problem	Correct Approach
Global state without reset	Non-deterministic crashes	Reset all globals at start of harness
Blocking I/O or network calls	Hangs fuzzer, wastes time	Mock I/O, use in-memory buffers
Memory leaks in harness	Resource exhaustion kills campaign	Free all allocations before returning
Calling `exit()` in SUT	Stops entire fuzzing process	Use `abort()` or return error codes
Heavy logging in harness	Reduces exec/sec by orders of magnitude	Disable logging during fuzzing
Too many operations per iteration	Slows down fuzzer	Keep iterations fast and focused
Mixing unrelated input formats	Corpus entries not useful across formats	Separate harnesses for different formats
Not validating input size	Harness crashes on edge cases	Check `size` before accessing `data`
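As an illustration of the in-memory buffer approach for I/O-bound code: if the SUT reads from a stream, the harness can wrap the fuzzer bytes in a string stream instead of touching files or sockets. This sketch assumes a hypothetical SUT function `count_lines(std::istream &)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical SUT: consumes a stream (a file or socket wrapper in
// production) and returns the number of lines it read.
static size_t count_lines(std::istream &in) {
    size_t n = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++n;
    }
    return n;
}

// Recorded so the result can be inspected in this sketch.
static size_t g_lines = 0;

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Wrap the fuzzer bytes in an in-memory stream: nothing can block.
    const std::string bytes(reinterpret_cast<const char *>(data), size);
    std::istringstream in(bytes);
    g_lines = count_lines(in);
    return 0;
}
```

The SUT code is unchanged; only the stream it is handed differs between production and fuzzing.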

反模式	问题	正确做法
全局状态未重置	非确定性崩溃	在Harness开始时重置所有全局状态
阻塞I/O或网络调用	导致模糊测试器挂起，浪费时间	模拟I/O操作，使用内存缓冲区
Harness中的内存泄漏	资源耗尽导致测试活动终止	在返回前释放所有分配的内存
在被测系统中调用 `exit()`	终止整个模糊测试进程	使用 `abort()` 或返回错误码
Harness中大量日志记录	导致每秒执行次数大幅下降	模糊测试期间禁用日志记录
每次迭代执行过多操作	减慢模糊测试器速度	保持迭代快速且聚焦
混合不相关的输入格式	语料库条目对其他格式无用	为不同格式使用单独的Harness
未验证输入尺寸	Harness在边缘情况中崩溃	在访问 `data` 前检查 `size`

Tool-Specific Guidance

工具特定指南

libFuzzer

Harness signature:

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Your code here
    return 0;  // -1 asks libFuzzer not to add the input to the corpus; other non-zero values are reserved
}

Compilation:

bash

clang++ -fsanitize=fuzzer,address -g harness.cc -o fuzz_target

Integration tips:

Use `FuzzedDataProvider.h` for structured input extraction
Compile with `-fsanitize=fuzzer` to link the fuzzing runtime
Add sanitizers (`-fsanitize=address,undefined`) to detect more bugs
Use `-g` for better stack traces when crashes occur
libFuzzer can start with empty corpus—no seed inputs required

Running:

bash

./fuzz_target corpus_dir/

Harness签名：

cpp

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // 你的代码
    return 0;  // 返回-1表示请libFuzzer不要将该输入加入语料库；其他非零值为保留值
}

编译：

bash

clang++ -fsanitize=fuzzer,address -g harness.cc -o fuzz_target

集成技巧：

使用 `FuzzedDataProvider.h` 进行结构化输入提取
编译时添加 `-fsanitize=fuzzer` 以链接模糊测试运行时
添加 sanitizer（`-fsanitize=address,undefined`）以检测更多漏洞
使用 `-g` 以在崩溃时获得更好的堆栈跟踪
libFuzzer可从空语料库开始——无需种子输入

运行：

bash

./fuzz_target corpus_dir/

AFL++

AFL++ supports multiple harness styles. For best performance, use persistent mode:

Persistent mode harness:

cpp

#include <unistd.h>

// Upper bound on input size (illustrative value; tune for your target)
#define MAX_SIZE 4096

int main(int argc, char **argv) {
    #ifdef __AFL_HAVE_MANUAL_CONTROL
        __AFL_INIT();
    #endif

    unsigned char buf[MAX_SIZE];

    while (__AFL_LOOP(10000)) {
        // Read input from stdin
        ssize_t len = read(0, buf, sizeof(buf));
        if (len <= 0) break;

        // Call target function
        target_function(buf, len);
    }

    return 0;
}

Compilation:

bash

afl-clang-fast++ -g harness.cc -o fuzz_target

Integration tips:

Use persistent mode (`__AFL_LOOP`) for 10-100x speedup
Consider deferred initialization (`__AFL_INIT()`) to skip setup overhead
AFL++ requires at least one seed input in the corpus directory
Use `AFL_USE_ASAN=1` or `AFL_USE_UBSAN=1` for sanitizer builds

Running:

bash

afl-fuzz -i seeds/ -o findings/ -- ./fuzz_target

AFL++支持多种Harness风格。为获得最佳性能，请使用持久化模式：

持久化模式Harness：

cpp

#include <unistd.h>

// 输入大小上限（示例值，请根据目标调整）
#define MAX_SIZE 4096

int main(int argc, char **argv) {
    #ifdef __AFL_HAVE_MANUAL_CONTROL
        __AFL_INIT();
    #endif

    unsigned char buf[MAX_SIZE];

    while (__AFL_LOOP(10000)) {
        // 从标准输入读取输入
        ssize_t len = read(0, buf, sizeof(buf));
        if (len <= 0) break;

        // 调用目标函数
        target_function(buf, len);
    }

    return 0;
}

编译：

bash

afl-clang-fast++ -g harness.cc -o fuzz_target

集成技巧：

使用持久化模式（`__AFL_LOOP`）可获得10-100倍的速度提升
考虑使用延迟初始化（`__AFL_INIT()`）以跳过设置开销
AFL++要求语料库目录中至少有一个种子输入
使用 `AFL_USE_ASAN=1` 或 `AFL_USE_UBSAN=1` 启用sanitizer构建

运行：

bash

afl-fuzz -i seeds/ -o findings/ -- ./fuzz_target

cargo-fuzz (Rust)

cargo-fuzz（Rust）

Harness signature:

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Your code here
});

With structured input (arbitrary crate):

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: YourStruct| {
    data.check();
});

Creating harness:

bash

cargo fuzz init
cargo fuzz add my_target

Integration tips:

Use the `arbitrary` crate for automatic struct deserialization
cargo-fuzz wraps libFuzzer, so all libFuzzer features work
Compile with sanitizers automatically via cargo-fuzz
Harnesses go in the `fuzz/fuzz_targets/` directory

Running:

bash

cargo +nightly fuzz run my_target

Harness签名：

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // 你的代码
});

使用结构化输入（arbitrary crate）：

rust

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: YourStruct| {
    data.check();
});

创建Harness：

bash

cargo fuzz init
cargo fuzz add my_target

集成技巧：

使用 `arbitrary` crate 自动进行结构体反序列化
cargo-fuzz封装了libFuzzer，因此所有libFuzzer功能都可使用
通过cargo-fuzz自动启用sanitizer
Harness存放在 `fuzz/fuzz_targets/` 目录中

运行：

bash

cargo +nightly fuzz run my_target

go-fuzz

Harness signature:

// +build gofuzz

package mypackage

func Fuzz(data []byte) int {
    // Call target function
    target(data)

    // Return codes:
    // -1 if the input must not be added to the corpus, even if it adds coverage
    //  0 for default handling
    //  1 if the fuzzer should raise the input's priority (e.g., it parsed successfully)
    return 0
}

Building:

bash

go-fuzz-build

Integration tips:

Return 1 to raise the priority of an input (e.g., one that parsed successfully); new coverage is detected automatically either way
Return -1 to keep an input out of the corpus even if it adds coverage
go-fuzz handles persistence automatically

Running:

bash

go-fuzz -bin=./mypackage-fuzz.zip -workdir=fuzz

Harness签名：

// +build gofuzz

package mypackage

func Fuzz(data []byte) int {
    // 调用目标函数
    target(data)

    // 返回码：
    // -1 表示即使带来新覆盖率，也不得将该输入加入语料库
    //  0 表示按默认方式处理
    //  1 表示模糊测试器应提高该输入的优先级（例如输入已成功解析）
    return 0
}

构建：

bash

go-fuzz-build

集成技巧：

返回1可提高输入的优先级（例如已成功解析的输入）；新覆盖率由模糊测试器自动检测
返回-1可阻止输入进入语料库，即使它带来了新的覆盖率
go-fuzz自动处理持久化

运行：

bash

go-fuzz -bin=./mypackage-fuzz.zip -workdir=fuzz

Troubleshooting

故障排除

Issue	Cause	Solution
Low executions/sec	Harness is too slow (logging, I/O, complexity)	Profile harness, remove bottlenecks, mock I/O
No crashes found	Coverage not reaching buggy code	Check coverage, improve harness to reach more paths
Non-reproducible crashes	Non-determinism or global state	Remove randomness, reset globals between iterations
Fuzzer exits immediately	Harness calls `exit()`	Replace `exit()` with `abort()` or return error
Out of memory errors	Memory leaks in harness or SUT	Free allocations, use leak sanitizer to find leaks
Crashes on empty input	Harness doesn't validate size	Add `if (size < MIN_SIZE) return 0;`
Corpus not growing	Inputs too constrained or format too strict	Use FuzzedDataProvider or structure-aware fuzzing

问题	原因	解决方案
每秒执行次数低	Harness过慢（日志记录、I/O、复杂度高）	分析Harness性能，移除瓶颈，模拟I/O
未发现崩溃	覆盖率未达到有漏洞的代码	检查覆盖率，优化Harness以覆盖更多路径
崩溃无法复现	非确定性或全局状态	移除随机性，在迭代之间重置全局状态
模糊测试器立即退出	Harness调用了 `exit()`	将 `exit()` 替换为 `abort()` 或返回错误
内存不足错误	Harness或被测系统存在内存泄漏	释放分配的内存，使用泄漏sanitizer查找泄漏点
空输入导致崩溃	Harness未验证输入尺寸	添加 `if (size < MIN_SIZE) return 0;`
语料库未增长	输入限制过严或格式过于严格	使用FuzzedDataProvider或感知结构的模糊测试

Related Skills

相关技术

Tools That Use This Technique

使用此技术的工具

Skill	How It Applies
libfuzzer	Uses `LLVMFuzzerTestOneInput` harness signature with FuzzedDataProvider
aflpp	Supports persistent mode harnesses with `__AFL_LOOP` for performance
cargo-fuzz	Uses Rust-specific `fuzz_target!` macro with arbitrary crate integration
atheris	Python harness takes bytes, calls Python functions
ossfuzz	Requires harnesses in specific directory structure for cloud fuzzing

技术	应用方式
libfuzzer	使用 `LLVMFuzzerTestOneInput` Harness签名和FuzzedDataProvider
aflpp	支持使用 `__AFL_LOOP` 的持久化模式Harness以提升性能
cargo-fuzz	使用Rust特定的 `fuzz_target!` 宏并集成arbitrary crate
atheris	Python Harness接收字节并调用Python函数
ossfuzz	要求Harness存放在特定目录结构中以进行云端模糊测试

Related Techniques

相关技术

Skill	Relationship
coverage-analysis	Measure harness effectiveness—are you reaching target code?
address-sanitizer	Detects bugs found by harness (buffer overflows, use-after-free)
fuzzing-dictionary	Provide tokens to help fuzzer pass format checks in harness
fuzzing-obstacles	Patch SUT when it violates harness rules (exit, non-determinism)

技术	关系
coverage-analysis	衡量Harness的有效性——是否覆盖了目标代码？
address-sanitizer	检测Harness发现的漏洞（缓冲区溢出、释放后使用）
fuzzing-dictionary	提供令牌以帮助模糊测试器通过Harness中的格式检查
fuzzing-obstacles	当被测系统违反Harness规则（调用exit()、非确定性）时对其进行修补

Resources

资源

Key External Resources

关键外部资源

Split Inputs in libFuzzer - Google Fuzzing Docs: Explains techniques for handling multiple input parameters in a single fuzzing harness, including use of magic separators and FuzzedDataProvider.

Structure-Aware Fuzzing with Protocol Buffers: Advanced technique using protobuf as intermediate format with custom mutators to ensure fuzzer mutates message contents rather than format encoding.

libFuzzer Documentation: Official LLVM documentation covering harness requirements, best practices, and advanced features.

cargo-fuzz Book: Comprehensive guide to writing Rust fuzzing harnesses with cargo-fuzz and the arbitrary crate.

libFuzzer中的输入拆分 - Google模糊测试文档：解释了在单个模糊测试Harness中处理多个输入参数的技术，包括使用魔术分隔符和FuzzedDataProvider。

使用Protocol Buffers进行感知结构的模糊测试：高级技术，使用protobuf作为中间格式并配合自定义变异器，确保模糊测试器变异消息内容而非格式编码。

libFuzzer文档：官方LLVM文档，涵盖Harness要求、最佳实践和高级功能。

cargo-fuzz手册：关于使用cargo-fuzz和arbitrary crate编写Rust模糊测试Harness的综合指南。

Video Resources

视频资源

Effective File Format Fuzzing - Conference talk on writing harnesses for file format parsers
Modern Fuzzing of C/C++ Projects - Tutorial covering harness design patterns

高效文件格式模糊测试 - 关于为文件格式解析器编写Harness的会议演讲
C/C++项目的现代模糊测试 - 涵盖Harness设计模式的教程
