harness-writing


Writing Fuzzing Harnesses

A fuzzing harness is the entrypoint function that receives random data from the fuzzer and routes it to your system under test (SUT). The quality of your harness directly determines which code paths get exercised and whether critical bugs are found. A poorly written harness can miss entire subsystems or produce non-reproducible crashes.

Overview

The harness is the bridge between the fuzzer's random byte generation and your application's API. It must parse raw bytes into meaningful inputs, call target functions, and handle edge cases gracefully. The most important part of any fuzzing setup is the harness—if written poorly, critical parts of your application may not be covered.

Key Concepts

| Concept | Description |
| --- | --- |
| Harness | Function that receives fuzzer input and calls target code under test |
| SUT | System Under Test: the code being fuzzed |
| Entry point | Function signature required by the fuzzer (e.g., LLVMFuzzerTestOneInput) |
| FuzzedDataProvider | Helper class for structured extraction of typed data from raw bytes |
| Determinism | Property that ensures the same input always produces the same behavior |
| Interleaved fuzzing | Single harness that exercises multiple operations based on input |

When to Apply

Apply this technique when:
  • Creating a new fuzz target for the first time
  • Fuzz campaign has low code coverage or isn't finding bugs
  • Crashes found during fuzzing are not reproducible
  • Target API requires complex or structured inputs
  • Multiple related functions should be tested together
Skip this technique when:
  • Using existing well-tested harnesses from your project
  • Tool provides automatic harness generation that meets your needs
  • Target already has comprehensive fuzzing infrastructure

Quick Reference

| Task | Pattern |
| --- | --- |
| Minimal C++ harness | extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) |
| Minimal Rust harness | fuzz_target!(\|data: &[u8]\| { ... }) |
| Size validation | if (size < MIN_SIZE) return 0; |
| Cast to integers | uint32_t val; memcpy(&val, data, sizeof(val)); |
| Use FuzzedDataProvider | FuzzedDataProvider fuzzed_data(data, size); |
| Extract typed data (C++) | auto val = fuzzed_data.ConsumeIntegral<uint32_t>(); |
| Extract string (C++) | auto str = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF); |

Step-by-Step

Step 1: Identify Entry Points

Find functions in your codebase that:
  • Accept external input (parsers, validators, protocol handlers)
  • Parse complex data formats (JSON, XML, binary protocols)
  • Perform security-critical operations (authentication, cryptography)
  • Have high cyclomatic complexity or many branches
Good targets are typically:
  • Protocol parsers
  • File format parsers
  • Serialization/deserialization functions
  • Input validation routines

Step 2: Write Minimal Harness

Start with the simplest possible harness that calls your target function:
C/C++:
cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    target_function(data, size);
    return 0;
}
Rust:
rust
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    target_function(data);
});

Step 3: Add Input Validation

Reject inputs that are too small or too large to be meaningful:
cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ensure minimum size for meaningful input
    if (size < MIN_INPUT_SIZE || size > MAX_INPUT_SIZE) {
        return 0;
    }
    target_function(data, size);
    return 0;
}
Rationale: The fuzzer generates random inputs of all sizes. Your harness must handle empty, tiny, huge, or malformed inputs without causing unexpected issues in the harness itself (crashes in the SUT are fine—that's what we're looking for).
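To make the size check concrete, here is a minimal sketch. The target parse_record, the helper has_valid_size, and both size limits are hypothetical names chosen for illustration; the assumption is that the target's contract requires at least a 4-byte header.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative bounds; choose values that match your own input format.
constexpr size_t kMinInputSize = 4;     // assumed 4-byte header
constexpr size_t kMaxInputSize = 4096;  // assumed upper limit

// Hypothetical target whose contract requires size >= kMinInputSize.
static uint32_t parse_record(const uint8_t *data, size_t size) {
    uint32_t tag = 0;
    for (size_t i = 0; i < kMinInputSize; ++i) {
        tag = (tag << 8) | data[i];
    }
    return tag + static_cast<uint32_t>(size);
}

// Returns true when the input length is within the accepted range.
static bool has_valid_size(size_t size) {
    return size >= kMinInputSize && size <= kMaxInputSize;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (!has_valid_size(size)) {
        return 0;  // Rejected before the target reads any bytes
    }
    (void)parse_record(data, size);
    return 0;
}
```

Keeping the check in one small function makes it easy to confirm that the empty input and oversized inputs never reach the target.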

Step 4: Structure the Input

For APIs that require typed data (integers, strings, etc.), use casting or helpers like FuzzedDataProvider:
Simple casting:
cpp
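#include <stdint.h>
#include <string.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    // Copy the bytes out; memcpy avoids the unaligned, aliasing-unsafe
    // reads that a direct pointer cast can cause
    uint32_t numerator, denominator;
    memcpy(&numerator, data, sizeof(numerator));
    memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));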

    divide(numerator, denominator);
    return 0;
}
Using FuzzedDataProvider:
cpp
#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    concat(&str1[0], str1.size(), &str2[0], str2.size(), allocation_size);
    return 0;
}

Step 5: Test and Iterate

Run the fuzzer and monitor:
  • Code coverage (are all interesting paths reached?)
  • Executions per second (is it fast enough?)
  • Crash reproducibility (can you reproduce crashes with saved inputs?)
Iterate on the harness to improve these metrics.
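One way to check crash reproducibility outside the fuzzing engine is a small replay helper that feeds a saved input file back into the harness. The sketch below is an illustration under assumptions: ReplayFile is a hypothetical helper, and the harness body is a stub standing in for your real one.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stub harness so the sketch is self-contained; in practice this is your
// real harness, defined in another file.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)data;
    (void)size;
    return 0;
}

// Hypothetical helper: reads one saved input and runs the harness on it.
// Returns -1 if the file cannot be opened, else the harness return value.
int ReplayFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return -1;
    }
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());
    return LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
}
```

With libFuzzer specifically, passing a saved crash file as an argument to the fuzz binary runs just that input, so a helper like this is mainly useful for other setups or for unit tests.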

Common Patterns

Pattern: Beyond Byte Arrays—Casting to Integers

Use Case: When target expects primitive types like integers or floats
Implementation:
cpp
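#include <stdint.h>
#include <string.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Ensure exactly two 4-byte numbers
    if (size != 2 * sizeof(uint32_t)) {
        return 0;
    }

    // Split input into two integers; memcpy avoids the unaligned,
    // aliasing-unsafe reads that a direct pointer cast can cause
    uint32_t numerator, denominator;
    memcpy(&numerator, data, sizeof(numerator));
    memcpy(&denominator, data + sizeof(numerator), sizeof(denominator));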

    divide(numerator, denominator);
    return 0;
}
Rust equivalent:
rust
fuzz_target!(|data: &[u8]| {
    if data.len() != 2 * std::mem::size_of::<i32>() {
        return;
    }

    let numerator = i32::from_ne_bytes([data[0], data[1], data[2], data[3]]);
    let denominator = i32::from_ne_bytes([data[4], data[5], data[6], data[7]]);

    divide(numerator, denominator);
});
Why it works: Any 8-byte input is valid. The fuzzer learns that inputs must be exactly 8 bytes, and every bit flip produces a new, potentially interesting input.

Pattern: FuzzedDataProvider for Complex Inputs

Use Case: When target requires multiple strings, integers, or variable-length data
Implementation:
cpp
#include "FuzzedDataProvider.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    FuzzedDataProvider fuzzed_data(data, size);

    // Extract different types of data
    size_t allocation_size = fuzzed_data.ConsumeIntegral<size_t>();

    // Consume variable-length strings with terminator
    std::vector<char> str1 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);
    std::vector<char> str2 = fuzzed_data.ConsumeBytesWithTerminator<char>(32, 0xFF);

    char* result = concat(&str1[0], str1.size(), &str2[0], str2.size(), allocation_size);
    if (result != NULL) {
        free(result);
    }

    return 0;
}
Why it helps: FuzzedDataProvider handles the complexity of extracting structured data from a byte stream. It's particularly useful for APIs that need multiple parameters of different types.
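To show the idea behind such a helper, here is a deliberately simplified reader that hands out typed values from a byte buffer. ByteReader and its methods are hypothetical and are not the FuzzedDataProvider API; the real class offers more types and different consumption rules.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical, simplified stand-in for a data provider.
class ByteReader {
public:
    ByteReader(const uint8_t *data, size_t size) : data_(data), size_(size) {}

    // Reads a uint32_t in native byte order, zero-padding if fewer than
    // 4 bytes remain, so the call never reads out of bounds.
    uint32_t ReadU32() {
        uint32_t value = 0;
        const size_t n = std::min(sizeof(value), size_);
        if (n > 0) {
            std::memcpy(&value, data_, n);
        }
        Advance(n);
        return value;
    }

    // Reads up to max_len bytes as a string (may be shorter or empty).
    std::string ReadString(size_t max_len) {
        const size_t n = std::min(max_len, size_);
        if (n == 0) {
            return std::string();
        }
        std::string out(reinterpret_cast<const char *>(data_), n);
        Advance(n);
        return out;
    }

    size_t remaining() const { return size_; }

private:
    void Advance(size_t n) {
        data_ += n;
        size_ -= n;
    }

    const uint8_t *data_;
    size_t size_;
};
```

The design point this illustrates is that every read is bounded by the bytes remaining, so the harness never has to repeat size checks before each extraction.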

Pattern: Interleaved Fuzzing

Use Case: When multiple related operations should be tested in a single harness
Implementation:
cpp
#include <stdint.h>
#include <stdio.h>
#include <string.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 1 + 2 * sizeof(int32_t)) {
        return 0;
    }

    // First byte selects operation
    uint8_t mode = data[0];

    // Next bytes are operands
    int32_t numbers[2];
    memcpy(numbers, data + 1, 2 * sizeof(int32_t));

    int32_t result = 0;
    switch (mode % 4) {
        case 0:
            result = add(numbers[0], numbers[1]);
            break;
        case 1:
            result = subtract(numbers[0], numbers[1]);
            break;
        case 2:
            result = multiply(numbers[0], numbers[1]);
            break;
        case 3:
            result = divide(numbers[0], numbers[1]);
            break;
    }

    // Prevent compiler from optimizing away the calls
    printf("%d", result);
    return 0;
}
Advantages:
  • Faster to write one harness than multiple individual harnesses
  • Single shared corpus means interesting inputs for one operation may be interesting for others
  • Can discover bugs in interactions between operations
When to use:
  • Operations share similar input types
  • Operations are logically related (e.g., arithmetic operations, CRUD operations)
  • Single corpus makes sense across all operations
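The arithmetic example runs one operation per input. To look for bugs in interactions between operations, a harness can instead decode a whole sequence of operations from the input. A minimal sketch, assuming a hypothetical BoundedStack as the SUT and a hypothetical RunOps helper:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SUT: a stack that must never hold more than kCapacity items.
class BoundedStack {
public:
    static constexpr size_t kCapacity = 8;

    bool Push(uint8_t v) {
        if (items_.size() >= kCapacity) return false;
        items_.push_back(v);
        return true;
    }
    bool Pop() {
        if (items_.empty()) return false;
        items_.pop_back();
        return true;
    }
    void Clear() { items_.clear(); }
    size_t size() const { return items_.size(); }

private:
    std::vector<uint8_t> items_;
};

// Decodes (opcode, argument) byte pairs and applies them in order.
// Returns the final stack size so the result can be inspected.
static size_t RunOps(const uint8_t *data, size_t size) {
    BoundedStack stack;  // Fresh object per input keeps runs independent
    for (size_t i = 0; i + 1 < size; i += 2) {
        const uint8_t op = data[i];
        const uint8_t arg = data[i + 1];
        switch (op % 3) {
            case 0: stack.Push(arg); break;
            case 1: stack.Pop(); break;
            case 2: stack.Clear(); break;
        }
        // The invariant check acts as the bug oracle for any sequence.
        assert(stack.size() <= BoundedStack::kCapacity);
    }
    return stack.size();
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)RunOps(data, size);
    return 0;
}
```

Because the stack is created inside the call, each input is still self-contained and replayable even though it exercises many operations.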

Pattern: Structure-Aware Fuzzing with Arbitrary (Rust)

Use Case: When fuzzing Rust code that uses custom structs
Implementation:
rust
use arbitrary::Arbitrary;
use std::process;

#[derive(Debug, Arbitrary)]
pub struct Name {
    data: String
}

impl Name {
    pub fn check_buf(&self) {
        let data = self.data.as_bytes();
        if data.len() > 0 && data[0] == b'a' {
            if data.len() > 1 && data[1] == b'b' {
                if data.len() > 2 && data[2] == b'c' {
                    process::abort();
                }
            }
        }
    }
}
Harness with arbitrary:
rust
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: your_project::Name| {
    data.check_buf();
});
Add to Cargo.toml:
toml
[dependencies]
arbitrary = { version = "1", features = ["derive"] }
Why it helps: The arbitrary crate automatically handles deserialization of raw bytes into your Rust structs, reducing boilerplate and ensuring valid struct construction.
Limitation: The arbitrary crate doesn't offer reverse serialization, so you can't manually construct byte arrays that map to specific structs. This works best when starting from an empty corpus (fine for libFuzzer, problematic for AFL++).

Advanced Usage

Tips and Tricks

| Tip | Why It Helps |
| --- | --- |
| Start with parsers | High bug density, clear entry points, easy to harness |
| Mock I/O operations | Prevents hangs from blocking I/O, enables determinism |
| Use FuzzedDataProvider | Simplifies extraction of structured data from raw bytes |
| Reset global state | Ensures each iteration is independent and reproducible |
| Free resources in harness | Prevents memory exhaustion during long campaigns |
| Avoid logging in harness | Logging is slow; fuzzing needs 100s-1000s exec/sec |
| Test harness manually first | Run harness with known inputs before starting campaign |
| Check coverage early | Ensure harness reaches expected code paths |

Structure-Aware Fuzzing with Protocol Buffers

For highly structured input formats, consider using Protocol Buffers as an intermediate format with custom mutators:
cpp
// Define your input format in .proto file
// Use libprotobuf-mutator to generate valid mutations
// This ensures fuzzer mutates message contents, not the protobuf encoding itself
This approach takes more setup but keeps the fuzzer from wasting time on unparseable inputs. See the structure-aware fuzzing documentation for details.

Handling Non-Determinism

Problem: Random values or timing dependencies cause non-reproducible crashes.
Solutions:
  • Seed rand(), or whatever PRNG replaces it, from fuzzer input so the random sequence is deterministic:
    cpp
    uint32_t seed = fuzzed_data.ConsumeIntegral<uint32_t>();
    srand(seed);
  • Mock system calls that return time, PIDs, or random data
  • Avoid reading from /dev/random or /dev/urandom
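As a sketch of the first point, the harness below seeds a std::mt19937 engine from the first four input bytes, so a saved input replays the same random sequence. shuffle_score and RunOnce are hypothetical names, and the sketch assumes the SUT can accept the engine as a parameter.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Hypothetical SUT that needs randomness. Taking the engine as a parameter
// lets the harness control the seed.
static uint32_t shuffle_score(std::mt19937 &rng, const uint8_t *data,
                              size_t size) {
    uint32_t score = 0;
    for (size_t i = 0; i < size; ++i) {
        score += static_cast<uint32_t>(data[i] ^ static_cast<uint8_t>(rng()));
    }
    return score;
}

// Runs the SUT once with a seed taken from the input itself.
// Returns 0 for inputs too short to contain a seed.
static uint32_t RunOnce(const uint8_t *data, size_t size) {
    if (size < sizeof(uint32_t)) {
        return 0;
    }
    uint32_t seed = 0;
    std::memcpy(&seed, data, sizeof(seed));
    std::mt19937 rng(seed);  // No std::random_device, no time-based seed
    return shuffle_score(rng, data + sizeof(seed), size - sizeof(seed));
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)RunOnce(data, size);
    return 0;
}
```

Since the seed is part of the input, the fuzzer can also mutate it, which turns the random choices themselves into something the campaign explores.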

Resetting Global State

If your SUT uses global state (singletons, static variables), reset it between iterations:
cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Reset global state before each iteration
    global_reset();

    target_function(data, size);

    // Clean up resources
    global_cleanup();
    return 0;
}
Rationale: Global state can cause crashes after N iterations rather than on a specific input, making bugs non-reproducible.
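The effect is easy to see in a toy model. In the sketch below, the hypothetical open_handle function fails once a global counter reaches a limit, so the same input succeeds or fails depending on how many iterations ran before it. global_reset mirrors the name used above; everything else is illustrative.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical SUT state: a counter the library never clears on its own.
static int g_open_handles = 0;

// Hypothetical SUT call: fails once three handles are already open.
static bool open_handle() {
    if (g_open_handles >= 3) {
        return false;
    }
    ++g_open_handles;
    return true;
}

static void global_reset() { g_open_handles = 0; }

// Runs one iteration and reports whether the target succeeded.
// reset_first selects between the fragile and the reproducible variant.
static bool RunOnce(const uint8_t *data, size_t size, bool reset_first) {
    (void)data;
    (void)size;
    if (reset_first) {
        global_reset();
    }
    return open_handle();
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)RunOnce(data, size, /*reset_first=*/true);
    return 0;
}
```

Without the reset, the fourth call fails regardless of its input, which is exactly the kind of failure a saved crash file cannot reproduce.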

Practical Harness Rules

Follow these rules to ensure effective fuzzing harnesses:
| Rule | Rationale |
| --- | --- |
| Handle all input sizes | Fuzzer generates empty, tiny, and huge inputs; the harness must handle them gracefully |
| Never call exit() | Calling exit() stops the fuzzer process. Use abort() in SUT if needed |
| Join all threads | Each iteration must run to completion before next iteration starts |
| Be fast | Aim for 100s-1000s executions/sec. Avoid logging, high complexity, excess memory |
| Maintain determinism | Same input must always produce same behavior for reproducibility |
| Avoid global state | Global state reduces reproducibility; reset between iterations if unavoidable |
| Use narrow targets | Don't fuzz PNG and TCP in same harness; different formats need separate targets |
| Free resources | Prevent memory leaks that cause resource exhaustion during long campaigns |
Note: These guidelines apply not just to harness code, but to the entire SUT. If the SUT violates these rules, consider patching it (see the fuzzing obstacles technique).

Anti-Patterns

| Anti-Pattern | Problem | Correct Approach |
| --- | --- | --- |
| Global state without reset | Non-deterministic crashes | Reset all globals at start of harness |
| Blocking I/O or network calls | Hangs fuzzer, wastes time | Mock I/O, use in-memory buffers |
| Memory leaks in harness | Resource exhaustion kills campaign | Free all allocations before returning |
| Calling exit() in SUT | Stops entire fuzzing process | Use abort() or return error codes |
| Heavy logging in harness | Reduces exec/sec by orders of magnitude | Disable logging during fuzzing |
| Too many operations per iteration | Slows down fuzzer | Keep iterations fast and focused |
| Mixing unrelated input formats | Corpus entries not useful across formats | Separate harnesses for different formats |
| Not validating input size | Harness crashes on edge cases | Check size before accessing data |
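For the blocking I/O row, one common approach is to have the SUT read from a stream abstraction and let the harness supply an in-memory stream. A minimal sketch, assuming a hypothetical count_nonempty_lines function that accepts std::istream&:

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical SUT: counts non-empty lines from any input stream. Because
// it takes std::istream&, the harness can pass memory instead of a file.
static size_t count_nonempty_lines(std::istream &in) {
    size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            ++count;
        }
    }
    return count;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Wrap the fuzzer bytes in an in-memory stream: no disk or socket access.
    const std::string text =
        size ? std::string(reinterpret_cast<const char *>(data), size)
             : std::string();
    std::istringstream in(text);
    (void)count_nonempty_lines(in);
    return 0;
}
```

If the real SUT only accepts a file path or socket, this usually means adding a thin stream-based entry point to the SUT for fuzzing builds.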

Tool-Specific Guidance

libFuzzer

Harness signature:
cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Your code here
    return 0;  // Values other than 0 and -1 are reserved for future use
}
Compilation:
bash
clang++ -fsanitize=fuzzer,address -g harness.cc -o fuzz_target
Integration tips:
  • Use FuzzedDataProvider.h for structured input extraction
  • Compile with -fsanitize=fuzzer to link the fuzzing runtime
  • Add sanitizers (-fsanitize=address,undefined) to detect more bugs
  • Use -g for better stack traces when crashes occur
  • libFuzzer can start with an empty corpus; no seed inputs are required
Running:
bash
./fuzz_target corpus_dir/

AFL++

AFL++ supports multiple harness styles. For best performance, use persistent mode:
Persistent mode harness:
cpp
#include <unistd.h>

#define MAX_SIZE 4096  // Assumed input size limit; adjust for your target

int main(int argc, char **argv) {
    #ifdef __AFL_HAVE_MANUAL_CONTROL
        __AFL_INIT();
    #endif

    unsigned char buf[MAX_SIZE];

    while (__AFL_LOOP(10000)) {
        // Read input from stdin
        ssize_t len = read(0, buf, sizeof(buf));
        if (len <= 0) break;

        // Call target function
        target_function(buf, len);
    }

    return 0;
}
Compilation:
bash
afl-clang-fast++ -g harness.cc -o fuzz_target
Integration tips:
  • Use persistent mode (__AFL_LOOP) for 10-100x speedup
  • Consider deferred initialization (__AFL_INIT()) to skip setup overhead
  • AFL++ requires at least one seed input in the corpus directory
  • Use AFL_USE_ASAN=1 or AFL_USE_UBSAN=1 for sanitizer builds
Running:
bash
afl-fuzz -i seeds/ -o findings/ -- ./fuzz_target

cargo-fuzz (Rust)

Harness signature:
rust
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Your code here
});
With structured input (arbitrary crate):
rust
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: YourStruct| {
    data.check();
});
Creating harness:
bash
cargo fuzz init
cargo fuzz add my_target
Integration tips:
  • Use the arbitrary crate for automatic struct deserialization
  • cargo-fuzz wraps libFuzzer, so all libFuzzer features work
  • Compile with sanitizers automatically via cargo-fuzz
  • Harnesses go in the fuzz/fuzz_targets/ directory
Running:
bash
cargo +nightly fuzz run my_target

go-fuzz

Harness signature:
go
// +build gofuzz

package mypackage

func Fuzz(data []byte) int {
    // Call target function
    target(data)

    // Return codes:
    // -1 if the input must not be added to the corpus, even if it gives new coverage
    //  0 for the default behavior
    //  1 if the fuzzer should prioritize this input (e.g., it parsed successfully)
    return 0
}
Building:
bash
go-fuzz-build
Integration tips:
  • Return 1 to raise the priority of an input, for example one that parsed successfully (optional; coverage is tracked automatically)
  • Return -1 to keep an input out of the corpus even if it adds coverage
  • go-fuzz handles persistence automatically
Running:
bash
go-fuzz -bin=./mypackage-fuzz.zip -workdir=fuzz

Troubleshooting

| Issue | Cause | Solution |
| --- | --- | --- |
| Low executions/sec | Harness is too slow (logging, I/O, complexity) | Profile harness, remove bottlenecks, mock I/O |
| No crashes found | Coverage not reaching buggy code | Check coverage, improve harness to reach more paths |
| Non-reproducible crashes | Non-determinism or global state | Remove randomness, reset globals between iterations |
| Fuzzer exits immediately | Harness calls exit() | Replace exit() with abort() or return error |
| Out of memory errors | Memory leaks in harness or SUT | Free allocations, use leak sanitizer to find leaks |
| Crashes on empty input | Harness doesn't validate size | Add if (size < MIN_SIZE) return 0; |
| Corpus not growing | Inputs too constrained or format too strict | Use FuzzedDataProvider or structure-aware fuzzing |

Related Skills

Tools That Use This Technique

| Skill | How It Applies |
| --- | --- |
| libfuzzer | Uses LLVMFuzzerTestOneInput harness signature with FuzzedDataProvider |
| aflpp | Supports persistent mode harnesses with __AFL_LOOP for performance |
| cargo-fuzz | Uses Rust-specific fuzz_target! macro with arbitrary crate integration |
| atheris | Python harness takes bytes, calls Python functions |
| ossfuzz | Requires harnesses in specific directory structure for cloud fuzzing |

Related Techniques

| Skill | Relationship |
| --- | --- |
| coverage-analysis | Measure harness effectiveness: are you reaching target code? |
| address-sanitizer | Detects bugs found by harness (buffer overflows, use-after-free) |
| fuzzing-dictionary | Provide tokens to help fuzzer pass format checks in harness |
| fuzzing-obstacles | Patch SUT when it violates harness rules (exit, non-determinism) |

Resources

Key External Resources

  • Split Inputs in libFuzzer - Google Fuzzing Docs: Explains techniques for handling multiple input parameters in a single fuzzing harness, including use of magic separators and FuzzedDataProvider.
  • Structure-Aware Fuzzing with Protocol Buffers: Advanced technique using protobuf as intermediate format with custom mutators to ensure fuzzer mutates message contents rather than format encoding.
  • libFuzzer Documentation: Official LLVM documentation covering harness requirements, best practices, and advanced features.
  • cargo-fuzz Book: Comprehensive guide to writing Rust fuzzing harnesses with cargo-fuzz and the arbitrary crate.
