test-data-strategy

Test Data Strategy

When to Use This Skill

Use this skill when:
  • Test Data Strategy tasks - Planning comprehensive test data management, including synthetic data generation, data anonymization, versioning, and environment-specific strategies
  • Planning or design - Need guidance on Test Data Strategy approaches
  • Best practices - Want to follow established patterns and standards

Overview

Effective test data management ensures tests have the right data at the right time while protecting sensitive information and maintaining data quality across environments.

Test Data Types

| Type | Source | Use Case | Privacy Risk |
|------|--------|----------|--------------|
| Synthetic | Generated | Unit/Integration tests | None |
| Subset | Production sample | Performance testing | Medium |
| Masked | Anonymized production | Realistic scenarios | Low |
| Production Clone | Full copy | Pre-prod validation | High |
| Baseline | Curated reference | Regression testing | Low |

Test Data Strategy Template

Test Data Strategy: [Project Name]

1. Data Requirements

By Test Level

| Level | Data Source | Volume | Refresh |
|-------|-------------|--------|---------|
| Unit | Synthetic | Minimal | On-demand |
| Integration | Synthetic/Subset | Moderate | Per run |
| System | Masked production | Realistic | Weekly |
| Performance | Scaled synthetic | Production-like | Per release |

By Feature Area

| Feature | Critical Data | Volume Required | Sensitivity |
|---------|---------------|-----------------|-------------|
| Authentication | User accounts | 1000 | High |
| Payments | Transactions | 10000 | High |
| Reporting | Historical data | 1M records | Medium |

2. Data Generation Strategy

Synthetic Data Tools

  • Unit Tests: AutoFixture, Bogus
  • Integration: TestContainers + Seed
  • Performance: Bulk generators

Generation Rules

| Entity | Key Fields | Generation Logic |
|--------|------------|------------------|
| User | Email | `{guid}@test.example.com` |
| Order | Amount | `Random(1, 10000)` |
| Date | Timestamp | `Random(now-1y, now)` |
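
The generation rules above translate fairly directly into code. The sketch below is one possible reading that uses only the .NET base class library. The `GenerationRules` class and its method names are hypothetical, and it assumes `Random(1, 10000)` means an inclusive whole-number range.

```csharp
using System;

// Hypothetical helper; the names here are illustrative, not part of the template.
public static class GenerationRules
{
    // User.Email: a fresh GUID keeps addresses unique, and the reserved
    // example.com domain cannot route to a real mailbox.
    public static string NewEmail() => $"{Guid.NewGuid():N}@test.example.com";

    // Order.Amount: Random(1, 10000), read here as an inclusive integer range.
    // Random.Next excludes its upper bound, hence 10001.
    public static decimal NewAmount(Random rng) => rng.Next(1, 10001);

    // Date.Timestamp: Random(now-1y, now), a uniformly chosen instant in the past year.
    public static DateTime NewTimestamp(Random rng, DateTime now)
    {
        var start = now.AddYears(-1);
        var spanSeconds = (now - start).TotalSeconds;
        return start.AddSeconds(rng.NextDouble() * spanSeconds);
    }
}
```

Passing in a seeded `Random` (for example `new Random(42)`) makes the amounts and timestamps reproducible between runs, which helps when a failing test has to be replayed.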

3. Data Anonymization

PII Fields

| Field | Original | Anonymization Method |
|-------|----------|----------------------|
| Name | John Smith | Faker generated |
| Email | john@acme.com | `hash@domain.test` |
| Phone | 555-123-4567 | `555-xxx-xxxx` |
| SSN | 123-45-6789 | `xxx-xx-xxxx` |
| Address | 123 Main St | Faker address |
| DOB | 1985-03-15 | Shift by random days |

Anonymization Rules

  • Preserve data relationships
  • Maintain referential integrity
  • Keep statistical properties
  • Remove unique identifiers
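
One way to keep relationships and referential integrity intact while still removing the real identifier is deterministic pseudonymization: the same input always maps to the same token, so joins across tables keep working. The sketch below assumes .NET 5 or later and a secret key that is stored outside the test environment. `Pseudonymizer` is a hypothetical name, and the `domain.test` suffix is borrowed from the PII table.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper illustrating keyed, deterministic pseudonymization.
public static class Pseudonymizer
{
    public static string PseudonymizeEmail(string email, byte[] secretKey)
    {
        // A keyed hash (HMAC) means the mapping cannot be rebuilt by hashing
        // guessed addresses unless the key is also known.
        using var hmac = new HMACSHA256(secretKey);

        // Normalize first so "John@Acme.com" and "john@acme.com" share a token.
        var normalized = email.Trim().ToLowerInvariant();
        var digest = hmac.ComputeHash(Encoding.UTF8.GetBytes(normalized));

        // The first 8 bytes (16 hex characters) keep the token short;
        // use more bytes if collisions are a concern for the data volume.
        var token = Convert.ToHexString(digest, 0, 8).ToLowerInvariant();
        return $"{token}@domain.test";
    }
}
```

Because the output is stable for a given key, this trades some privacy for joinability; rotating the key between refreshes breaks linkage across data set versions if that is preferred.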

4. Environment Strategy

Dev Environment

  • Source: 100% synthetic
  • Refresh: On-demand
  • Volume: Minimal

QA Environment

  • Source: Masked production subset
  • Refresh: Weekly
  • Volume: 10% of production

Staging Environment

  • Source: Masked production clone
  • Refresh: Before each release
  • Volume: 100% of production

Performance Environment

  • Source: Scaled synthetic
  • Refresh: Before performance runs
  • Volume: 150% of production
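
The percentage-based volumes above can be turned into row targets with simple arithmetic. This is a minimal sketch, assuming the production row count is known. `EnvironmentVolumes` and its environment keys are hypothetical names, and Dev is left out because "Minimal" is not a percentage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup of the volume percentages listed for each environment.
public static class EnvironmentVolumes
{
    private static readonly IReadOnlyDictionary<string, decimal> Factor =
        new Dictionary<string, decimal>
        {
            ["QA"] = 0.10m,          // 10% of production
            ["Staging"] = 1.00m,     // 100% of production
            ["Performance"] = 1.50m, // 150% of production
        };

    // Rounds up so that a small table never ends up with zero rows.
    public static long TargetRows(string environment, long productionRows) =>
        Factor.TryGetValue(environment, out var factor)
            ? (long)Math.Ceiling(productionRows * factor)
            : throw new ArgumentException(
                $"No volume factor defined for '{environment}'.", nameof(environment));
}
```

For a production table of 1,000,000 rows, this gives 100,000 rows for QA and 1,500,000 for the performance environment.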

5. Data Versioning

Baseline Management

  • Version baseline data sets
  • Track data schema changes
  • Maintain backward compatibility
  • Document data dependencies
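
Versioning a baseline is easier when each data set has a stable fingerprint that changes only when its content changes. The following sketch assumes the rows are already serialized to strings and that .NET 5 or later is available. `BaselineFingerprint` is a hypothetical helper, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: a content hash that can be stored next to a baseline version tag.
public static class BaselineFingerprint
{
    public static string Compute(IEnumerable<string> rows)
    {
        // Sort with an ordinal comparer so the result does not depend on
        // export order or on the machine's culture settings.
        var canonical = string.Join("\n", rows.OrderBy(r => r, StringComparer.Ordinal));
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

Comparing the stored fingerprint with a freshly computed one before a regression run is a cheap way to detect that a baseline has drifted without a version bump.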

Refresh Procedures

  1. Trigger: [Manual/Scheduled/Event]
  2. Source: [Production/Backup/Generator]
  3. Transform: [Anonymization steps]
  4. Load: [Target environment]
  5. Validate: [Verification checks]
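
The refresh steps can be wired together as a small pipeline. The sketch below is an assumption-laden outline rather than a prescribed design: `DataRefresh.Run` is a hypothetical name, each step is passed in as a delegate, and validation inspects the loaded rows in memory where a real refresh would more likely query the target environment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical orchestration of steps 2-5; step 1 (the trigger) is whatever calls Run.
public static class DataRefresh
{
    public static int Run<T>(
        Func<IEnumerable<T>> source,                  // 2. Source
        Func<T, T> transform,                         // 3. Transform (anonymization)
        Action<IReadOnlyList<T>> load,                // 4. Load
        Func<IReadOnlyList<T>, bool> validate)        // 5. Validate
    {
        // Materialize once so that load and validate see exactly the same rows.
        IReadOnlyList<T> rows = source().Select(transform).ToList();

        load(rows);

        if (!validate(rows))
            throw new InvalidOperationException("Refresh validation failed.");

        return rows.Count;
    }
}
```

Keeping the steps as delegates lets the same pipeline serve a scheduled production extract and an on-demand synthetic generator by swapping only the `source` argument.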

6. Compliance Requirements

GDPR Compliance

  • No real personal data of EU data subjects in non-prod
  • Right to erasure supported
  • Data minimization applied
  • Consent tracking anonymized

HIPAA Compliance

  • PHI fully de-identified
  • Safe Harbor method applied
  • Audit logs maintained
  • Access controls verified

Synthetic Data Generation (.NET)

Using Bogus

```csharp
using System;
using Bogus;

public class TestDataGenerator
{
    // Each access builds a fresh Faker, so rules are never shared between tests.
    public static Faker<Customer> CustomerFaker => new Faker<Customer>()
        .RuleFor(c => c.Id, f => f.Random.Guid())
        .RuleFor(c => c.FirstName, f => f.Person.FirstName)
        .RuleFor(c => c.LastName, f => f.Person.LastName)
        // Derive the email from the names already generated for this customer.
        .RuleFor(c => c.Email, (f, c) => f.Internet.Email(c.FirstName, c.LastName))
        .RuleFor(c => c.Phone, f => f.Phone.PhoneNumber())
        // A date within the 50 years before "18 years ago", so every customer is an adult.
        .RuleFor(c => c.DateOfBirth, f => f.Date.Past(50, DateTime.Now.AddYears(-18)))
        .RuleFor(c => c.Address, f => new Address
        {
            Street = f.Address.StreetAddress(),
            City = f.Address.City(),
            State = f.Address.StateAbbr(),
            Zip = f.Address.ZipCode()
        });

    // Orders are tied to an existing customer so foreign keys stay valid.
    public static Faker<Order> OrderFaker(Customer customer) => new Faker<Order>()
        .RuleFor(o => o.Id, f => f.Random.Guid())
        .RuleFor(o => o.CustomerId, customer.Id)
        .RuleFor(o => o.OrderDate, f => f.Date.Recent(30))
        .RuleFor(o => o.Total, f => f.Finance.Amount(10, 1000))
        .RuleFor(o => o.Status, f => f.PickRandom<OrderStatus>());
}
```

Using AutoFixture

```csharp
using System.Collections.Generic;
using System.Linq;
using AutoFixture;
using AutoFixture.Xunit2;
using Xunit;

public class CustomerTests
{
    // CustomerService and OrderCalculator are placeholder names for the system under test;
    // the original snippet used _service and _calculator without declaring them.
    private readonly CustomerService _service = new();
    private readonly OrderCalculator _calculator = new();

    [Theory, AutoData]
    public void CreateCustomer_WithValidData_Succeeds(Customer customer)
    {
        // AutoFixture populates Customer with anonymous values automatically
        var result = _service.Create(customer);
        Assert.True(result.IsSuccess);
    }

    [Theory, AutoData]
    public void ProcessOrder_CalculatesCorrectTotal(
        [Frozen] Customer customer,
        Order order,
        List<OrderItem> items)
    {
        // Frozen ensures the same customer instance is reused wherever a Customer is needed
        // Order and items are auto-generated
        order.Items = items;
        var total = _calculator.Calculate(order);
        Assert.Equal(items.Sum(i => i.Quantity * i.Price), total);
    }
}
```

Seeding Test Databases

```csharp
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class TestDatabaseSeeder
{
    public static async Task SeedAsync(AppDbContext context)
    {
        // Clear existing data; Orders go first so the foreign key to Customers is not violated
        await context.Database.ExecuteSqlRawAsync("DELETE FROM Orders");
        await context.Database.ExecuteSqlRawAsync("DELETE FROM Customers");

        // Generate test data
        var customers = TestDataGenerator.CustomerFaker.Generate(100);
        await context.Customers.AddRangeAsync(customers);

        // Five orders per customer, each linked through CustomerId
        foreach (var customer in customers)
        {
            var orders = TestDataGenerator.OrderFaker(customer).Generate(5);
            await context.Orders.AddRangeAsync(orders);
        }

        // A single SaveChanges call persists customers and orders together
        await context.SaveChangesAsync();
    }
}
```

Data Anonymization Techniques

| Technique | Description | Use Case |
|-----------|-------------|----------|
| Substitution | Replace with fake data | Names, emails |
| Shuffling | Rearrange within column | Salaries, dates |
| Masking | Partial hiding | SSN (xxx-xx-1234) |
| Generalization | Reduce precision | Age ranges, zip prefix |
| Nulling | Remove entirely | Unnecessary fields |
| Tokenization | Replace with token | Cross-reference needs |
| Hashing | One-way transform | Identifiers |
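
Shuffling and generalization are the two techniques in the table that are least obvious to implement. The sketch below shows one plausible version of each using only the base class library. `AnonymizationTechniques` is a hypothetical class, and the 10-year age bands and 3-digit zip prefix are illustrative choices rather than requirements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helpers for two of the techniques listed above.
public static class AnonymizationTechniques
{
    // Shuffling: a Fisher-Yates permutation of one column. The set of values
    // (and therefore its distribution) is unchanged, but each value is
    // detached from the row it came from. The input is not modified.
    public static List<T> Shuffle<T>(IEnumerable<T> column, Random rng)
    {
        var values = column.ToList();
        for (int i = values.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            (values[i], values[j]) = (values[j], values[i]);
        }
        return values;
    }

    // Generalization: replace an exact age with its 10-year band.
    public static string AgeBand(int age)
    {
        int low = age / 10 * 10;
        return $"{low}-{low + 9}";
    }

    // Generalization: keep only the first three digits of a zip code.
    public static string ZipPrefix(string zip) =>
        zip.Length >= 3 ? zip.Substring(0, 3) + "00" : "00000";
}
```

With these definitions, `AgeBand(37)` yields `"30-39"` and `ZipPrefix("90210")` yields `"90200"`.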

.NET Anonymization Example

```csharp
using System;
using System.Text.RegularExpressions;
using Bogus;

public class DataAnonymizer
{
    private readonly Faker _faker = new();
    private readonly Random _random = new();

    public Customer Anonymize(Customer source)
    {
        return new Customer
        {
            Id = source.Id, // Preserve for relationships
            // Name.FirstName()/LastName() draw a new value on every call;
            // Faker.Person is created once per Faker instance and would repeat.
            FirstName = _faker.Name.FirstName(),
            LastName = _faker.Name.LastName(),
            Email = $"{Guid.NewGuid():N}@test.example.com",
            Phone = MaskPhone(source.Phone),
            SSN = "xxx-xx-" + source.SSN.Substring(7, 4), // Assumes NNN-NN-NNNN; keeps the last four digits
            DateOfBirth = ShiftDate(source.DateOfBirth),
            Address = new Address
            {
                Street = _faker.Address.StreetAddress(),
                City = source.Address.City, // Preserve geography
                State = source.Address.State,
                Zip = source.Address.Zip.Substring(0, 3) + "00" // Generalize to the 3-digit prefix
            }
        };
    }

    private string MaskPhone(string phone)
    {
        // Keep area code, mask rest; \D* allows separators such as "-" between digit groups
        return Regex.Replace(phone, @"(\d{3})\D*\d{3}\D*\d{4}", "$1-xxx-xxxx");
    }

    private DateTime ShiftDate(DateTime date)
    {
        // Shift by random days within ±30 (the upper bound of Random.Next is exclusive)
        return date.AddDays(_random.Next(-30, 31));
    }
}
```

Test Data Patterns

Builder Pattern

```csharp
public class CustomerBuilder
{
    private readonly Customer _customer = new();

    public CustomerBuilder WithName(string first, string last)
    {
        _customer.FirstName = first;
        _customer.LastName = last;
        return this;
    }

    public CustomerBuilder WithPremiumStatus()
    {
        _customer.IsPremium = true;
        _customer.PremiumSince = DateTime.Now.AddYears(-1);
        return this;
    }

    public CustomerBuilder WithOrders(int count)
    {
        _customer.Orders = TestDataGenerator.OrderFaker(_customer).Generate(count);
        return this;
    }

    public Customer Build() => _customer;
}

// Usage
var customer = new CustomerBuilder()
    .WithName("Test", "User")
    .WithPremiumStatus()
    .WithOrders(5)
    .Build();
```

Object Mother Pattern

```csharp
public static class TestCustomers
{
    public static Customer ValidCustomer() => new()
    {
        Id = Guid.NewGuid(),
        FirstName = "Test",
        LastName = "User",
        Email = "test@example.com",
        Status = CustomerStatus.Active
    };

    public static Customer PremiumCustomer() => new()
    {
        Id = Guid.NewGuid(),
        FirstName = "Premium",
        LastName = "User",
        Email = "premium@example.com",
        IsPremium = true,
        Status = CustomerStatus.Active
    };

    public static Customer InactiveCustomer() => new()
    {
        Id = Guid.NewGuid(),
        Status = CustomerStatus.Inactive
    };
}
```

Integration Points

Inputs from:
  • Data model → Test data structure
  • Privacy requirements → Anonymization rules
  • test-strategy-planning skill → Data volume needs
Outputs to:
  • Test automation → Data fixtures
  • performance-test-planning skill → Load data
  • Environment provisioning → Seed scripts