test-data-strategy
Test Data Strategy
When to Use This Skill
Use this skill when:
- Test data strategy tasks - Planning comprehensive test data management, including synthetic data generation, data anonymization, versioning, and environment-specific strategies
- Planning or design - Need guidance on test data strategy approaches
- Best practices - Want to follow established patterns and standards
Overview
Effective test data management ensures tests have the right data at the right time while protecting sensitive information and maintaining data quality across environments.
Test Data Types
| Type | Source | Use Case | Privacy Risk |
|---|---|---|---|
| Synthetic | Generated | Unit/Integration tests | None |
| Subset | Production sample | Performance testing | Medium |
| Masked | Anonymized production | Realistic scenarios | Low |
| Production Clone | Full copy | Pre-prod validation | High |
| Baseline | Curated reference | Regression testing | Low |
Test Data Strategy Template
Test Data Strategy: [Project Name]
1. Data Requirements
By Test Level
| Level | Data Source | Volume | Refresh |
|---|---|---|---|
| Unit | Synthetic | Minimal | On-demand |
| Integration | Synthetic/Subset | Moderate | Per run |
| System | Masked production | Realistic | Weekly |
| Performance | Scaled synthetic | Production-like | Per release |
By Feature Area
| Feature | Critical Data | Volume Required | Sensitivity |
|---|---|---|---|
| Authentication | User accounts | 1,000 accounts | High |
| Payments | Transactions | 10,000 transactions | High |
| Reporting | Historical data | 1M records | Medium |
2. Data Generation Strategy
Synthetic Data Tools
- Unit Tests: AutoFixture, Bogus
- Integration: TestContainers + Seed
- Performance: Bulk generators
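For the performance tier, a bulk generator can be as simple as a seeded, lazily evaluated sequence. The sketch below is one possible shape rather than a prescribed tool: `BulkOrderGenerator`, `OrderRow`, the fixed reference date, and the value ranges are hypothetical choices made for illustration, and it uses only `System.Random` instead of Bogus.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type for illustration; substitute the real entity.
public sealed record OrderRow(int Id, int CustomerId, decimal Total, DateTime OrderDate);

public static class BulkOrderGenerator
{
    // Fixed reference date (an assumption) so the same seed always yields the same rows.
    private static readonly DateTime ReferenceDate =
        new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Streams rows lazily, so large volumes never sit in memory all at once.
    public static IEnumerable<OrderRow> Generate(int count, int customerCount, int seed)
    {
        var random = new Random(seed); // seeded, so runs are repeatable
        for (var id = 1; id <= count; id++)
        {
            // Customer ids fall in 1..customerCount.
            var customerId = random.Next(1, customerCount + 1);
            // Totals fall between 10.00 and 1000.00 after rounding.
            var total = Math.Round((decimal)(10 + random.NextDouble() * 990), 2);
            // Order dates fall within the 30 days before the reference date.
            var orderDate = ReferenceDate.AddMinutes(-random.Next(0, 30 * 24 * 60));
            yield return new OrderRow(id, customerId, total, orderDate);
        }
    }
}
```

Because the sequence is lazy, a caller can pipe it straight into a bulk loader or a CSV writer, and rerunning with the same seed reproduces the same data set for comparable performance runs.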
Generation Rules
| Entity | Key Fields | Generation Logic |
|---|---|---|
| User | Email | |
| Order | Amount | |
| Date | Timestamp | |
3. Data Anonymization
PII Fields
| Field | Original | Anonymization Method |
|---|---|---|
| Name | John Smith | Faker generated |
| Email | john@acme.com | |
| Phone | 555-123-4567 | |
| SSN | 123-45-6789 | |
| Address | 123 Main St | Faker address |
| DOB | 1985-03-15 | Shift by random days |
Anonymization Rules
- Preserve data relationships
- Maintain referential integrity
- Keep statistical properties
- Remove unique identifiers
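The first two rules pull against the last one: identifiers have to go, yet keys in related tables must still line up. One common way to reconcile them is deterministic pseudonymization, where a keyed hash maps each original key to the same token every time. The sketch below assumes string keys and a secret that is kept out of the test environments; `KeyPseudonymizer` is a hypothetical name, not part of any library mentioned here.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public sealed class KeyPseudonymizer
{
    private readonly byte[] _secret;

    public KeyPseudonymizer(string secret)
    {
        if (string.IsNullOrEmpty(secret))
            throw new ArgumentException("A secret is required.", nameof(secret));
        _secret = Encoding.UTF8.GetBytes(secret);
    }

    // Same input and same secret give the same token, so a customer key and the
    // foreign keys that reference it stay aligned after anonymization.
    public string Pseudonymize(string originalKey)
    {
        using var hmac = new HMACSHA256(_secret);
        var hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(originalKey));
        // Keep the first 16 bytes (32 hex characters) to shorten the token.
        return Convert.ToHexString(hash, 0, 16);
    }
}
```

Applying the same `KeyPseudonymizer` instance to `Customers.Id` and `Orders.CustomerId` keeps joins intact, while anyone without the secret cannot recompute the mapping from a known original key.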
4. Environment Strategy
Dev Environment
- Source: 100% synthetic
- Refresh: On-demand
- Volume: Minimal
QA Environment
- Source: Masked production subset
- Refresh: Weekly
- Volume: 10% of production
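A 10% subset is only usable if child rows follow their parents. One way to get there, sketched below under the assumption that parent keys are strings, is to hash each parent key into one of 100 stable buckets and keep the buckets below the target percentage; child rows are then filtered by the same rule. `SubsetSampler` and its method names are hypothetical.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class SubsetSampler
{
    // Maps a key to a stable bucket in 0..99. The hash is used for its even
    // spread and run-to-run stability, not for secrecy.
    public static int Bucket(string key)
    {
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(key));
        return (int)(BinaryPrimitives.ReadUInt32BigEndian(hash) % 100);
    }

    // A key belongs to the subset when its bucket is below the requested percentage.
    public static bool IsSelected(string key, int percent) => Bucket(key) < percent;

    // Keeps only child rows whose parent key was selected, so no orphans are loaded.
    public static IEnumerable<TChild> FilterChildren<TChild>(
        IEnumerable<TChild> children, Func<TChild, string> parentKey, int percent) =>
        children.Where(child => IsSelected(parentKey(child), percent));
}
```

Because selection depends only on the key, each weekly refresh picks the same customers, which keeps QA test cases that reference specific records stable. The subset still has to pass through masking before it is loaded.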
Staging Environment
- Source: Masked production clone
- Refresh: Before each release
- Volume: 100% of production
Performance Environment
- Source: Scaled synthetic
- Refresh: Before performance runs
- Volume: 150% of production
5. Data Versioning
Baseline Management
- Version baseline data sets
- Track data schema changes
- Maintain backward compatibility
- Document data dependencies
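One lightweight way to version a baseline data set is to store a small manifest next to it that records a schema version and a content fingerprint. The following is a sketch under the assumption that each row can be serialized to a single line of text; `BaselineManifest` and `BaselineFingerprint` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical manifest entry stored alongside the baseline files.
public sealed record BaselineManifest(string Name, int SchemaVersion, string ContentHash);

public static class BaselineFingerprint
{
    // Hashes rows in sorted order so the fingerprint ignores row ordering
    // and changes only when the data itself changes.
    public static string Compute(IEnumerable<string> rows)
    {
        var canonical = string.Join("\n", rows.OrderBy(r => r, StringComparer.Ordinal));
        return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)));
    }

    public static BaselineManifest Describe(string name, int schemaVersion, IEnumerable<string> rows) =>
        new BaselineManifest(name, schemaVersion, Compute(rows));

    // A baseline counts as unchanged when schema version and content hash both match.
    public static bool Matches(BaselineManifest expected, BaselineManifest actual) =>
        expected.SchemaVersion == actual.SchemaVersion &&
        expected.ContentHash == actual.ContentHash;
}
```

A regression suite could compare the manifest committed to source control with one computed from the loaded data, and fail fast when the two have drifted apart.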
Refresh Procedures
- Trigger: [Manual/Scheduled/Event]
- Source: [Production/Backup/Generator]
- Transform: [Anonymization steps]
- Load: [Target environment]
- Validate: [Verification checks]
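The five steps above can be wired together as a small pipeline. This is a minimal sketch, assuming the whole data set fits in memory and that each step is supplied as a delegate; `RefreshPipeline<T>` is a hypothetical type, and the trigger is simply whoever calls `Run()` (a scheduler, a CI job, or a person).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class RefreshPipeline<T>
{
    private readonly Func<IEnumerable<T>> _source;
    private readonly Func<T, T> _transform;
    private readonly Action<IReadOnlyList<T>> _load;
    private readonly Func<IReadOnlyList<T>, bool> _validate;

    public RefreshPipeline(
        Func<IEnumerable<T>> source,
        Func<T, T> transform,
        Action<IReadOnlyList<T>> load,
        Func<IReadOnlyList<T>, bool> validate)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
        _transform = transform ?? throw new ArgumentNullException(nameof(transform));
        _load = load ?? throw new ArgumentNullException(nameof(load));
        _validate = validate ?? throw new ArgumentNullException(nameof(validate));
    }

    // Runs Source -> Transform -> Load -> Validate and returns the number of rows loaded.
    public int Run()
    {
        // Transform (for example, anonymization) is applied to every row before loading.
        var rows = _source().Select(_transform).ToList();
        _load(rows);
        if (!_validate(rows))
            throw new InvalidOperationException("Refresh validation failed after load.");
        return rows.Count;
    }
}
```

Keeping each step as a delegate lets the same pipeline serve the QA refresh (production backup plus anonymization) and the dev refresh (generator with an identity transform) by swapping the arguments.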
6. Compliance Requirements
GDPR Compliance
- No real personal data of data subjects in the EU in non-prod
- Right to erasure supported
- Data minimization applied
- Consent tracking anonymized
HIPAA Compliance
- PHI fully de-identified
- Safe Harbor method applied
- Audit logs maintained
- Access controls verified
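Two of the Safe Harbor generalizations lend themselves to a short sketch: ZIP codes are reduced to their first three digits (or `000` when the prefix covers 20,000 or fewer people), and ages over 89 are collapsed into a single "90 or older" category. The code below covers only these two rules, not the full list of 18 identifier categories; `SafeHarborGeneralizer` is a hypothetical name, and the set of populous prefixes is assumed to be supplied by the caller from census data.

```csharp
using System;
using System.Collections.Generic;

public static class SafeHarborGeneralizer
{
    // Keeps the first three ZIP digits only when the caller has confirmed the
    // prefix covers more than 20,000 people; otherwise returns "000".
    public static string GeneralizeZip(string zip, ISet<string> populousPrefixes)
    {
        if (zip is null || zip.Length < 3) return "000";
        var prefix = zip.Substring(0, 3);
        return populousPrefixes.Contains(prefix) ? prefix : "000";
    }

    // Reduces a birth date to an age label; ages over 89 collapse into "90+".
    public static string GeneralizeAge(DateTime dateOfBirth, DateTime asOf)
    {
        var age = asOf.Year - dateOfBirth.Year;
        // Subtract one year when this year's birthday has not yet been reached.
        if (dateOfBirth.Date > asOf.Date.AddYears(-age)) age--;
        return age > 89 ? "90+" : age.ToString();
    }
}
```

A de-identification job would apply these alongside removal of the remaining identifier categories (names, contact details, record numbers, and so on), which this sketch does not attempt.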
Synthetic Data Generation (.NET)
Using Bogus
```csharp
using System;
using Bogus;

// Customer, Address, Order, and OrderStatus are the application's own types (not shown).
public class TestDataGenerator
{
    public static Faker<Customer> CustomerFaker => new Faker<Customer>()
        .RuleFor(c => c.Id, f => f.Random.Guid())
        .RuleFor(c => c.FirstName, f => f.Person.FirstName)
        .RuleFor(c => c.LastName, f => f.Person.LastName)
        // Build the email from the names generated above
        .RuleFor(c => c.Email, (f, c) => f.Internet.Email(c.FirstName, c.LastName))
        .RuleFor(c => c.Phone, f => f.Phone.PhoneNumber())
        // Adults only: a date in the 50 years before "18 years ago"
        .RuleFor(c => c.DateOfBirth, f => f.Date.Past(50, DateTime.Now.AddYears(-18)))
        .RuleFor(c => c.Address, f => new Address
        {
            Street = f.Address.StreetAddress(),
            City = f.Address.City(),
            State = f.Address.StateAbbr(),
            Zip = f.Address.ZipCode()
        });

    // Orders are tied to the given customer through CustomerId
    public static Faker<Order> OrderFaker(Customer customer) => new Faker<Order>()
        .RuleFor(o => o.Id, f => f.Random.Guid())
        .RuleFor(o => o.CustomerId, customer.Id)
        .RuleFor(o => o.OrderDate, f => f.Date.Recent(30))
        .RuleFor(o => o.Total, f => f.Finance.Amount(10, 1000))
        .RuleFor(o => o.Status, f => f.PickRandom<OrderStatus>());
}
```
Using AutoFixture
```csharp
using System.Collections.Generic;
using System.Linq;
using AutoFixture;
using AutoFixture.Xunit2;
using Xunit;

public class CustomerTests
{
    // _service and _calculator are the systems under test; their setup is not shown here.

    [Theory, AutoData]
    public void CreateCustomer_WithValidData_Succeeds(Customer customer)
    {
        // AutoFixture populates Customer with anonymous values;
        // it has no knowledge of domain validation rules
        var result = _service.Create(customer);
        Assert.True(result.IsSuccess);
    }

    [Theory, AutoData]
    public void ProcessOrder_CalculatesCorrectTotal(
        [Frozen] Customer customer,
        Order order,
        List<OrderItem> items)
    {
        // [Frozen] makes AutoFixture reuse this Customer wherever one is requested
        // Order and items are auto-generated
        order.Items = items;
        var total = _calculator.Calculate(order);
        Assert.Equal(items.Sum(i => i.Quantity * i.Price), total);
    }
}
```
Seeding Test Databases
```csharp
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class TestDatabaseSeeder
{
    public static async Task SeedAsync(AppDbContext context)
    {
        // Clear existing data, child table first so the foreign key is not violated
        await context.Database.ExecuteSqlRawAsync("DELETE FROM Orders");
        await context.Database.ExecuteSqlRawAsync("DELETE FROM Customers");

        // Generate test data: 100 customers with 5 orders each
        var customers = TestDataGenerator.CustomerFaker.Generate(100);
        await context.Customers.AddRangeAsync(customers);
        foreach (var customer in customers)
        {
            var orders = TestDataGenerator.OrderFaker(customer).Generate(5);
            await context.Orders.AddRangeAsync(orders);
        }

        await context.SaveChangesAsync();
    }
}
```
Data Anonymization Techniques
| Technique | Description | Use Case |
|---|---|---|
| Substitution | Replace with fake data | Names, emails |
| Shuffling | Rearrange within column | Salaries, dates |
| Masking | Partial hiding | SSN (xxx-xx-1234) |
| Generalization | Reduce precision | Age ranges, zip prefix |
| Nulling | Remove entirely | Unnecessary fields |
| Tokenization | Replace with token | Cross-reference needs |
| Hashing | One-way transform | Identifiers |
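Two techniques from the table, shuffling and generalization, are easy to get subtly wrong, so a short illustration may help. The following is a sketch using only the base class library; `ColumnAnonymizer` and its method names are hypothetical, and the ten-year default range width is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnAnonymizer
{
    // Shuffling: returns the same values in a new order (Fisher-Yates), so column
    // totals and distributions are unchanged but values no longer match their rows.
    public static List<T> Shuffle<T>(IEnumerable<T> values, Random random)
    {
        var result = values.ToList();
        for (var i = result.Count - 1; i > 0; i--)
        {
            var j = random.Next(i + 1); // picks an index in 0..i inclusive
            (result[i], result[j]) = (result[j], result[i]);
        }
        return result;
    }

    // Generalization: replaces an exact age with a fixed-width range such as "30-39".
    public static string AgeRange(int age, int width = 10)
    {
        if (age < 0) throw new ArgumentOutOfRangeException(nameof(age));
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        var lower = age / width * width;
        return $"{lower}-{lower + width - 1}";
    }
}
```

Shuffling suits columns such as salaries where aggregate reports must still add up, while generalization trades precision for privacy and is a better fit for quasi-identifiers like age or ZIP code.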
.NET Anonymization Example
```csharp
using System;
using System.Text.RegularExpressions;
using Bogus;

public class DataAnonymizer
{
    private readonly Faker _faker = new Faker();
    private readonly Random _random = new Random();

    public Customer Anonymize(Customer source)
    {
        return new Customer
        {
            Id = source.Id, // Preserve for relationships
            // Name.FirstName()/LastName() draw a new value on each call, whereas
            // _faker.Person is cached and would repeat the same person for every customer
            FirstName = _faker.Name.FirstName(),
            LastName = _faker.Name.LastName(),
            Email = $"{Guid.NewGuid():N}@test.example.com",
            Phone = MaskPhone(source.Phone),
            SSN = "xxx-xx-" + source.SSN.Substring(7, 4), // Assumes the ###-##-#### format
            DateOfBirth = ShiftDate(source.DateOfBirth),
            Address = new Address
            {
                Street = _faker.Address.StreetAddress(),
                City = source.Address.City, // Preserve geography
                State = source.Address.State,
                Zip = source.Address.Zip.Substring(0, 3) + "00"
            }
        };
    }

    private string MaskPhone(string phone)
    {
        // Keep area code and last four digits, mask the middle three.
        // \D* tolerates separators, so "555-123-4567" and "5551234567" both match.
        return Regex.Replace(phone, @"(\d{3})\D*\d{3}\D*(\d{4})", "$1-xxx-$2");
    }

    private DateTime ShiftDate(DateTime date)
    {
        // Shift by random days within ±30 (the upper bound of Next is exclusive)
        return date.AddDays(_random.Next(-30, 31));
    }
}
```
Test Data Patterns
Builder Pattern
```csharp
using System;

public class CustomerBuilder
{
    private readonly Customer _customer = new();

    public CustomerBuilder WithName(string first, string last)
    {
        _customer.FirstName = first;
        _customer.LastName = last;
        return this;
    }

    public CustomerBuilder WithPremiumStatus()
    {
        _customer.IsPremium = true;
        _customer.PremiumSince = DateTime.Now.AddYears(-1);
        return this;
    }

    public CustomerBuilder WithOrders(int count)
    {
        _customer.Orders = TestDataGenerator.OrderFaker(_customer).Generate(count);
        return this;
    }

    // Returns the instance being built; create a new builder for each customer
    public Customer Build() => _customer;
}

// Usage (for example, inside a test method)
var customer = new CustomerBuilder()
    .WithName("Test", "User")
    .WithPremiumStatus()
    .WithOrders(5)
    .Build();
```
Object Mother Pattern
```csharp
using System;

// Each method returns a fresh, named example customer for a common test scenario
public static class TestCustomers
{
    public static Customer ValidCustomer() => new()
    {
        Id = Guid.NewGuid(),
        FirstName = "Test",
        LastName = "User",
        Email = "test@example.com",
        Status = CustomerStatus.Active
    };

    public static Customer PremiumCustomer() => new()
    {
        Id = Guid.NewGuid(),
        FirstName = "Premium",
        LastName = "User",
        Email = "premium@example.com",
        IsPremium = true,
        Status = CustomerStatus.Active
    };

    // Only Id and Status are set; other fields keep their defaults
    public static Customer InactiveCustomer() => new()
    {
        Id = Guid.NewGuid(),
        Status = CustomerStatus.Inactive
    };
}
```
Integration Points
Inputs from:
- Data model → Test data structure
- Privacy requirements → Anonymization rules
- test-strategy-planning skill → Data volume needs
Outputs to:
- Test automation → Data fixtures
- performance-test-planning skill → Load data
- Environment provisioning → Seed scripts