data-validation

Data Validation Skill

Provides comprehensive guidance for input validation, response serialization, and ID management.

Core Concepts

API Contract vs Internal Implementation:
┌─────────────────────────────────────────┐
│          Client (External)              │
│    snake_case IDs, user-friendly        │
└──────────────┬──────────────────────────┘
         HTTP API Boundary
┌──────────────▼──────────────────────────┐
│    Your Application (Internal)          │
│    camelCase IDs, database structure    │
└─────────────────────────────────────────┘
Responsibilities at each layer:
Layer      | Responsibility
───────────┼────────────────────────────────────────────────────
Controller | Validate input (schema), serialize output (schema)
Service    | Business logic with internal format
Repository | Database access with internal format
Database   | Storage with primary keys

ID Management Pattern

External API Contract

URL:
  GET /admin/users/:id     // id = uid (user_abc123)

Response:
{
  "id": "user_abc123",                  // UID mapped as id
  "email": "user@example.com",
  "name": "John Doe",
  // NO "uid" field
  // NO database "id" field (bigint primary key)
}

Internal Implementation

Database:
{
  id: 12345,                // bigint primary key (NEVER exposed)
  uid: "user_abc123",       // Branded UID
  email: "user@example.com",
  name: "John Doe",
}

Services and Repositories:
- Use uid: "user_abc123"
- Never expose id: 12345
- Query by uid: WHERE uid = 'user_abc123'

UID Format

Pattern: {PREFIX}_{RANDOM_ID}

Examples:
- user_abc123
- show_xyz789
- client_def456
- studio_ghi012

Key Rules:
- ✅ Prefix has no trailing underscore
- ✅ Use cryptographically secure random
- ✅ Make globally unique
- ❌ Never expose database ID pattern
- ❌ Never expose prefix pattern in error messages
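A minimal sketch of generating and checking UIDs in the {PREFIX}_{RANDOM_ID} shape. The function names, the 32-character alphabet, and the default length of 16 are assumptions made for illustration; the source does not specify them. It relies on Node's built-in `crypto.randomBytes` for secure randomness.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical alphabet of 32 characters. Because 256 is divisible by 32,
// mapping each random byte with a modulo does not bias the result.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz234567";

// Build a UID of the form {PREFIX}_{RANDOM_ID}.
function generateUid(prefix: string, length = 16): string {
  // The prefix itself carries no underscore; the separator is added here.
  if (!/^[a-z]+$/.test(prefix)) {
    throw new Error("prefix must be lowercase letters only");
  }
  const bytes = randomBytes(length);
  let randomId = "";
  for (let i = 0; i < bytes.length; i++) {
    randomId += ALPHABET[bytes[i] % ALPHABET.length];
  }
  return `${prefix}_${randomId}`;
}

// Check that a value looks like a UID for the expected prefix.
// This is a format check only; it says nothing about whether the entity exists.
function isUidFor(prefix: string, value: string): boolean {
  return new RegExp(`^${prefix}_[a-z0-9]+$`).test(value);
}

const uid = generateUid("user");
console.log(isUidFor("user", uid)); // true
console.log(isUidFor("user", "show_xyz789")); // false
```

The random part is long enough that collisions are unlikely, but a unique index on the uid column is still the safer guarantee of global uniqueness.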

🔴 Critical: Never Compare Database IDs with UIDs

Database IDs (BigInt) and UIDs (string) are fundamentally different types. Never compare them directly: doing so is a common source of bugs that fail silently.
❌ BAD: Comparing BigInt database ID with UID string
// entity.studioId is BigInt (e.g., 12345n)
// studioId is a UID string (e.g., "std_abc123")
if (entity.studioId?.toString() !== studioId) { ... }
// BigInt.toString() gives "12345", which NEVER equals "std_abc123",
// so this !== check is ALWAYS true and the guard rejects every entity.

✅ GOOD: Use query-based scoping
// Let the database handle the scoping in the query itself
const entity = await this.service.findOne({
  uid: entityUid,
  studio: { uid: studioId },  // Prisma resolves the relation
  deletedAt: null,
});
if (!entity) { throw HttpError.notFound(...); }

✅ ACCEPTABLE: Resolve UID to ID first, then compare BigInt-to-BigInt
const studio = await this.studioService.findByUid(studioUid);
if (entity.studioId !== studio.id) { ... }  // BigInt === BigInt
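The difference can be reproduced without a database. The sketch below uses hypothetical `Studio` and `Entity` shapes and plain in-memory objects (assumptions, not types from the source) to show why the string comparison never matches and the BigInt-to-BigInt comparison does.

```typescript
// Hypothetical shapes for illustration only.
interface Studio {
  id: bigint; // database primary key
  uid: string; // external UID
}
interface Entity {
  uid: string;
  studioId: bigint; // foreign key to Studio.id
}

const studio: Studio = { id: BigInt(12345), uid: "std_abc123" };
const entity: Entity = { uid: "show_xyz789", studioId: BigInt(12345) };

// BAD: compares "12345" with "std_abc123", so it reports a mismatch
// even though the entity does belong to the studio.
function belongsToStudioBroken(e: Entity, studioUid: string): boolean {
  return e.studioId.toString() === studioUid;
}

// ACCEPTABLE: resolve the UID to a record first, then compare bigint to bigint.
function belongsToStudio(e: Entity, resolved: Studio): boolean {
  return e.studioId === resolved.id;
}

console.log(belongsToStudioBroken(entity, studio.uid)); // false
console.log(belongsToStudio(entity, studio)); // true
```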

Input Validation Pattern

Validate at the API boundary, then transform the format:
Client Request (snake_case):
{
  "email": "user@example.com",
  "user_id": "user_123",
  "is_banned": false
}
Validation Layer:
- Check required fields
- Check format (email, length, etc.)
- Check references exist (user_id)
Transform Layer (snake_case → camelCase):
{
  email: "user@example.com",
  userId: "user_123",
  isBanned: false
}
Service Layer (processes camelCase)
Validation Schema Example:
Input schema:
- email: string, email format, required
- name: string, min 1 char, max 255 chars
- is_banned: boolean, optional
- user_id: string, matches UID format

Validation rules:
- Transform snake_case → camelCase
- Check format of IDs (startsWith prefix)
- Check required fields
- Check string lengths
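The schema above could be expressed with any schema library. To stay dependency-free, here is a hand-rolled sketch of the same checks followed by the snake_case to camelCase transform. The function name, the result type, the simplified email regex, and treating `name` as required are all assumptions for illustration.

```typescript
// Hypothetical internal shape produced after validation (camelCase).
interface CreateUserInput {
  email: string;
  name: string;
  userId: string;
  isBanned?: boolean;
}
interface FieldError {
  field: string;
  message: string;
}
type ValidationResult =
  | { ok: true; value: CreateUserInput }
  | { ok: false; errors: FieldError[] };

function validateCreateUser(body: Record<string, unknown>): ValidationResult {
  const errors: FieldError[] = [];
  const { email, name, user_id, is_banned } = body;

  // Deliberately simple email check; a real schema library is stricter.
  if (typeof email !== "string" || !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email)) {
    errors.push({ field: "email", message: "Invalid email format" });
  }
  // Assumption: name is required here; the source only gives its length bounds.
  if (typeof name !== "string" || name.length < 1 || name.length > 255) {
    errors.push({ field: "name", message: "Must be 1 to 255 characters" });
  }
  // Format check only; existence is verified later by the service.
  if (typeof user_id !== "string" || !/^user_[a-z0-9]+$/.test(user_id)) {
    errors.push({ field: "user_id", message: "Invalid identifier" });
  }
  if (is_banned !== undefined && typeof is_banned !== "boolean") {
    errors.push({ field: "is_banned", message: "Must be a boolean" });
  }
  if (errors.length > 0) return { ok: false, errors };

  // Transform snake_case input into the internal camelCase shape.
  const value: CreateUserInput = {
    email: email as string,
    name: name as string,
    userId: user_id as string,
  };
  if (typeof is_banned === "boolean") value.isBanned = is_banned;
  return { ok: true, value };
}
```

Error messages name the external field (`user_id`) and do not mention the expected prefix, in line with the UID rules earlier in this document.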

Action Validation Rule (Workflow Endpoints)

For workflow/action endpoints, validate action intent explicitly instead of relying on generic update schemas.
Required patterns:
  1. action enum validation (resolution_action, etc.),
  2. required reason/metadata fields for audited transitions,
  3. deterministic domain error payloads for policy violations (for example active-task blocking),
  4. consistent external field naming in contract (id, external_id, snake_case).
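As a rough sketch of the first three patterns, the parser below validates `resolution_action` against a fixed set and requires a reason. The action values ("approve", "reject"), the `reason` field name, and the error codes are hypothetical; the source names the `resolution_action` field but not its values or the error payload shape.

```typescript
// Hypothetical action values; the source does not list them.
const RESOLUTION_ACTIONS = ["approve", "reject"] as const;
type ResolutionAction = (typeof RESOLUTION_ACTIONS)[number];

// Internal command (camelCase) produced from the external request.
interface ResolveCommand {
  resolutionAction: ResolutionAction;
  reason: string;
}
// Deterministic error payload: a stable code plus a client-facing message.
interface DomainError {
  code: string;
  message: string;
}

function parseResolveRequest(
  body: Record<string, unknown>,
): ResolveCommand | DomainError {
  const action = body["resolution_action"];
  const reason = body["reason"];

  // 1. The action must be one of the known enum values.
  if (
    typeof action !== "string" ||
    !(RESOLUTION_ACTIONS as readonly string[]).includes(action)
  ) {
    return { code: "INVALID_ACTION", message: "resolution_action is not supported" };
  }
  // 2. Audited transitions require a non-empty reason.
  if (typeof reason !== "string" || reason.trim().length === 0) {
    return { code: "REASON_REQUIRED", message: "reason is required for this action" };
  }
  return { resolutionAction: action as ResolutionAction, reason: reason.trim() };
}
```

Returning a stable `code` lets clients branch on the failure without parsing the message text.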

Response Serialization Pattern

Transform internal format to API format:
Service returns (camelCase, internal):
{
  id: 12345n,                 // database ID (never in response!)
  uid: "user_abc123",         // UID (maps to "id")
  email: "user@example.com",
  isBanned: false,
  createdAt: Date,
  updatedAt: Date,
}
Serialization Layer:
- Map uid → id
- Hide database id field
- Transform camelCase → snake_case
- Transform dates to ISO format
Client receives (snake_case, friendly):
{
  "id": "user_abc123",        // UID as id
  "email": "user@example.com",
  "is_banned": false,
  "created_at": "2025-01-14T10:00:00Z",
  "updated_at": "2025-01-14T10:00:00Z"
  // NO "uid" field
  // NO database "id" field
}
Serialization Schema Example:
Output schema (from service):
- uid: string
- email: string
- isBanned: boolean
- createdAt: Date
- updatedAt: Date

Transform to DTO:
- uid → id
- isBanned → is_banned
- createdAt → created_at
- updatedAt → updated_at
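A minimal serializer for the mapping above might look like the following. The `UserRecord` and `UserDto` names are assumptions; the fields come from the example. Listing the output fields explicitly, rather than spreading the record, is what keeps the database `id` out of the response.

```typescript
// Internal shape returned by the service (camelCase).
interface UserRecord {
  id: bigint; // database primary key, never serialized
  uid: string;
  email: string;
  isBanned: boolean;
  createdAt: Date;
  updatedAt: Date;
}

// External shape sent to the client (snake_case).
interface UserDto {
  id: string;
  email: string;
  is_banned: boolean;
  created_at: string;
  updated_at: string;
}

function toUserDto(user: UserRecord): UserDto {
  return {
    id: user.uid, // uid is exposed as id
    email: user.email,
    is_banned: user.isBanned,
    // toISOString() includes milliseconds, e.g. "2025-01-14T10:00:00.000Z".
    created_at: user.createdAt.toISOString(),
    updated_at: user.updatedAt.toISOString(),
  };
}

const dto = toUserDto({
  id: BigInt(12345),
  uid: "user_abc123",
  email: "user@example.com",
  isBanned: false,
  createdAt: new Date("2025-01-14T10:00:00Z"),
  updatedAt: new Date("2025-01-14T10:00:00Z"),
});
console.log(JSON.stringify(dto));
```

Because the DTO contains no bigint, `JSON.stringify` works on it directly; calling it on the raw record would throw a TypeError.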

Nested Validation

Validate related entities by UID:
Input (user creating a show):
{
  "name": "Studio A Show",
  "client_id": "client_123",      // Client UID
  "studio_room_id": "room_456",   // StudioRoom UID
  "show_type_id": "type_bau",     // ShowType UID
}

Validation:
1. Check string format (looks like UID)
2. Service verifies entity exists
3. Service queries by UID
4. If not found, throw not-found error
Key Rules:
  • ✅ Validate UID format (starts with prefix)
  • ✅ Service verifies entity exists (query)
  • ✅ Return not-found error if missing
  • ❌ Never assume IDs exist without checking
  • ❌ Never expose missing ID in error details
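The four validation steps can be sketched for a single reference such as `client_id`. The in-memory `Map` stands in for a repository, and the error classes and function names are hypothetical. A real service would query the database asynchronously; the lookup is synchronous here only to keep the example short.

```typescript
class BadRequestError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "BadRequestError";
  }
}
class NotFoundError extends Error {
  constructor(resource: string) {
    // The message names the resource type only, never the requested ID.
    super(`${resource} not found`);
    this.name = "NotFoundError";
  }
}

interface Client {
  id: bigint;
  uid: string;
  name: string;
}

// In-memory stand-in for a repository queried by UID.
const clientsByUid = new Map<string, Client>([
  ["client_123", { id: BigInt(1), uid: "client_123", name: "Acme" }],
]);

function resolveClient(clientUid: string): Client {
  // Step 1: the string must look like a client UID.
  if (!/^client_[a-z0-9]+$/.test(clientUid)) {
    throw new BadRequestError("client_id is invalid");
  }
  // Steps 2 and 3: verify the entity exists by querying on the UID.
  const client = clientsByUid.get(clientUid);
  // Step 4: report a generic not-found error when it is missing.
  if (!client) {
    throw new NotFoundError("Client");
  }
  return client;
}
```

The same function would be repeated, or generalized, for `studio_room_id` and `show_type_id`.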

Type Mapping

Database → Service → API:
Database     | Service       | API Response
─────────────┼───────────────┼──────────────
bigint (id)  | bigint        | (not exposed)
string (uid) | string (uid)  | string (id)
boolean      | boolean       | boolean
timestamp    | Date          | ISO string
Transformation Examples:
Database id and uid → API id (the UID only; the bigint id is dropped):
  DB: { id: 12345, uid: "user_abc123" }
  Service: { uid: "user_abc123" }
  API: { "id": "user_abc123" }

Database TIMESTAMP → API ISO string:
  DB: { created_at: 2025-01-14 10:00:00 }
  Service: { createdAt: Date(2025-01-14 10:00:00) }
  API: { "created_at": "2025-01-14T10:00:00Z" }

Database boolean → API boolean:
  DB: { is_banned: true }
  Service: { isBanned: true }
  API: { "is_banned": true }

Pagination Validation

Validate pagination parameters:
Input:
{
  "page": "1",      // String from query param
  "limit": "10"
}

Validation:
- Convert to number
- Check >= 1
- Check <= max (e.g., 100)
- Provide defaults (page: 1, limit: 10)

Output:
{
  page: 1,
  limit: 10
}
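One way to implement these checks is shown below. The function names are assumptions. The sketch rejects a `limit` above the maximum rather than clamping it, which is a design choice the source leaves open.

```typescript
interface Pagination {
  page: number;
  limit: number;
}

const DEFAULT_PAGE = 1;
const DEFAULT_LIMIT = 10;
const MAX_LIMIT = 100;

// Parse one query-string value into an integer >= 1, or use the fallback
// when the parameter is absent.
function parsePositiveInt(
  raw: string | undefined,
  fallback: number,
  field: string,
): number {
  if (raw === undefined || raw === "") return fallback;
  // Digits only: rejects "abc", "-1", "1.5" and "1e3".
  if (!/^\d+$/.test(raw)) {
    throw new RangeError(`${field} must be a positive integer`);
  }
  const value = Number(raw);
  if (!Number.isSafeInteger(value) || value < 1) {
    throw new RangeError(`${field} must be a positive integer`);
  }
  return value;
}

function parsePagination(query: { page?: string; limit?: string }): Pagination {
  const page = parsePositiveInt(query.page, DEFAULT_PAGE, "page");
  const limit = parsePositiveInt(query.limit, DEFAULT_LIMIT, "limit");
  if (limit > MAX_LIMIT) {
    throw new RangeError(`limit must be at most ${MAX_LIMIT}`);
  }
  return { page, limit };
}

console.log(parsePagination({ page: "1", limit: "10" })); // { page: 1, limit: 10 }
console.log(parsePagination({})); // { page: 1, limit: 10 }
```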

Error Messages

Security-conscious error messages:
✅ GOOD: Context-specific, doesn't expose internals
{
  "statusCode": 404,
  "message": "User not found",
  "error": "NotFound"
}

✅ GOOD: Validation error with field info
{
  "statusCode": 400,
  "message": "Validation failed",
  "error": "BadRequest",
  "details": [
    { "field": "email", "message": "Invalid email format" }
  ]
}

❌ BAD: Exposes internal ID format
{
  "statusCode": 404,
  "message": "User uid_123 not found"  // Reveals UID pattern
}

❌ BAD: Exposes database structure
{
  "statusCode": 404,
  "message": "No row with id 12345 found"  // Reveals database ID
}
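Routing every error through a small set of builders makes it harder to leak an identifier by accident. The helper names below are hypothetical; the payload shape follows the GOOD examples above.

```typescript
interface FieldDetail {
  field: string;
  message: string;
}
interface ErrorBody {
  statusCode: number;
  message: string;
  error: string;
  details?: FieldDetail[];
}

// Takes a resource label such as "User", never the requested ID, so the
// message cannot reveal a UID or a database key.
function notFound(resourceLabel: string): ErrorBody {
  return {
    statusCode: 404,
    message: `${resourceLabel} not found`,
    error: "NotFound",
  };
}

// Field-level details use external (snake_case) field names.
function validationFailed(details: FieldDetail[]): ErrorBody {
  return {
    statusCode: 400,
    message: "Validation failed",
    error: "BadRequest",
    details,
  };
}

console.log(notFound("User").message); // "User not found"
```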

Best Practices Checklist

  • Validate all input at controller boundary
  • Use schema validation (Zod, Joi, Pydantic)
  • Transform snake_case → camelCase on input
  • Map uid → id in API responses
  • Hide database primary keys completely
  • Transform camelCase → snake_case on output
  • Check UID format (matches prefix pattern)
  • Validate referenced entities exist
  • Return not-found error if entity missing
  • Support pagination with validation
  • Convert timestamps to ISO format
  • Error messages don't expose internal structure
  • Error messages are actionable for clients
  • Serialization is consistent across endpoints
  • No sensitive data in responses

Related Skills

  • Backend Controller Pattern NestJS - Validation at HTTP boundary
  • Service Pattern NestJS - Business logic validation
  • Repository Pattern NestJS - Data access layer