pentest-whitebox-code-review

Pentest Whitebox Code Review

Purpose

Perform a systematic white-box source code security audit using Shannon's backward taint analysis methodology. The audit traces from dangerous sinks back to user-controlled sources, classifies injection contexts by slot type, verifies XSS render contexts, and produces a prioritized exploitation queue for downstream proof-driven exploitation.

Prerequisites

Authorization Requirements

Written authorization with explicit scope for source code review
Source code access — full repository with version control history
Architecture documentation if available (data flow diagrams, API specs)
Deployment configuration access (environment variables, secrets management)

Environment Setup

semgrep with custom rules for taint analysis
CodeQL database built for target language
ripgrep for fast pattern searching
jadx for Android APK decompilation (if applicable)
Source map extraction tools for minified JavaScript
AST parsing tools for target language (tree-sitter, babel, etc.)

Core Workflow

Phase 1: Discovery

Architecture Mapping: Identify application layers (routing, controllers, services, data access, templates). Map data flow from HTTP entry points through business logic to database/file/external sinks.
Entry Point Enumeration: Catalog all user-controlled input sources — HTTP parameters, headers, cookies, file uploads, WebSocket messages, environment variables, database reads of user-stored data.
Security Pattern Inventory: Identify existing security controls — input validation functions, output encoding helpers, parameterized query patterns, CSRF protections, authentication middleware, rate limiters.
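
The entry point enumeration step can be approximated with a simple pattern catalog before heavier tooling is brought in. The following is a minimal sketch, assuming a Flask-style Python codebase; the regular expressions, the category names, and the sample file `app/views.py` are all illustrative assumptions, not part of the methodology itself.

```python
import re
from collections import defaultdict

# Hypothetical patterns for a Flask-style codebase; adjust per framework.
SOURCE_PATTERNS = {
    "http_param": re.compile(r"request\.(args|form|values|json)\b"),
    "header": re.compile(r"request\.headers\b"),
    "cookie": re.compile(r"request\.cookies\b"),
    "file_upload": re.compile(r"request\.files\b"),
    "env_var": re.compile(r"os\.environ\b|os\.getenv\("),
}


def enumerate_sources(path, text):
    """Return {category: [(path, line_no, line)]} for every matching line."""
    hits = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for category, pattern in SOURCE_PATTERNS.items():
            if pattern.search(line):
                hits[category].append((path, line_no, line.strip()))
    return dict(hits)


# Illustrative input standing in for a real source file.
SAMPLE = '''\
from flask import request
import os

def profile():
    user_id = request.args.get("id")
    token = request.cookies.get("session")
    debug = os.getenv("DEBUG")
    return lookup(user_id, token, debug)
'''

catalog = enumerate_sources("app/views.py", SAMPLE)
for category, entries in sorted(catalog.items()):
    for path, line_no, line in entries:
        print(f"{category:12} {path}:{line_no}  {line}")
```

A line-based scan like this misses indirect sources (for example, values re-read from the database), so it is best treated as a starting inventory to be confirmed by AST-level or taint-aware tools.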

Phase 2: Vulnerability Analysis (5 Parallel Tracks)

Injection Sink Hunting: Backward taint from SQL/command/file/template sinks to sources. Classify each sink by slot type: SQL-val, SQL-ident, CMD-argument, FILE-path, TEMPLATE-expr. Verify whether parameterization or sanitization breaks the taint chain.
XSS Render Context Analysis: Identify all dynamic output points in templates/responses. Classify each by render context: HTML_BODY, HTML_ATTRIBUTE, JAVASCRIPT_STRING, URL_PARAM, CSS_VALUE. Verify context-appropriate encoding is applied at each output point.
Authentication Checklist (9-point): Transport security, rate limiting, session management, token properties, session fixation resistance, password policy enforcement, login response uniformity, account recovery security, SSO/OAuth implementation.
Authorization Model Review (3-type): Horizontal (same-role cross-user access), vertical (privilege escalation across roles), context-workflow (state-dependent authorization bypass).
SSRF Sink Hunting: Identify all outbound request sinks. Classify by type: classic (direct URL), blind (no response), semi-blind (partial response), stored (deferred execution). Trace URL construction from user input to request dispatch.
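
The backward taint idea used by the injection and SSRF tracks can be illustrated on a toy data-flow graph: start at a sink, walk upstream, and report only the paths that reach a user-controlled source without passing through a sanitizer. This is a sketch under stated assumptions; the graph, node names, and the `escape_sql` sanitizer are hypothetical, and a real audit would derive the graph from semgrep or CodeQL output.

```python
from collections import deque

# Toy data-flow graph: each node maps to the nodes whose values flow INTO it.
# All node names are hypothetical.
FLOWS_FROM = {
    "db.execute(query)": ["query"],
    "query": ["name_clean", "sort_col"],
    "name_clean": ["escape_sql(name)"],
    "escape_sql(name)": ["name"],
    "name": ["request.args['name']"],
    "sort_col": ["request.args['sort']"],
}
SOURCES = {"request.args['name']", "request.args['sort']"}
SANITIZERS = {"escape_sql(name)"}


def backward_taint(sink):
    """Walk backward from a sink; return paths reaching a source unsanitized."""
    open_paths = []
    queue = deque([[sink]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in SANITIZERS:
            continue  # a sanitizer on this path breaks the taint chain
        if node in SOURCES:
            open_paths.append(path)
            continue
        for upstream in FLOWS_FROM.get(node, []):
            if upstream not in path:  # guard against cycles
                queue.append(path + [upstream])
    return open_paths


for path in backward_taint("db.execute(query)"):
    print(" <- ".join(path))
```

Here only the `sort` parameter is reported, because the `name` path is cut by the sanitizer. In practice a sanitizer should only count as breaking the chain when it matches the slot type of the sink; value escaping does nothing for an identifier slot.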

Phase 3: Synthesis

Confidence Scoring & Exploitation Queue: Score each finding by taint chain completeness, sanitization bypass likelihood, and impact severity. Generate exploitation queue JSON for downstream exploit validation.
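
One possible shape for the scoring and queue generation step is sketched below. The source does not specify a scoring formula or a JSON schema, so the multiplicative score, the field names, and the sample findings are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Finding:
    id: str
    slot_type: str
    chain_completeness: float  # 0..1: how much of the sink-to-source path is confirmed
    bypass_likelihood: float   # 0..1: chance existing sanitization can be bypassed
    impact: float              # 0..1: severity if exploited

    def score(self):
        # Assumed formula: the product penalizes any weak factor.
        return round(self.chain_completeness * self.bypass_likelihood * self.impact, 3)


def build_queue(findings):
    """Return the findings as a JSON array ordered by descending score."""
    ranked = sorted(findings, key=lambda f: f.score(), reverse=True)
    entries = [
        {**asdict(f), "score": f.score(), "priority": rank}
        for rank, f in enumerate(ranked, start=1)
    ]
    return json.dumps(entries, indent=2)


# Hypothetical findings for demonstration only.
findings = [
    Finding("F-001", "FILE-path", 0.5, 0.5, 1.0),
    Finding("F-002", "SQL-ident", 1.0, 0.9, 0.8),
]
print(build_queue(findings))
```

A weighted sum would work as well; the product is used here only because a finding with an unconfirmed taint chain should not rank highly on impact alone.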

Slot Type Classification

Slot Type	Sink Pattern	Sanitization Required
SQL-val	Query parameter value position	Parameterized query / prepared statement
SQL-ident	Table name, column name, ORDER BY	Allowlist validation
CMD-argument	Shell command argument	Argument escaping + allowlist
FILE-path	File read/write path construction	Path canonicalization + allowlist
TEMPLATE-expr	Template engine expression	Context-aware auto-escaping
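
To make the distinction between slot types concrete, the sketch below shows the required control for three of them using only the Python standard library. The table schema, the `ALLOWED_SORT_COLUMNS` allowlist, and the `/srv/uploads` base directory are hypothetical; `Path.is_relative_to` assumes Python 3.9 or later.

```python
import sqlite3
from pathlib import Path

ALLOWED_SORT_COLUMNS = {"name", "created_at"}  # hypothetical allowlist


def find_users(conn, name_pattern, sort_col):
    # SQL-ident slot: identifiers cannot be bound as parameters, so allowlist them.
    if sort_col not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"sort column not allowed: {sort_col!r}")
    # SQL-val slot: the value travels as a bound parameter, never as SQL text.
    sql = f"SELECT name FROM users WHERE name LIKE ? ORDER BY {sort_col}"
    return [row[0] for row in conn.execute(sql, (name_pattern,))]


def resolve_upload(base_dir, user_path):
    # FILE-path slot: canonicalize first, then check the result stays under base_dir.
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("bob", "2024-01-02"), ("alice", "2024-01-01")],
)

print(find_users(conn, "%", "name"))
print(find_users(conn, "' OR 1=1 --", "name"))  # treated as a literal pattern
```

During review, the thing to look for is a mismatch: a parameterized query protects the value slot but says nothing about an `ORDER BY` column concatenated next to it.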

Render Context Classification

Context	Output Location	Encoding Required
HTML_BODY	Between HTML tags	HTML entity encoding
HTML_ATTRIBUTE	Inside attribute values	Attribute encoding + quoting
JAVASCRIPT_STRING	Inside JS string literals	JavaScript Unicode escaping
URL_PARAM	URL query parameter values	URL percent encoding
CSS_VALUE	Inside CSS property values	CSS hex encoding
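
The table above maps each render context to a different encoder, which is why a single "escape" helper is rarely sufficient. The following sketch shows one plausible encoder per context using the Python standard library; the function names are hypothetical, and production code would normally rely on the template engine's own context-aware escaping instead.

```python
import html
import json
from urllib.parse import quote


def encode_html_body(value):
    # HTML_BODY: entity-encode &, <, >.
    return html.escape(value, quote=False)


def encode_html_attribute(value):
    # HTML_ATTRIBUTE: entity-encode including quotes, and always quote the value.
    return '"' + html.escape(value, quote=True) + '"'


def encode_js_string(value):
    # JAVASCRIPT_STRING: json.dumps yields a double-quoted literal with non-ASCII
    # as \uXXXX; also escape <, >, & so the literal cannot close a <script> block.
    literal = json.dumps(value)
    for ch in "<>&":
        literal = literal.replace(ch, f"\\u{ord(ch):04x}")
    return literal


def encode_url_param(value):
    # URL_PARAM: percent-encode everything outside the unreserved set.
    return quote(value, safe="")


def encode_css_value(value):
    # CSS_VALUE: hex-escape every character that is not ASCII alphanumeric;
    # the trailing space terminates the escape sequence.
    return "".join(
        ch if (ch.isascii() and ch.isalnum()) else f"\\{ord(ch):x} "
        for ch in value
    )


payload = "</script><img src=x onerror=alert(1)>"
print(encode_html_body(payload))
print(encode_html_attribute(payload))
print(encode_js_string(payload))
print(encode_url_param(payload))
print(encode_css_value(payload))
```

When auditing, an output point counts as verified only if the encoder applied matches the context it renders into; HTML entity encoding inside a JavaScript string literal, for instance, is a finding rather than a mitigation.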

Tool Categories

Category	Tools	Purpose
Taint Analysis	semgrep, CodeQL	Automated sink-to-source taint tracing
Pattern Search	ripgrep, ast-grep	Fast code pattern matching
Decompilation	jadx, sourcemap-extract	Recover source from compiled artifacts
AST Parsing	tree-sitter, babel	Language-aware code structure analysis
Dependency Audit	npm audit, pip-audit, snyk	Known vulnerability detection

References

`references/tools.md` - Tool function signatures and parameters
`references/workflows.md` - Taint analysis workflows and vulnerability patterns
