pentest-whitebox-code-review

Pentest Whitebox Code Review

Purpose

Perform a systematic white-box source code security audit using Shannon's backward taint analysis methodology. The audit traces from dangerous sinks back to user-controlled sources, classifies injection contexts by slot type, verifies XSS render contexts, and produces a prioritized exploitation queue for downstream proof-driven exploitation.

Prerequisites

Authorization Requirements

  • Written authorization with explicit scope for source code review
  • Source code access — full repository with version control history
  • Architecture documentation if available (data flow diagrams, API specs)
  • Deployment configuration access (environment variables, secrets management)

Environment Setup

  • semgrep with custom rules for taint analysis
  • CodeQL database built for target language
  • ripgrep for fast pattern searching
  • jadx for Android APK decompilation (if applicable)
  • Source map extraction tools for minified JavaScript
  • AST parsing tools for target language (tree-sitter, babel, etc.)

Core Workflow

Phase 1: Discovery

  1. Architecture Mapping: Identify application layers (routing, controllers, services, data access, templates). Map data flow from HTTP entry points through business logic to database/file/external sinks.
  2. Entry Point Enumeration: Catalog all user-controlled input sources — HTTP parameters, headers, cookies, file uploads, WebSocket messages, environment variables, database reads of user-stored data.
  3. Security Pattern Inventory: Identify existing security controls — input validation functions, output encoding helpers, parameterized query patterns, CSRF protections, authentication middleware, rate limiters.
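
The entry point enumeration step can be sketched as a pattern scan over source files. This is a minimal illustration, not part of the methodology itself: the Flask-style accessor patterns, the `EntryPoint` record, and the sample file path are all assumptions chosen for the example, and a real audit would tailor the patterns to the target framework (or run ripgrep with equivalent expressions).

```python
import re
from dataclasses import dataclass

# Assumed source patterns for a Flask-style Python target; extend per framework.
SOURCE_PATTERNS = {
    "http_param": re.compile(r"request\.(args|form|values|json)\b"),
    "header": re.compile(r"request\.headers\b"),
    "cookie": re.compile(r"request\.cookies\b"),
    "file_upload": re.compile(r"request\.files\b"),
    "env_var": re.compile(r"os\.environ\b|os\.getenv\("),
}


@dataclass(frozen=True)
class EntryPoint:
    """One user-controlled input source found in the code (hypothetical record)."""
    path: str
    line: int
    kind: str
    snippet: str


def enumerate_entry_points(path: str, source: str) -> list[EntryPoint]:
    """Scan source text line by line and record every matching input source."""
    found = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for kind, pattern in SOURCE_PATTERNS.items():
            if pattern.search(text):
                found.append(EntryPoint(path, lineno, kind, text.strip()))
    return found


# Illustrative input; in practice this would be read from the repository.
SAMPLE = '''\
from flask import request
import os

def search():
    term = request.args.get("q")
    token = request.cookies.get("session")
    debug = os.getenv("DEBUG")
    return term
'''

entry_points = enumerate_entry_points("app/views.py", SAMPLE)
for ep in entry_points:
    print(f"{ep.path}:{ep.line} [{ep.kind}] {ep.snippet}")
```

A line-based regex scan is only a first pass: it misses aliased imports and indirect access, which is why the workflow pairs it with AST-level tools.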

Phase 2: Vulnerability Analysis (5 Parallel Tracks)

  1. Injection Sink Hunting: Trace taint backward from SQL/command/file/template sinks to sources. Classify each sink by slot type: SQL-val, SQL-ident, CMD-argument, FILE-path, TEMPLATE-expr. Verify whether parameterization or sanitization breaks the taint chain.
  2. XSS Render Context Analysis: Identify all dynamic output points in templates/responses. Classify each by render context: HTML_BODY, HTML_ATTRIBUTE, JAVASCRIPT_STRING, URL_PARAM, CSS_VALUE. Verify context-appropriate encoding is applied at each output point.
  3. Authentication Checklist (9-point): Transport security, rate limiting, session management, token properties, session fixation resistance, password policy enforcement, login response uniformity, account recovery security, SSO/OAuth implementation.
  4. Authorization Model Review (3-type): Horizontal (same-role cross-user access), vertical (privilege escalation across roles), context-workflow (state-dependent authorization bypass).
  5. SSRF Sink Hunting: Identify all outbound request sinks. Classify by type: classic (direct URL), blind (no response), semi-blind (partial response), stored (deferred execution). Trace URL construction from user input to request dispatch.
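
The backward direction of the taint trace can be illustrated with a small graph walk. The sketch below assumes the data-flow edges have already been extracted (by semgrep, CodeQL, or manual reading); the `DERIVED_FROM` table, the node labels, and the `backward_taint` helper are hypothetical and exist only to show the traversal from sink to source.

```python
from collections import deque

# Hypothetical data-flow facts: each node maps to the nodes it is derived from.
DERIVED_FROM = {
    "cursor.execute(query)": ["query"],
    "query": ["base_sql", "order_col"],
    "order_col": ["request.args['sort']"],
    "base_sql": [],  # constant string, not user-controlled
    "subprocess.run(cmd)": ["cmd"],
    "cmd": ["shlex.quote(filename)"],
    "shlex.quote(filename)": ["filename"],
    "filename": ["request.form['name']"],
}
SOURCES = {"request.args['sort']", "request.form['name']"}
# Assumed sanitizer set; a sanitizer only counts if it fits the sink's slot type.
SANITIZERS = {"shlex.quote(filename)"}


def backward_taint(sink: str) -> list[dict]:
    """Walk from a sink toward its origins and report each path that reaches a source."""
    results = []
    queue = deque([[sink]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in SOURCES:
            results.append({
                "sink": sink,
                "source": node,
                "path": path,
                "sanitized": any(step in SANITIZERS for step in path),
            })
            continue
        for parent in DERIVED_FROM.get(node, []):
            if parent not in path:  # guard against cyclic data flow
                queue.append(path + [parent])
    return results


for sink in ("cursor.execute(query)", "subprocess.run(cmd)"):
    for hit in backward_taint(sink):
        status = "sanitized" if hit["sanitized"] else "UNSANITIZED"
        print(f"{status}: {' <- '.join(hit['path'])}")
```

Starting at the sink keeps the search focused: branches that end in constants (such as `base_sql` here) are dropped without ever enumerating every source in the application.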

Phase 3: Synthesis

  1. Confidence Scoring & Exploitation Queue: Score each finding by taint chain completeness, sanitization bypass likelihood, and impact severity. Generate exploitation queue JSON for downstream exploit validation.
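
One way to combine the three scoring factors and emit the queue is sketched below. The source names the factors but not how they are weighted or what the JSON looks like, so the weights, the 0.0 to 1.0 scale, the `Finding` fields, and the output schema are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical weights; the methodology does not specify how factors are combined.
WEIGHTS = {"chain": 0.4, "bypass": 0.3, "impact": 0.3}


@dataclass
class Finding:
    """A candidate vulnerability with each factor scored on an assumed 0.0-1.0 scale."""
    id: str
    slot_type: str
    sink: str
    source: str
    chain_completeness: float
    bypass_likelihood: float
    impact: float

    def score(self) -> float:
        total = (
            WEIGHTS["chain"] * self.chain_completeness
            + WEIGHTS["bypass"] * self.bypass_likelihood
            + WEIGHTS["impact"] * self.impact
        )
        return round(total, 3)


def build_queue(findings: list[Finding]) -> list[dict]:
    """Rank findings by descending score and attach a 1-based priority."""
    ranked = sorted(findings, key=lambda f: f.score(), reverse=True)
    return [
        {"priority": i, "score": f.score(), **asdict(f)}
        for i, f in enumerate(ranked, start=1)
    ]


findings = [
    Finding("F-002", "CMD-argument", "subprocess.run", "request.form['name']", 1.0, 0.2, 1.0),
    Finding("F-001", "SQL-ident", "cursor.execute", "request.args['sort']", 1.0, 0.8, 0.9),
]
queue = build_queue(findings)
print(json.dumps(queue, indent=2))
```

Keeping the raw factor values in each queue entry lets the downstream exploitation stage see why a finding was ranked where it was, rather than only the final number.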

Slot Type Classification

| Slot Type | Sink Pattern | Sanitization Required |
| --- | --- | --- |
| SQL-val | Query parameter value position | Parameterized query / prepared statement |
| SQL-ident | Table name, column name, ORDER BY | Allowlist validation |
| CMD-argument | Shell command argument | Argument escaping + allowlist |
| FILE-path | File read/write path construction | Path canonicalization + allowlist |
| TEMPLATE-expr | Template engine expression | Context-aware auto-escaping |
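
The difference between the SQL-val and SQL-ident slots is easiest to see in one query that contains both. The following sketch uses Python's standard `sqlite3` module; the `users` table, the `find_users` helper, and the allowlist contents are hypothetical and chosen only for the example.

```python
import sqlite3

# Hypothetical allowlist of columns the caller may sort by.
ALLOWED_SORT_COLUMNS = {"name", "created_at"}


def find_users(conn: sqlite3.Connection, name_filter: str, sort_column: str) -> list[str]:
    # SQL-ident slot: identifiers cannot be bound as parameters,
    # so the only safe option is validation against an allowlist.
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_column!r}")
    # SQL-val slot: the value is bound through a placeholder, never concatenated.
    sql = f"SELECT name FROM users WHERE name LIKE ? ORDER BY {sort_column}"
    return [row[0] for row in conn.execute(sql, (name_filter,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("bob", "2024-01-02"), ("alice", "2024-01-01")],
)

print(find_users(conn, "%", "name"))
try:
    find_users(conn, "%", "name; DROP TABLE users")
except ValueError as exc:
    print("rejected:", exc)
```

During review, a parameterized query whose ORDER BY or table name is still built from user input should be classified as SQL-ident, because the placeholder protects only the value position.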

Render Context Classification

| Context | Output Location | Encoding Required |
| --- | --- | --- |
| HTML_BODY | Between HTML tags | HTML entity encoding |
| HTML_ATTRIBUTE | Inside attribute values | Attribute encoding + quoting |
| JAVASCRIPT_STRING | Inside JS string literals | JavaScript Unicode escaping |
| URL_PARAM | URL query parameter values | URL percent encoding |
| CSS_VALUE | Inside CSS property values | CSS hex encoding |
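
To make the table concrete, the sketch below maps each render context to one possible encoder built from the Python standard library. The `encode_for_context` dispatcher and its helper functions are hypothetical names, and the conservative "escape everything except ASCII alphanumerics" policy is an assumption rather than something the source prescribes.

```python
import html
from urllib.parse import quote


def _js_unicode_escape(value: str) -> str:
    """Emit \\uXXXX for every character that is not an ASCII letter or digit."""
    out = []
    for ch in value:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
            continue
        # Encode to UTF-16 so astral characters become surrogate pairs.
        units = ch.encode("utf-16-be", "surrogatepass")
        for i in range(0, len(units), 2):
            out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


def _css_hex_escape(value: str) -> str:
    """Emit a backslash-hex escape (with terminating space) for non-alphanumerics."""
    return "".join(
        ch if ch.isascii() and ch.isalnum() else "\\%x " % ord(ch)
        for ch in value
    )


def encode_for_context(value: str, context: str) -> str:
    if context == "HTML_BODY":
        return html.escape(value, quote=False)
    if context == "HTML_ATTRIBUTE":
        # The template must still wrap the attribute value in quotes.
        return html.escape(value, quote=True)
    if context == "JAVASCRIPT_STRING":
        return _js_unicode_escape(value)
    if context == "URL_PARAM":
        return quote(value, safe="")
    if context == "CSS_VALUE":
        return _css_hex_escape(value)
    raise ValueError(f"unknown render context: {context}")


payload = "</script>"
for ctx in ("HTML_BODY", "HTML_ATTRIBUTE", "JAVASCRIPT_STRING", "URL_PARAM", "CSS_VALUE"):
    print(ctx, "->", encode_for_context(payload, ctx))
```

The point for the reviewer is the mismatch case: an output point that applies HTML entity encoding inside a JavaScript string or a URL is encoded, but not for its actual context, and should stay in the findings list.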

Tool Categories

| Category | Tools | Purpose |
| --- | --- | --- |
| Taint Analysis | semgrep, CodeQL | Automated sink-to-source taint tracing |
| Pattern Search | ripgrep, ast-grep | Fast code pattern matching |
| Decompilation | jadx, sourcemap-extract | Recover source from compiled artifacts |
| AST Parsing | tree-sitter, babel | Language-aware code structure analysis |
| Dependency Audit | npm audit, pip-audit, snyk | Known vulnerability detection |

References

  • references/tools.md
    - Tool function signatures and parameters
  • references/workflows.md
    - Taint analysis workflows and vulnerability patterns