codeql

CodeQL Static Analysis

When to Use CodeQL

Ideal scenarios:
  • Source code access with ability to build (for compiled languages)
  • Open-source projects or GitHub Advanced Security license
  • Need for interprocedural data flow and taint tracking
  • Finding complex vulnerabilities requiring AST/CFG analysis
  • Comprehensive security audits where analysis time is not critical
Consider Semgrep instead when:
  • No build capability for compiled languages
  • Licensing constraints
  • Need fast, lightweight pattern matching
  • Simple, single-file analysis is sufficient

Why Interprocedural Analysis Matters

Simple grep/pattern tools only see one function at a time. Real vulnerabilities often span multiple functions:
HTTP Handler → Input Parser → Business Logic → Database Query
     ↓              ↓              ↓              ↓
   source      transforms       passes       sink (SQL)
CodeQL tracks data flow across all these steps. A tainted input in the handler can be traced through 5+ function calls to find where it reaches a dangerous sink.
Pattern-based tools miss this because they can't connect `request.param` in file A to `db.execute(query)` in file B.
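
To make the chain concrete, here is a minimal Python sketch of a tainted value crossing several functions before it reaches a SQL sink. Every function name (`handle_request`, `parse_input`, `build_filter`, `run_query`) is hypothetical, and the "database" simply returns the query string so the flow can be observed.

```python
"""Hypothetical four-step chain: source -> transform -> pass-through -> sink."""


def run_query(query: str) -> str:
    # Sink: a real application would call something like db.execute(query) here.
    # This sketch returns the string so the flow can be inspected.
    return query


def build_filter(user_id: str) -> str:
    # Business logic: embeds the value in a larger string without escaping it.
    return "SELECT * FROM users WHERE id = " + user_id


def parse_input(raw: str) -> str:
    # Input parser: transforms the value but does not sanitize it.
    return raw.strip()


def handle_request(params: dict) -> str:
    # Source: attacker-controlled request parameter.
    raw = params.get("id", "")
    return run_query(build_filter(parse_input(raw)))
```

No single function contains both the source and the sink, so a check that looks at one function at a time has nothing to match on; the problem only appears once the calls are followed end to end.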

When NOT to Use

Do NOT use this skill for:
  • Projects that cannot be built (CodeQL requires successful compilation for compiled languages)
  • Quick pattern searches (use Semgrep or grep for speed)
  • Non-security code quality checks (use linters instead)
  • Projects without source code access

Environment Check

```bash
# Check if CodeQL is installed
command -v codeql >/dev/null 2>&1 && echo "CodeQL: installed" || echo "CodeQL: NOT installed (run install steps below)"
```

Installation

CodeQL CLI

```bash
# macOS/Linux (Homebrew)
brew install --cask codeql

# Update
brew upgrade codeql

# Manual: Download bundle from https://github.com/github/codeql-action/releases
```

Trail of Bits Queries (Optional)

Install public ToB security queries for additional coverage:

```bash
# Download ToB query packs
codeql pack download trailofbits/cpp-queries trailofbits/go-queries

# Verify installation
codeql resolve qlpacks | grep trailofbits
```

Core Workflow

1. Create Database

```bash
codeql database create codeql.db --language=<LANG> [--command='<BUILD>'] --source-root=.
```

| Language | `--language=` | Build Required |
|---|---|---|
| Python | `python` | No |
| JavaScript/TypeScript | `javascript` | No |
| Go | `go` | No |
| Ruby | `ruby` | No |
| Rust | `rust` | Yes (`--command='cargo build'`) |
| Java/Kotlin | `java` | Yes (`--command='./gradlew build'`) |
| C/C++ | `cpp` | Yes (`--command='make -j8'`) |
| C# | `csharp` | Yes (`--command='dotnet build'`) |
| Swift | `swift` | Yes (macOS only) |
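
When database creation is scripted, the language table can be encoded once so a missing build command is caught before CodeQL runs. The sketch below is one possible helper, not part of the CodeQL tooling: the function name `database_create_cmd` and the `BUILD_REQUIRED` set are assumptions for illustration, and only the `--language`, `--command` and `--source-root` flags shown in this section are used.

```python
from typing import List, Optional

# Languages this section's table marks as "Build Required: Yes".
BUILD_REQUIRED = {"rust", "java", "cpp", "csharp", "swift"}


def database_create_cmd(
    language: str,
    build_command: Optional[str] = None,
    db_path: str = "codeql.db",
    source_root: str = ".",
) -> List[str]:
    """Assemble the argument list for `codeql database create`."""
    if language in BUILD_REQUIRED and not build_command:
        raise ValueError(f"{language} needs a build command (--command=...)")
    cmd = [
        "codeql", "database", "create", db_path,
        f"--language={language}",
        f"--source-root={source_root}",
    ]
    if build_command:
        # Passed as a single argument, so no shell quoting is needed.
        cmd.append(f"--command={build_command}")
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; for example, `database_create_cmd("cpp", "make -j8")` reproduces the C/C++ row of the table.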

2. Run Analysis

```bash
# List available query packs
codeql resolve qlpacks
```

**Run security queries:**

```bash
# SARIF output (recommended)
codeql database analyze codeql.db \
  --format=sarif-latest \
  --output=results.sarif \
  -- codeql/python-queries:codeql-suites/python-security-extended.qls

# CSV output
codeql database analyze codeql.db \
  --format=csv \
  --output=results.csv \
  -- codeql/javascript-queries
```

**With Trail of Bits queries (if installed):**

```bash
codeql database analyze codeql.db \
  --format=sarif-latest \
  --output=results.sarif \
  -- trailofbits/go-queries
```
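
A SARIF file is plain JSON, so findings can be triaged with a few lines of script. The following is a minimal sketch that assumes the standard SARIF 2.1.0 layout (`runs[].results[]` with `ruleId`, `message.text` and a first `physicalLocation`); the helper names `summarize_sarif` and `summarize_sarif_file` are hypothetical, and fields that are absent fall back to placeholder values.

```python
import json
from typing import Dict, List


def summarize_sarif(sarif: Dict) -> List[str]:
    """Return one 'rule path:line message' string per result in a SARIF log."""
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule = result.get("ruleId", "<no rule>")
            message = result.get("message", {}).get("text", "")
            # Use the first reported location; results may have none.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
            line = physical.get("region", {}).get("startLine", 0)
            lines.append(f"{rule} {uri}:{line} {message}")
    return lines


def summarize_sarif_file(path: str = "results.sarif") -> List[str]:
    """Load a SARIF file from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_sarif(json.load(handle))
```

Calling `summarize_sarif_file("results.sarif")` after the analyze step gives a quick list to sort or filter by rule before opening the full report in a SARIF viewer.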

Writing Custom Queries

Query Structure

CodeQL uses SQL-like syntax:

```ql
from Type x where P(x) select f(x)
```

Basic Template

```ql
/**
 * @name Find SQL injection vulnerabilities
 * @description Identifies potential SQL injection from user input
 * @kind path-problem
 * @problem.severity error
 * @security-severity 9.0
 * @precision high
 * @id py/sql-injection
 * @tags security
 *       external/cwe/cwe-089
 */

import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.Concepts

module SqlInjectionConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    // Taint sources: remote user input as modeled by the standard library.
    // Narrow or replace this for your own sources.
    source instanceof RemoteFlowSource
  }

  predicate isSink(DataFlow::Node sink) {
    // Dangerous sinks: the SQL argument of a modeled SQL execution.
    // Narrow or replace this for your own sinks.
    sink = any(SqlExecution e).getSql()
  }
}

module SqlInjectionFlow = TaintTracking::Global<SqlInjectionConfig>;

// Required for @kind path-problem: provides the edges used to render paths
import SqlInjectionFlow::PathGraph

from SqlInjectionFlow::PathNode source, SqlInjectionFlow::PathNode sink
where SqlInjectionFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@.", source.getNode(), "user input"
```

Query Metadata

| Field | Description | Values |
|---|---|---|
| `@kind` | Query type | `problem`, `path-problem` |
| `@problem.severity` | Issue severity | `error`, `warning`, `recommendation` |
| `@security-severity` | CVSS score | `0.0` - `10.0` |
| `@precision` | Confidence | `very-high`, `high`, `medium`, `low` |
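
A custom pack can lint its own query headers against the values in the table before the queries are run. The sketch below is an illustrative helper rather than a CodeQL feature: `parse_query_metadata` and `invalid_fields` are hypothetical names, only single-line `@key value` tags are read (continuation lines such as extra `@tags` entries are ignored), and the allowed values are limited to the ones listed in the table.

```python
import re
from typing import Dict, List

# Matches lines such as " * @kind path-problem" inside the header comment.
_TAG = re.compile(r"^\s*\*\s*@([\w.-]+)\s+(.*\S)\s*$")

# Allowed values, taken only from the metadata table in this section.
ALLOWED = {
    "kind": {"problem", "path-problem"},
    "problem.severity": {"error", "warning", "recommendation"},
    "precision": {"very-high", "high", "medium", "low"},
}


def parse_query_metadata(ql_source: str) -> Dict[str, str]:
    """Collect `@key value` pairs from the leading /** ... */ comment."""
    metadata: Dict[str, str] = {}
    header = ql_source.split("*/", 1)[0]
    for line in header.splitlines():
        match = _TAG.match(line)
        if match:
            metadata[match.group(1)] = match.group(2)
    return metadata


def invalid_fields(metadata: Dict[str, str]) -> List[str]:
    """Return the names of present fields whose values fall outside the table."""
    bad = [key for key, allowed in ALLOWED.items()
           if key in metadata and metadata[key] not in allowed]
    score = metadata.get("security-severity")
    if score is not None:
        try:
            in_range = 0.0 <= float(score) <= 10.0
        except ValueError:
            in_range = False
        if not in_range:
            bad.append("security-severity")
    return bad
```

Running `invalid_fields(parse_query_metadata(text))` over each `.ql` file in a pack returns an empty list when the header only uses values from the table.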

Key Language Features

```ql
// Predicates
predicate isUserInput(DataFlow::Node node) {
  exists(Call c | c.getFunc().(Attribute).getName() = "get" and node.asExpr() = c)
}

// Transitive closure: + (one or more), * (zero or more)
node.getASuccessor+()

// Quantification
exists(Variable v | v.getName() = "password")
forall(Call c | c.getTarget().hasName("dangerous") | hasCheck(c))
```
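
The difference between `+` and `*` is easy to blur, so here is a small Python sketch of the two closures over an ordinary successor function. It only mirrors the semantics; CodeQL evaluates closures internally, and the names `closure_plus` and `closure_star` are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set


def closure_plus(
    start: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
) -> Set[Hashable]:
    """Nodes reachable from `start` in one or more steps (the `+` operator)."""
    seen: Set[Hashable] = set()
    frontier = list(successors(start))
    while frontier:
        node = frontier.pop()
        if node not in seen:
            seen.add(node)
            frontier.extend(successors(node))
    return seen


def closure_star(
    start: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
) -> Set[Hashable]:
    """Zero or more steps (the `*` operator): like `+`, but always includes `start`."""
    return closure_plus(start, successors) | {start}
```

With `+`, the start node appears in the result only if a cycle leads back to it; with `*`, it is always included. That is why `*` is the usual choice when the starting node itself should count as a match.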

Creating Query Packs

```bash
codeql pack init myorg/security-queries
```

Structure:

```
myorg-security-queries/
├── qlpack.yml
├── src/
│   └── SqlInjection.ql
└── test/
    └── SqlInjectionTest.expected
```

qlpack.yml:

```yaml
name: myorg/security-queries
version: 1.0.0
dependencies:
  codeql/python-all: "*"
```

CI/CD Integration (GitHub Actions)

```yaml
name: CodeQL Analysis

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 0 * * 1'  # Weekly

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write

    strategy:
      matrix:
        language: ['python', 'javascript']

    steps:
      - uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          queries: security-extended,security-and-quality
          # Add custom queries/packs:
          # queries: security-extended,./codeql/custom-queries
          # packs: trailofbits/python-queries

      - uses: github/codeql-action/autobuild@v3

      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{ matrix.language }}"
```

Testing Queries

```bash
codeql test run test/
```

Test file format:

```python
import sqlite3

from flask import request  # imported so request.args is a recognizable source

cursor = sqlite3.connect(":memory:").cursor()


def vulnerable():
    user_input = request.args.get("q")  # Source
    cursor.execute("SELECT * FROM users WHERE id = " + user_input)  # Alert: sql-injection


def safe():
    user_input = request.args.get("q")
    cursor.execute("SELECT * FROM users WHERE id = ?", (user_input,))  # OK
```

Troubleshooting

| Issue | Solution |
|---|---|
| Database creation fails | Clean build environment, verify build command works independently |
| Slow analysis | Use `--threads`, narrow query scope, check query complexity |
| Missing results | Check file exclusions, verify source files were parsed |
| Out of memory | Set `CODEQL_RAM=48000` environment variable (48GB) |
| CMake source path issues | Adjust `--source-root` to point to actual source location |

Rationalizations to Reject

| Shortcut | Why It's Wrong |
|---|---|
| "No findings means the code is secure" | CodeQL only finds patterns it has queries for; novel vulnerabilities won't be detected |
| "This code path looks safe" | Complex data flow can hide vulnerabilities across 5+ function calls; trace the full path |
| "Small change, low risk" | Small changes can introduce critical bugs; run full analysis on every change |
| "Tests pass so it's safe" | Tests prove behavior, not absence of vulnerabilities; they test expected paths, not attacker paths |
| "The query didn't flag it" | Default query suites don't cover everything; check if custom queries are needed for your domain |

Resources
