# ln-514-test-log-analyzer

**Paths:** File paths (`shared/`, `references/`, `../ln-*`) are relative to the skills repo root. If not found at CWD, locate this SKILL.md's directory and go up one level for the repo root.

## Test Log Analyzer

Two-layer analysis of application logs. The Python script handles collection and quantitative analysis; the AI handles classification, quality assessment, and fix recommendations.

## Inputs

No required inputs. Runs in the current project directory and auto-detects log sources.

Optional `args` — caller instructions (natural language): time window, expected errors, test context. Example: `"review logs for last 30min, auth 401 errors expected from negative tests"`.

## Purpose & Scope

- Analyze application logs (after test runs, during development, or on demand)
- Classify errors into 4 categories: Real Bug, Test Artifact, Expected Behavior, Operational Warning
- Assess log quality: noisiness, completeness, level correctness, format, structured logging
- Map stack traces to source files; provide fix recommendations
- Report findings for the quality verdict (only Real Bugs block)
- No status changes or task creation — report only

## When to Use

- Analyze application logs in any project (default: last 1h)
- After test runs, to classify errors and assess log quality
- Can be invoked with context instructions:
  `Skill(skill: "ln-514-test-log-analyzer", args: "review last 30min, 401 errors expected")`

## Workflow

### Phase 0: Parse Instructions

If `args` provided — extract: time window (default: 1h), expected errors list, test context. If no `args` — use defaults (last 1h, no expected errors).

### Phase 1: Log Source Detection and Script Execution

**MANDATORY READ:** Load `docs/project/infrastructure.md`, `docs/project/runbook.md`

1. Check if `scripts/analyze_test_logs.py` exists in the target project. If missing, copy from `references/analyze_test_logs.py`.
2. Detect log source mode (auto-detection priority: docker → file → loki):

   | Mode | Detection | Source |
   |---|---|---|
   | docker | `docker compose ps` returns running containers | `docker compose logs --since {window}` |
   | file | `.log` files exist, or `tests/manual/results/` has output | File paths from infrastructure.md or `*.log` glob |
   | loki | `LOKI_URL` env var or `tools_config.md` observability section | Loki HTTP query_range API |

3. Run script: `python scripts/analyze_test_logs.py --mode {detected} [options]`
4. If no log sources found → return `NO_LOG_SOURCES` status, skip to Phase 5.

### Phase 2: 4-Category Error Classification

Classify each error group from the script's JSON output:

| Category | Action | Criteria |
|---|---|---|
| Real Bug | Fix | Unexpected crash, data loss, broken pipeline |
| Test Artifact | Skip | From test scripts, deliberate error-path validation |
| Expected Behavior | Skip | Rate limiting, input validation, auth failures from invalid tokens |
| Operational Warning | Monitor | Clock drift, resource pressure, temporary unavailability |

Test artifact detection heuristics:

- Test name contains: `invalid`, `error`, `fail`, `reject`, `unauthorized`, `forbidden`, `not_found`, `bad_request`, `timeout`
- Test asserts non-2xx status codes (4xx, 5xx)
- Test uses `pytest.raises`, `expect(...).rejects`, `assertThrows`, `should.throw`
- Errors correlate with test execution timestamps from regression test output
- Patterns matching `tests/manual/` scripts

Error taxonomy per `references/error_taxonomy.md` (9 categories: CRASH, TIMEOUT, AUTH, DB, NETWORK, VALIDATION, CONFIG, RESOURCE, DEPRECATION).
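The name-keyword and error-assertion heuristics above can be sketched like this. Both helpers are illustrative names; the keyword list mirrors the bullets exactly:

```python
import re

# Keywords from the test-name heuristic in Phase 2.
ARTIFACT_KEYWORDS = (
    "invalid", "error", "fail", "reject", "unauthorized",
    "forbidden", "not_found", "bad_request", "timeout",
)

def looks_like_test_artifact(test_name: str) -> bool:
    """True if the test name suggests deliberate error-path validation."""
    name = test_name.lower()
    return any(kw in name for kw in ARTIFACT_KEYWORDS)

def asserts_error_status(test_source: str) -> bool:
    """Detect error-raising expectations or assertions on 4xx/5xx codes."""
    patterns = (
        r"pytest\.raises", r"expect\(.*\)\.rejects", r"assertThrows",
        r"should\.throw", r"status_code\s*==\s*[45]\d\d",
    )
    return any(re.search(p, test_source) for p in patterns)
```
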

### Phase 3: Log Quality Assessment

**MANDATORY READ:** Load `references/error_taxonomy.md` (per-level criteria table + level correctness reference)

**Step 1:** Detect the configured log level. Check in order:

1. `LOG_LEVEL` / `LOGLEVEL` env var (`.env`, `docker-compose.yml`, `infrastructure.md`)
2. Framework config: Python `logging.conf` / Django `LOGGING` / Node `LOG_LEVEL`
3. Default: assume `INFO` if not detected

The configured level determines WHICH levels appear in the logs, but each level has its own noise threshold regardless.

**Step 2:** Assess 6 quality dimensions:

| Dimension | What to Check | Signal |
|---|---|---|
| Noisiness | Per-level noise thresholds from `error_taxonomy.md` section 4: TRACE (zero in prod), DEBUG (>50% monopoly), INFO (>30%), WARNING (>1% of total), ERROR (>0.1% of total) | `NOISY: {level} template "{msg}" at {ratio}%` |
| Completeness & Traceability | Critical operations missing log entries + traceability gaps (see table below) | `MISSING: No log for {operation}` / `TRACEABILITY_GAP: {type} in {file}:{line}` |
| Level correctness | Per-level criteria from `error_taxonomy.md` section 4: content, anti-patterns, library rule | `WRONG_LEVEL: should be {level}` |
| Structured logging | Missing trace_id/request_id/user context; unstructured plaintext | `UNSTRUCTURED: lacks {field}` |
| Sensitivity | PII/secrets/tokens/passwords in log messages | `SENSITIVE: {type} exposure` |
| Context richness | Errors without actionable context (order_id, user_id, operation) | `LOW_CONTEXT: lacks context` |

**Traceability gap detection** — scan source code for operations without INFO-level logging:

| Operation Type | Expected Log | Where to Add |
|---|---|---|
| Incoming request handling | Request received + response status | Entry/exit of route handler |
| External API call | Request sent + response status + duration | Before/after HTTP client call |
| DB write (INSERT/UPDATE/DELETE) | Operation + affected entity + count | Before/after ORM/query call |
| Auth decision | Result (allow/deny) + reason | After auth check |
| State transition | Old state → new state + trigger | At transition point |
| Background job | Start + complete/fail + duration | Entry/exit of job handler |
| File/resource operation | Open/close + path + size | At I/O operation |

**Log Format Quality** (10-criterion checklist per `references/log_analysis_output_format.md`):

| # | Criterion | Check |
|---|---|---|
| 1 | Dual format | JSON in prod, readable in dev |
| 2 | Timestamp | Consistent, timezone-aware |
| 3 | Level field | Present, uppercase |
| 4 | Trace/Correlation ID | Present in every entry, async-safe |
| 5 | Service name | Identifies source service |
| 6 | Source location | module:line + function |
| 7 | Extra context | Structured fields, not string interpolation |
| 8 | PII redaction | Passwords, API keys, emails handled |
| 9 | Noise suppression | Duplicate filters, third-party suppressed |
| 10 | Parseability | Dev: pipe-delimited; prod: valid JSON per line |

Score: passed criteria / 10.
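The per-level noise thresholds from the Noisiness dimension can be applied as sketched below. The ratios are the ones quoted above from `error_taxonomy.md` section 4; `check_noise` itself is an illustrative helper, not the script's actual API:

```python
# Max share of total log lines before a level counts as noisy.
NOISE_THRESHOLDS = {
    "TRACE": 0.0,     # zero tolerated in prod
    "DEBUG": 0.50,    # >50% monopoly
    "INFO": 0.30,     # >30% of total
    "WARNING": 0.01,  # >1% of total
    "ERROR": 0.001,   # >0.1% of total
}

def check_noise(level_counts: dict[str, int]) -> list[str]:
    """Emit a NOISY signal for each level exceeding its threshold."""
    total = sum(level_counts.values()) or 1
    signals = []
    for level, count in level_counts.items():
        limit = NOISE_THRESHOLDS.get(level)
        ratio = count / total
        if limit is not None and ratio > limit:
            signals.append(f"NOISY: {level} at {ratio:.1%}")
    return signals
```
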

### Phase 4: Stack Trace Mapping + Fix Recommendations

For each Real Bug:

1. Extract stack trace frames; identify the origin frame (first frame in project code, not in node_modules/site-packages)
2. Map to source file:line
3. Generate a fix recommendation: what to change, where, effort estimate (S/M/L)

Prioritize using Sentry-inspired dimensions:

- High-volume (occurrence count)
- Post-test regression (new errors)
- High-impact path (auth/payment/DB)
- Correlated traces (trace_id across services)

### Phase 5: Generate Report

**MANDATORY READ:** Load `references/log_analysis_output_format.md`

Output the report to chat with the header `## Test Log Analysis`. Include:

- Signals table (Real Bugs count, Test Artifacts filtered, Log Noise status, Log Format score, Log Quality score)
- Real Bugs table (priority, category, error, source, fix recommendation)
- Filtered table (category, count, examples)
- Log Quality Issues table (dimension, service, issue, recommendation)
- Noise Report table (count, ratio, service, level, template, action)
- Machine-readable block `<!-- LOG-ANALYSIS-DATA ... -->` for programmatic consumption

### Phase 6: Meta-Analysis

**MANDATORY READ:** Load `shared/references/meta_analysis_protocol.md`

Skill type: `execution-worker`. Run after all phases complete.

## Verdict Contribution

Quality coordinator normalization matrix component:

| Status | Maps To | Penalty |
|---|---|---|
| CLEAN | - | 0 |
| WARNINGS_ONLY | - | 0 |
| REAL_BUGS_FOUND | FAIL | -20 |
| SKIPPED / NO_LOG_SOURCES | ignored | 0 |

Log quality/format issues are INFORMATIONAL — they do not affect the quality verdict. Only Real Bugs block.

## Critical Rules

- No status changes or task creation; report only.
- Test Artifacts and Expected Behavior are ALWAYS filtered — never count as bugs.
- Log quality issues are advisory — inform, don't block.
- The script must handle gracefully: no Docker, no log files, no Loki → `NO_LOG_SOURCES`.
- Language preservation in comments (EN/RU).

## Definition of Done

- Script deployed to the target project's `scripts/` (or already exists).
- Log source detected and script executed (or NO_LOG_SOURCES returned).
- Errors classified into 4 categories; Real Bugs identified.
- Log quality assessed (6 dimensions + 10-criterion format checklist).
- Stack traces mapped to source files for Real Bugs.
- Report output to chat with signals table + machine-readable block.

## Reference Files

- Error taxonomy: `references/error_taxonomy.md`
- Output format: `references/log_analysis_output_format.md`
- Analysis script: `references/analyze_test_logs.py`

Version: 1.0.0 | Last Updated: 2026-03-13