# ln-514-test-log-analyzer

**Paths:** File paths (`shared/`, `references/`, `../ln-*`) are relative to the skills repo root. If not found at CWD, locate this SKILL.md's directory and go up one level for the repo root.

## Test Log Analyzer

Two-layer analysis of application logs. The Python script handles collection and quantitative analysis; the AI handles classification, quality assessment, and fix recommendations.

## Inputs

No required inputs. Runs in the current project directory and auto-detects log sources.

Optional `args` — caller instructions (natural language): time window, expected errors, test context. Example: `"review logs for last 30min, auth 401 errors expected from negative tests"`.

## Purpose & Scope

- Analyze application logs (after test runs, during development, or on demand)
- Classify errors into 4 categories: Real Bug, Test Artifact, Expected Behavior, Operational Warning
- Assess log quality: noisiness, completeness, level correctness, format, structured logging
- Map stack traces to source files; provide fix recommendations
- Report findings for the quality verdict (only Real Bugs block)
- No status changes or task creation — report only

## When to Use

- Analyze application logs in any project (default: last 1h)
- After test runs, to classify errors and assess log quality
- Can be invoked with context instructions:
  `Skill(skill: "ln-514-test-log-analyzer", args: "review last 30min, 401 errors expected")`

## Workflow

### Phase 0: Parse Instructions

If `args` provided — extract: time window (default: 1h), expected errors list, test context. If no `args` — use defaults (last 1h, no expected errors).

### Phase 1: Log Source Detection and Script Execution

**MANDATORY READ:** Load `docs/project/infrastructure.md`, `docs/project/runbook.md`

1. Check if `scripts/analyze_test_logs.py` exists in the target project. If missing, copy from `references/analyze_test_logs.py`.
2. Detect log source mode (auto-detection priority: docker → file → loki):

   | Mode | Detection | Source |
   |---|---|---|
   | docker | `docker compose ps` returns running containers | `docker compose logs --since {window}` |
   | file | `.log` files exist, or `tests/manual/results/` has output | File paths from infrastructure.md or `*.log` glob |
   | loki | `LOKI_URL` env var or `tools_config.md` observability section | Loki HTTP query_range API |

3. Run script: `python scripts/analyze_test_logs.py --mode {detected} [options]`
4. If no log sources found → return `NO_LOG_SOURCES` status, skip to Phase 5.

### Phase 2: 4-Category Error Classification

Classify each error group from the script's JSON output:

| Category | Action | Criteria |
|---|---|---|
| Real Bug | Fix | Unexpected crash, data loss, broken pipeline |
| Test Artifact | Skip | From test scripts, deliberate error-path validation |
| Expected Behavior | Skip | Rate limiting, input validation, auth failures from invalid tokens |
| Operational Warning | Monitor | Clock drift, resource pressure, temporary unavailability |

Test artifact detection heuristics:

- Test name contains: `invalid`, `error`, `fail`, `reject`, `unauthorized`, `forbidden`, `not_found`, `bad_request`, `timeout`
- Test asserts non-2xx status codes (4xx, 5xx)
- Test uses `pytest.raises`, `expect(...).rejects`, `assertThrows`, `should.throw`
- Errors correlate with test execution timestamps from regression test output
- Patterns matching `tests/manual/` scripts

Error taxonomy per `references/error_taxonomy.md` (9 categories: CRASH, TIMEOUT, AUTH, DB, NETWORK, VALIDATION, CONFIG, RESOURCE, DEPRECATION).
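The name-keyword and error-assertion heuristics above can be sketched like this. Both helpers are illustrative names; the keyword list mirrors the bullets exactly:

```python
import re

# Keywords from the test-name heuristic in Phase 2.
ARTIFACT_KEYWORDS = (
    "invalid", "error", "fail", "reject", "unauthorized",
    "forbidden", "not_found", "bad_request", "timeout",
)

def looks_like_test_artifact(test_name: str) -> bool:
    """True if the test name suggests deliberate error-path validation."""
    name = test_name.lower()
    return any(kw in name for kw in ARTIFACT_KEYWORDS)

def asserts_error_status(test_source: str) -> bool:
    """Detect error-raising expectations or assertions on 4xx/5xx codes."""
    patterns = (
        r"pytest\.raises", r"expect\(.*\)\.rejects", r"assertThrows",
        r"should\.throw", r"status_code\s*==\s*[45]\d\d",
    )
    return any(re.search(p, test_source) for p in patterns)
```
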

### Phase 3: Log Quality Assessment

**MANDATORY READ:** Load `references/error_taxonomy.md` (per-level criteria table + level correctness reference)

**Step 1:** Detect the configured log level. Check in order:

1. `LOG_LEVEL` / `LOGLEVEL` env var (`.env`, `docker-compose.yml`, `infrastructure.md`)
2. Framework config: Python `logging.conf` / Django `LOGGING` / Node `LOG_LEVEL`
3. Default: assume `INFO` if not detected

The configured level determines WHICH levels appear in the logs, but each level has its own noise threshold regardless.

**Step 2:** Assess 6 quality dimensions:

| Dimension | What to Check | Signal |
|---|---|---|
| Noisiness | Per-level noise thresholds from `error_taxonomy.md` section 4: TRACE (zero in prod), DEBUG (>50% monopoly), INFO (>30%), WARNING (>1% of total), ERROR (>0.1% of total) | `NOISY: {level} template "{msg}" at {ratio}%` |
| Completeness & Traceability | Critical operations missing log entries + traceability gaps (see table below) | `MISSING: No log for {operation}` / `TRACEABILITY_GAP: {type} in {file}:{line}` |
| Level correctness | Per-level criteria from `error_taxonomy.md` section 4: content, anti-patterns, library rule | `WRONG_LEVEL: should be {level}` |
| Structured logging | Missing trace_id/request_id/user context; unstructured plaintext | `UNSTRUCTURED: lacks {field}` |
| Sensitivity | PII/secrets/tokens/passwords in log messages | `SENSITIVE: {type} exposure` |
| Context richness | Errors without actionable context (order_id, user_id, operation) | `LOW_CONTEXT: lacks context` |

**Traceability gap detection** — scan source code for operations without INFO-level logging:

| Operation Type | Expected Log | Where to Add |
|---|---|---|
| Incoming request handling | Request received + response status | Entry/exit of route handler |
| External API call | Request sent + response status + duration | Before/after HTTP client call |
| DB write (INSERT/UPDATE/DELETE) | Operation + affected entity + count | Before/after ORM/query call |
| Auth decision | Result (allow/deny) + reason | After auth check |
| State transition | Old state → new state + trigger | At transition point |
| Background job | Start + complete/fail + duration | Entry/exit of job handler |
| File/resource operation | Open/close + path + size | At I/O operation |

**Log Format Quality** (10-criterion checklist per `references/log_analysis_output_format.md`):

| # | Criterion | Check |
|---|---|---|
| 1 | Dual format | JSON in prod, readable in dev |
| 2 | Timestamp | Consistent, timezone-aware |
| 3 | Level field | Present, uppercase |
| 4 | Trace/Correlation ID | Present in every entry, async-safe |
| 5 | Service name | Identifies source service |
| 6 | Source location | module:line + function |
| 7 | Extra context | Structured fields, not string interpolation |
| 8 | PII redaction | Passwords, API keys, emails handled |
| 9 | Noise suppression | Duplicate filters, third-party suppressed |
| 10 | Parseability | Dev: pipe-delimited; prod: valid JSON per line |

Score: passed criteria / 10.
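The per-level noise thresholds from the Noisiness dimension can be applied as sketched below. The ratios are the ones quoted above from `error_taxonomy.md` section 4; `check_noise` itself is an illustrative helper, not the script's actual API:

```python
# Max share of total log lines before a level counts as noisy.
NOISE_THRESHOLDS = {
    "TRACE": 0.0,     # zero tolerated in prod
    "DEBUG": 0.50,    # >50% monopoly
    "INFO": 0.30,     # >30% of total
    "WARNING": 0.01,  # >1% of total
    "ERROR": 0.001,   # >0.1% of total
}

def check_noise(level_counts: dict[str, int]) -> list[str]:
    """Emit a NOISY signal for each level exceeding its threshold."""
    total = sum(level_counts.values()) or 1
    signals = []
    for level, count in level_counts.items():
        limit = NOISE_THRESHOLDS.get(level)
        ratio = count / total
        if limit is not None and ratio > limit:
            signals.append(f"NOISY: {level} at {ratio:.1%}")
    return signals
```
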

### Phase 4: Stack Trace Mapping + Fix Recommendations

For each Real Bug:

1. Extract stack trace frames; identify the origin frame (first frame in project code, not in node_modules/site-packages)
2. Map to source file:line
3. Generate a fix recommendation: what to change, where, effort estimate (S/M/L)

Prioritize using Sentry-inspired dimensions:

- High-volume (occurrence count)
- Post-test regression (new errors)
- High-impact path (auth/payment/DB)
- Correlated traces (trace_id across services)

### Phase 5: Generate Report

**MANDATORY READ:** Load `references/log_analysis_output_format.md`

Output the report to chat with the header `## Test Log Analysis`. Include:

- Signals table (Real Bugs count, Test Artifacts filtered, Log Noise status, Log Format score, Log Quality score)
- Real Bugs table (priority, category, error, source, fix recommendation)
- Filtered table (category, count, examples)
- Log Quality Issues table (dimension, service, issue, recommendation)
- Noise Report table (count, ratio, service, level, template, action)
- Machine-readable block `<!-- LOG-ANALYSIS-DATA ... -->` for programmatic consumption

### Phase 6: Meta-Analysis

**MANDATORY READ:** Load `shared/references/meta_analysis_protocol.md`

Skill type: `execution-worker`. Run after all phases complete.

## Verdict Contribution

Quality coordinator normalization matrix component:

| Status | Maps To | Penalty |
|---|---|---|
| CLEAN | - | 0 |
| WARNINGS_ONLY | - | 0 |
| REAL_BUGS_FOUND | FAIL | -20 |
| SKIPPED / NO_LOG_SOURCES | ignored | 0 |

Log quality/format issues are INFORMATIONAL — they do not affect the quality verdict. Only Real Bugs block.

## Critical Rules

- No status changes or task creation; report only.
- Test Artifacts and Expected Behavior are ALWAYS filtered — never count as bugs.
- Log quality issues are advisory — inform, don't block.
- The script must handle gracefully: no Docker, no log files, no Loki → `NO_LOG_SOURCES`.
- Language preservation in comments (EN/RU).

## Definition of Done

- Script deployed to the target project's `scripts/` (or already exists).
- Log source detected and script executed (or NO_LOG_SOURCES returned).
- Errors classified into 4 categories; Real Bugs identified.
- Log quality assessed (6 dimensions + 10-criterion format checklist).
- Stack traces mapped to source files for Real Bugs.
- Report output to chat with signals table + machine-readable block.

## Reference Files

- Error taxonomy: `references/error_taxonomy.md`
- Output format: `references/log_analysis_output_format.md`
- Analysis script: `references/analyze_test_logs.py`

Version: 1.0.0 | Last Updated: 2026-03-13