Full Stack Debugger

Overview

The Full Stack Debugger enables systematic debugging of issues across the entire application stack (UI/Frontend, Backend/API, Database/State). It combines browser testing, log analysis, code examination, and automated server restart/verification to iteratively identify and fix issues one at a time until the system is fully operational.

This skill follows a fixed six-phase workflow, Detection → Analysis → Fix → Restart → Verification → Iteration, to resolve issues that come up during development and testing.

When to Use This Skill

Trigger this skill when observing:

Error states in the UI (dashboard, buttons failing, status showing errors)
Repeated failures in backend logs (task execution failures, import errors, database errors)
Unexpected database state (rows showing failed status when they should succeed)
API endpoints returning errors or unexpected responses
Services failing to initialize or process tasks
Cascading failures across multiple components

Debugging Workflow

Phase 1: Detection

Detect errors from multiple sources:

Browser UI Detection:

Navigate to the affected page/feature in the browser
Check for error messages, red warning states, or disabled functionality
Read console error messages using DevTools
Note the specific UI state and what action triggered the error

Backend Log Detection:

Query recent error logs using `tail -200 /path/to/logs/errors.log`
Search for error patterns related to the issue using `grep`
Note error timestamps, error messages, and stack traces
Look for repeated errors (these indicate a systemic issue)

Database State Detection:

Query the database directly using sqlite3
Check status of recent tasks, transactions, or records
Look for failed, incomplete, or error states
Note which records are affected and what their states are
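
The database check above can be sketched in Python with the standard `sqlite3` module. This is a minimal sketch, not the project's actual query: the `queue_tasks` table name comes from this document, but the column names (`id`, `task_type`, `status`, `error`) and the in-memory demo data are assumptions for illustration.

```python
import sqlite3


def failed_tasks(conn: sqlite3.Connection, limit: int = 20) -> list[tuple]:
    """Return the most recent failed rows from the queue_tasks table.

    The column names used here are assumed, not taken from a real schema.
    """
    cur = conn.execute(
        "SELECT id, task_type, status, error FROM queue_tasks "
        "WHERE status = ? ORDER BY id DESC LIMIT ?",
        ("failed", limit),
    )
    return cur.fetchall()


# Demo against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue_tasks ("
    "id INTEGER PRIMARY KEY, task_type TEXT, status TEXT, error TEXT)"
)
conn.executemany(
    "INSERT INTO queue_tasks (task_type, status, error) VALUES (?, ?, ?)",
    [
        ("sync", "completed", None),
        ("analysis", "failed", "name 'Optional' is not defined"),
    ],
)
rows = failed_tasks(conn)
```

Against a real database, replace the in-memory connection with the path to the application's SQLite file and adjust the column list to the actual schema.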

Example: When debugging a scheduler failure:

Navigate to System Health dashboard
Observe scheduler showing "0 done" or "X failed"
Check `/logs/errors.log` for error messages
Query the `queue_tasks` table to see failed task records

Phase 2: Analysis

Analyze root causes by reading code and logs:

Code Analysis:

Read the error file/module indicated in error stack traces
Check imports - look for missing `from X import Y` statements
Check class names - verify instantiation matches actual class names
Look for syntax errors - unmatched quotes, unclosed parentheses
Check function signatures - ensure payloads match expected parameters
Read reference documentation (`references/common_errors.md`) for error patterns

Log Analysis:

Extract error messages from logs
Look for patterns like `'optional'` (missing import), `unterminated string` (syntax error), `'attribute'` (wrong class name)
Trace error propagation backward to find the originating issue
Check timestamps - multiple errors at same time indicate batch failure
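
The pattern matching described above can be sketched as a small classifier. This is an illustrative sketch: the regular expressions and category labels are assumptions modeled on the error messages quoted in this document, and a `NameError` can also come from a typo rather than a missing import, so treat the label as a first guess.

```python
import re

# (pattern, likely cause) pairs; both are assumptions for illustration.
PATTERNS = [
    (re.compile(r"name '\w+' is not defined"), "missing import"),
    (re.compile(r"unterminated string", re.IGNORECASE), "syntax error"),
    (re.compile(r"has no attribute"), "wrong class or attribute name"),
]


def classify(line: str) -> str:
    """Return the likely cause for one log line, or 'unclassified'."""
    for pattern, label in PATTERNS:
        if pattern.search(line):
            return label
    return "unclassified"
```

For example, `classify("NameError: name 'Optional' is not defined")` returns `"missing import"`, which points at the imports section of the file named in the stack trace.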

API/Payload Analysis:

Check what payload the API is sending to task handlers
Read the task handler code to see what fields it expects
Compare actual payload vs expected payload
Look for missing required fields
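
Comparing the actual payload with the expected one reduces to a set difference. The sketch below assumes the handler's required fields are known as a set of names; the field names in the usage note are hypothetical.

```python
def missing_fields(payload: dict, required: set[str]) -> list[str]:
    """Return the required field names absent from the payload, sorted."""
    return sorted(required - set(payload))
```

For instance, `missing_fields({"symbol": "AAPL"}, {"symbol", "agent_name"})` returns `["agent_name"]`, which is the field to add to the payload dictionary in Phase 3.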

Example: When debugging "name 'Optional' is not defined":

Find the file mentioned in the error (`analysis_executor.py`)
Read the imports section
Notice `Optional` is used but not imported
Check line 14: `from typing import Dict, List, Any` - missing `Optional`
Fix: Add `Optional` to the import statement

Phase 3: Fix (One Issue at a Time)

Apply fixes one issue per iteration:

Before Fixing:

Verify this is the first/next issue to fix
Read the relevant code section carefully
Use the fix patterns from `references/fix_templates.md`

Common Fix Patterns:

Missing imports: Add to import statement (e.g., `from typing import Optional`)
Wrong class name: Update import and instantiation to match actual class
Missing docstring quotes: Add opening `"""` to docstring
Wrong payload fields: Add missing required fields to payload dictionary
Syntax errors: Fix unmatched quotes, parentheses, brackets

After Fixing:

Read back the changed code to verify syntax
Check the edit was correct (line numbers, indentation)
Only fix ONE issue, even if multiple exist - don't cascade fixes
Document what was changed in a clear comment

Example Fix:

```python
# BEFORE
from typing import Dict, List, Any

# AFTER
from typing import Dict, List, Any, Optional
```

Phase 4: Restart (Automated)

Restart the backend server after each fix:

```bash
# Kill existing processes
lsof -ti:8000 | xargs kill -9 2>/dev/null

# Clear Python bytecode cache
find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null
find . -type f -name "*.pyc" -delete 2>/dev/null

# Restart backend
sleep 3 && python -m src.main --command web > /tmp/backend_restart.log 2>&1 &
sleep 10  # Wait for startup

# Verify health
curl -m 5 http://localhost:8000/api/health
```

Phase 5: Verification

Verify the fix worked through multiple checks:

Health Check:

Call the `/api/health` endpoint
Verify the response contains `"status": "healthy"`
If still failing, check logs for new errors
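
The health check can also be scripted. The sketch below uses only the standard library and assumes the endpoint returns a JSON object with a top-level `status` field, as the check above implies; the exact response shape is not documented here, so adjust the parsing if the real payload differs.

```python
import json
from urllib.request import urlopen


def is_healthy(body: str) -> bool:
    """Return True if the response body is JSON with status == 'healthy'."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("status") == "healthy"


def check_health(
    url: str = "http://localhost:8000/api/health", timeout: float = 5.0
) -> bool:
    """Fetch the health endpoint; connection errors count as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False
```

Separating `is_healthy` from the network call keeps the parsing logic testable without a running server.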

Browser Verification:

Navigate to the affected UI page
Trigger the action that previously failed
Verify the error is gone
Check for new errors in console

Database Verification:

Query the affected records/tasks
Verify status changed from failed/error to success/completed
Check that metrics updated (e.g., scheduler shows "1 done" instead of "0 done")

Log Verification:

Check recent logs for the same error
Verify no new errors appeared
Look for success messages or "completed" status

Example:

Scheduler should show "1 done" instead of "0 done"
Task record should show status="completed" instead of "failed"
No error messages in logs
WebSocket shows healthy status in UI

Phase 6: Iteration

If issues remain, repeat the cycle:

Continue if more issues exist:
- Check logs for remaining errors
- If yes, return to Phase 2 (Analysis)
- Fix the next issue (Phase 3)
- Restart (Phase 4)
- Verify (Phase 5)
Stop when all issues fixed:
- All schedulers show completed execution counts
- UI shows no error states
- Logs show no error patterns
- Tasks/records show success status
- Full verification complete
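
The loop above can be sketched as a small driver. This is a hypothetical outline, not part of the skill: the four callables stand in for Phases 1-2 (`next_issue`), 3 (`fix`), 4 (`restart`), and 5 (`verify`), and stopping with an exception when verification fails is one possible design choice.

```python
from typing import Callable, Optional


def debug_loop(
    next_issue: Callable[[], Optional[str]],
    fix: Callable[[str], None],
    restart: Callable[[], None],
    verify: Callable[[], bool],
    max_iterations: int = 10,
) -> list[str]:
    """Fix one issue per iteration until none remain; return what was fixed."""
    fixed: list[str] = []
    for _ in range(max_iterations):
        issue = next_issue()  # detection + analysis
        if issue is None:  # nothing left: stop condition reached
            break
        fix(issue)  # exactly one fix per iteration
        restart()  # clean restart after every fix
        if not verify():
            raise RuntimeError(f"fix for {issue!r} did not verify")
        fixed.append(issue)
    return fixed
```

The `max_iterations` cap is an added safeguard so that a fix which keeps reintroducing errors cannot loop forever.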

Common Error Patterns

See `references/common_errors.md` for patterns to recognize:

Python syntax errors (unterminated strings, missing quotes)
Import errors (`name 'X' is not defined`, `cannot import name 'Y'`)
Class/attribute errors (`'dict' object has no attribute 'symbol'`)
Type errors (passing wrong data type)
Payload/configuration errors (missing required fields)

Fix Templates

See `references/fix_templates.md` for ready-to-use fix patterns:

How to add missing imports
How to fix class name mismatches
How to fix docstring syntax
How to add missing payload fields
How to fix type errors

Tools Used

Playwright Browser Tools: Navigate UI, verify changes
Read/Grep Tools: Examine code and logs
Bash: Server restart, cache clearing, health checks
Edit Tool: Apply code fixes
Database Queries: Verify task/record state

MCP Tools Integration

Use the robo-trader-dev MCP tools for token-efficient debugging (the table below lists savings of 85-98% per task):

| Task | MCP Tool | Token Savings | Usage |
|---|---|---|---|
| Analyze error logs | `mcp__robo-trader-dev__analyze_logs` | 98% | Pattern detection with time windows |
| System health check | `mcp__robo-trader-dev__check_system_health` | 97% | Database, queues, API, disk status |
| Diagnose DB locks | `mcp__robo-trader-dev__diagnose_database_locks` | 95% | Correlate logs with code patterns |
| Queue monitoring | `mcp__robo-trader-dev__queue_status` | 96% | Real-time queue backlog analysis |
| Coordinator status | `mcp__robo-trader-dev__coordinator_status` | 94% | Init status, error details |
| Error pattern fix | `mcp__robo-trader-dev__suggest_fix` | 90% | Known pattern matching with examples |
| Read code files | `mcp__robo-trader-dev__smart_file_read` | 85% | Progressive context (summary/targeted/full) |
| Find related files | `mcp__robo-trader-dev__find_related_files` | 88% | Import/git/similarity analysis |

Example debugging workflow:

```python
# 1. Detect errors (MCP instead of tail/grep)
mcp__robo-trader-dev__analyze_logs(patterns=["ERROR", "TIMEOUT"], time_window="1h")

# 2. Check system health (MCP instead of curl loops)
mcp__robo-trader-dev__check_system_health(components=["database", "queues", "api_endpoints"])

# 3. Diagnose specific issue (MCP instead of sqlite3 + code reading)
mcp__robo-trader-dev__diagnose_database_locks(time_window="24h", include_code_references=True)

# 4. Get fix suggestions (MCP instead of manual pattern matching)
mcp__robo-trader-dev__suggest_fix(error_message="name 'Optional' is not defined", context_file="src/services/analyzer.py")
```

**Integration with robo-trader architecture**:
- Queue operations: Use `queue_status` to monitor PORTFOLIO_SYNC, DATA_FETCHER, AI_ANALYSIS
- Coordinator debugging: Use `coordinator_status` for BroadcastCoordinator, AIChatCoordinator init issues
- Database access: Use `query_portfolio` or `diagnose_database_locks` instead of direct sqlite3 connections

Key Principles

One issue at a time - Fix one problem per iteration to prevent cascading failures
Verify immediately - Always restart and verify after each fix
Multi-layer detection - Check UI, logs, and database for clues
Iterative refinement - Continue until all issues resolved
Automated restart - Always use clean restart (kill + cache clear + restart)
Browser verification - Always test in actual UI, not just logs
