Database Performance Debugging

Overview

Database performance issues directly impact application responsiveness. Debugging focuses on identifying slow queries and optimizing execution plans.

When to Use

  • Slow application response times
  • High database CPU usage
  • Slow queries showing up in logs
  • Performance regressions after a change
  • Degraded performance under load

Instructions

1. Identify Slow Queries

```sql
-- Enable slow query log (MySQL)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0.5;

-- View slow queries (mysql.slow_log requires log_output = 'TABLE')
SHOW GLOBAL STATUS LIKE 'Slow_queries';
SELECT * FROM mysql.slow_log;

-- PostgreSQL slow queries
-- (requires pg_stat_statements in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT mean_exec_time, calls, query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC LIMIT 10;

-- SQL Server slow queries
-- (the query text lives in sys.dm_exec_sql_text, not in the stats DMV)
SELECT TOP 10
  qs.execution_count,
  qs.total_elapsed_time,
  st.text AS statement_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_elapsed_time DESC;

-- Query profiling
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 123;
-- Slow: Seq Scan (full table scan)
-- Fast: Index Scan
```
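
The EXPLAIN checks above target MySQL, PostgreSQL, and SQL Server; the same before/after comparison can be sketched with Python's built-in `sqlite3` module, whose `EXPLAIN QUERY PLAN` plays the role of `EXPLAIN` (the table, column, and index names here are illustrative):

```python
import sqlite3

# In-memory database with a sample orders table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders (user_id, amount) VALUES (?, ?)",
                 [(i % 100, i * 1.5) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN is SQLite's EXPLAIN equivalent:
    # "SCAN" means a full table scan, "SEARCH ... USING INDEX" uses an index.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM orders WHERE user_id = 42"
print(plan(query))   # full table scan: no index on user_id yet

conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
print(plan(query))   # now an index search
```

The principle carries over directly: run the plan, look for a full scan on a filtered column, add the index, and confirm the plan changed before trusting timing numbers.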

2. Common Issues & Solutions

```yaml
Issue: N+1 Query Problem

Symptom: 1001 queries for 1000 records

Example (Python):
  for user in users:
    posts = db.query(Post).filter(Post.user_id == user.id)
    # 1 + 1000 queries

Solution:
  users = db.query(User).options(joinedload(User.posts))
  # Single query with JOIN

---

Issue: Missing Index

Symptom: Seq Scan instead of Index Scan

Solution:
  CREATE INDEX idx_orders_user_id ON orders(user_id);
  Verify: EXPLAIN ANALYZE now shows Index Scan

---

Issue: Inefficient JOIN

Before:
  SELECT * FROM orders o, users u
  WHERE o.user_id = u.id AND u.email LIKE '%@example.com'
  # Bad: the leading-wildcard LIKE cannot use a B-tree index,
  # so users is scanned for every order

After:
  SELECT o.* FROM orders o
  JOIN users u ON o.user_id = u.id
  WHERE u.email = 'exact@example.com'
  # Good: single indexed email lookup
  # (note the predicate is narrowed to an exact match)

---

Issue: Large Table Scan

Symptom: SELECT * FROM large_table (1M rows)

Solutions:
  1. Add a LIMIT clause
  2. Add a WHERE condition
  3. Select only the columns you need
  4. Use pagination
  5. Archive old data

---

Issue: Slow Aggregation

Before (1 minute):
  SELECT user_id, COUNT(*), SUM(amount)
  FROM transactions
  GROUP BY user_id

After (50ms):
  SELECT user_id, transaction_count, total_amount
  FROM user_transaction_stats
  WHERE updated_at > NOW() - INTERVAL 1 DAY  # MySQL syntax; PostgreSQL: INTERVAL '1 day'
  # Materialized view or aggregation table
```
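
The N+1 pattern above can be made concrete with a runnable sketch using Python's built-in `sqlite3` and its statement trace callback to count round trips (the schema and row counts are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO users (id) VALUES (1), (2), (3);
INSERT INTO posts (user_id) SELECT id FROM users;
""")

queries = []
conn.set_trace_callback(queries.append)  # records every SQL statement executed

# N+1 pattern: one query for users, then one more per user for posts.
users = [row[0] for row in conn.execute("SELECT id FROM users")]
for uid in users:
    conn.execute("SELECT id FROM posts WHERE user_id = ?", (uid,)).fetchall()
print(len(queries))  # 4 statements (1 + 3)

# JOIN pattern: everything in a single round trip.
queries.clear()
rows = conn.execute(
    "SELECT u.id, p.id FROM users u LEFT JOIN posts p ON p.user_id = u.id"
).fetchall()
print(len(queries))  # 1 statement
```

With 1000 users the loop version issues 1001 statements while the JOIN still issues one, which is why the symptom usually shows up as query *count*, not individual query time.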

3. Execution Plan Analysis

```yaml
EXPLAIN Output Understanding:

Seq Scan (Full Table Scan):
  - Reads the entire table
  - Slowest access method
  - Fix: add an index

Index Scan:
  - Uses an index
  - Fast
  - Ideal

Bitmap Index Scan:
  - Builds a bitmap of matching pages from the index
  - Feeds a Bitmap Heap Scan that fetches the rows
  - Moderate speed; good when many rows match

Nested Loop:
  - For each row on the left side, scan the right side
  - O(n*m) complexity
  - Slow for large tables

Hash Join:
  - Builds a hash table from the smaller table
  - Probes it with the larger table
  - Faster than a nested loop

Merge Join:
  - Sorts both inputs, then merges them
  - Fastest for large, already-sorted data
  - Requires sorted inputs (may add a sort step)

---

Reading EXPLAIN ANALYZE:

Node: Seq Scan on orders (actual 8023.456 ms)
  - Seq Scan = full table scan
  - actual time = real execution time
  - 8023 ms = TOO SLOW

Rows: 1000000 (estimated) 1000000 (actual)
  - Match = planner estimates are accurate
  - Mismatch = update statistics (run ANALYZE)

Node: Index Scan (actual 15.234 ms)
  - Index Scan = fast
  - 15 ms = ACCEPTABLE
```
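
As a rough sketch of automating this reading, the snippet below walks a plan tree shaped like PostgreSQL's `EXPLAIN (FORMAT JSON)` output and flags slow sequential scans and off-by-an-order-of-magnitude row estimates; the sample plan, the 100 ms threshold, and the `find_problems` helper are all hypothetical:

```python
# A hand-written plan dict shaped like one entry of EXPLAIN (FORMAT JSON) output.
sample_plan = {
    "Node Type": "Nested Loop",
    "Actual Total Time": 8040.1,
    "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "orders",
         "Actual Total Time": 8023.456, "Plan Rows": 1000000, "Actual Rows": 1000000},
        {"Node Type": "Index Scan", "Relation Name": "users",
         "Actual Total Time": 15.234, "Plan Rows": 1, "Actual Rows": 1},
    ],
}

def find_problems(node, threshold_ms=100.0):
    """Recursively collect plan nodes worth a closer look."""
    problems = []
    # Rule 1: a sequential scan that is actually slow suggests a missing index.
    if node["Node Type"] == "Seq Scan" and node.get("Actual Total Time", 0) > threshold_ms:
        problems.append(f"slow Seq Scan on {node.get('Relation Name', '?')}")
    # Rule 2: estimated vs actual rows off by >10x suggests stale statistics.
    est, act = node.get("Plan Rows"), node.get("Actual Rows")
    if est and act and max(est, act) > 10 * max(min(est, act), 1):
        problems.append(f"row estimate off on {node['Node Type']}: {est} vs {act}")
    for child in node.get("Plans", []):
        problems.extend(find_problems(child, threshold_ms))
    return problems

print(find_problems(sample_plan))  # ['slow Seq Scan on orders']
```

The two rules mirror the manual checks above: Seq Scan plus high actual time points at indexing, while an estimate/actual mismatch points at updating statistics.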

4. Debugging Process

```yaml
Steps:

1. Identify Slow Query
  - Enable slow query logging
  - Run workload
  - Review slow log
  - Note execution time

2. Analyze with EXPLAIN
  - Run EXPLAIN ANALYZE
  - Look for Seq Scan
  - Check estimated vs actual rows
  - Review join methods

3. Find Root Cause
  - Missing index?
  - Inefficient join?
  - Missing WHERE clause?
  - Outdated statistics?

4. Try Fix
  - Add index
  - Rewrite query
  - Update statistics
  - Archive old data

5. Measure Improvement
  - Run query after fix
  - Compare execution time
  - Before: 5000ms
  - After: 100ms (50x faster!)

6. Monitor
  - Track slow queries
  - Set baseline
  - Alert on regression
  - Periodic review

---

Checklist:

[ ] Slow query identified and logged
[ ] EXPLAIN ANALYZE run
[ ] Estimated vs actual rows analyzed
[ ] Seq Scans identified
[ ] Indexes checked
[ ] Join strategy reviewed
[ ] Statistics updated
[ ] Query rewritten if needed
[ ] Index created if needed
[ ] Fix verified
[ ] Performance baseline established
[ ] Monitoring configured
[ ] Documented for team
```
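
Step 1 (identify) and step 6 (monitor) can also be approximated in application code when database-level logging is unavailable; this minimal sketch times queries and records those over a threshold (`timed_query`, `SLOW_QUERY_THRESHOLD`, and the in-memory `slow_log` list are hypothetical names):

```python
import time
from contextlib import contextmanager

SLOW_QUERY_THRESHOLD = 0.5  # seconds, mirroring long_query_time = 0.5 above

slow_log = []  # in a real system this would feed a logger or metrics backend

@contextmanager
def timed_query(sql):
    """Time a query and record it when it exceeds the threshold."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed >= SLOW_QUERY_THRESHOLD:
            slow_log.append((sql, elapsed))

# Usage: wrap a (simulated) slow database call.
with timed_query("SELECT * FROM orders WHERE user_id = 123"):
    time.sleep(0.6)  # stand-in for a slow query

print(slow_log)  # one entry: the statement plus its elapsed time
```

Recording the statement text alongside the elapsed time gives you exactly the inputs the debugging process above needs: which query to EXPLAIN, and a baseline to compare against after the fix.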

Key Points

  • Enable slow query logging in production
  • Investigate with EXPLAIN ANALYZE
  • A Seq Scan in the plan usually means a missing index
  • Index the columns used in WHERE and JOIN clauses
  • Monitor query statistics
  • Update table statistics regularly
  • Rewrite queries to avoid known inefficiencies
  • Paginate large result sets
  • Measure before and after every optimization
  • Track slow query trends over time