# Query Optimization

Analyze database queries and provide actionable optimization recommendations.
## Workflow

Copy and track progress:

```
Query Optimization:
- [ ] Identify database type
- [ ] Analyze query structure
- [ ] Check for red flags
- [ ] Provide recommendations
- [ ] Suggest verification
```

- Identify database type from syntax or user context
- Analyze query structure - parse SELECT, FROM, WHERE, JOIN, ORDER BY, GROUP BY clauses
- Check for red flags (see Quick Reference)
- Provide concrete recommendations with copy-pasteable code
- Suggest verification - EXPLAIN command or metric to confirm

For complex queries (3+ JOINs or nested subqueries):
- Decompose into logical units
- Analyze each unit's access pattern
- Synthesize recommendations in priority order
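The first workflow step (identify database type from syntax) can be sketched as a simple hint check. This is an illustrative sketch only - the patterns and the `guess_database` helper are assumptions for the demo, not exhaustive dialect detection:

```python
import re

# Illustrative telltale-syntax hints; real detection should also use user context.
HINTS = [
    (r"\bdb\.\w+\.(find|aggregate)\b", "mongodb"),
    (r"::\w+|\bILIKE\b|\bRETURNING\b", "postgresql"),
    (r"`\w+`|\bLIMIT \d+, ?\d+", "mysql"),
]

def guess_database(query: str) -> str:
    """Return the first dialect whose telltale syntax appears, else 'unknown'."""
    for pattern, name in HINTS:
        if re.search(pattern, query):
            return name
    return "unknown"

dialect = guess_database("SELECT * FROM users WHERE name ILIKE '%a%'")
```

When no hint matches, fall back to asking the user (see Handling Missing Information).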
## Analysis Checklist
For each query, evaluate:
- Index usage: Missing or inefficient indexes
- Scan type: Full scans on large tables
- Join efficiency: Join order, type, missing conditions
- Filtering: WHERE placement, function usage on columns
- Projection: SELECT * vs specific columns
- Pagination: OFFSET vs keyset/cursor
- N+1 detection: Loop patterns in code context
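The OFFSET-vs-keyset item in the checklist can be seen concretely. The sketch below uses an in-memory SQLite table (the `items` table and its contents are made up for the demo); both approaches return the same page, but OFFSET forces the engine to read and discard every skipped row:

```python
import sqlite3

# Hypothetical demo table with 100 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item{i}") for i in range(1, 101)])

# OFFSET pagination: scans and discards the 20 skipped rows.
page3_offset = conn.execute(
    "SELECT id, name FROM items ORDER BY id LIMIT 10 OFFSET 20").fetchall()

# Keyset pagination: seeks directly past the last id of the previous page.
last_seen_id = 20
page3_keyset = conn.execute(
    "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT 10",
    (last_seen_id,)).fetchall()
```

The cost difference is negligible at 100 rows but grows linearly with the OFFSET value on large tables.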
## Quick Reference

### SQL Red Flags

- `SELECT *` on large tables
- `WHERE function(column)` - prevents index use
- `LIKE '%pattern'` - leading wildcard prevents index use
- `OR` conditions spanning different columns
- Large `OFFSET` values
- `NOT IN` with nullable subquery
- Missing `LIMIT` on unbounded queries
### NoSQL Red Flags

- MongoDB: COLLSCAN, `$where`, unbounded arrays
- Redis: `KEYS *`, missing TTL, wrong data structure
- DynamoDB: Scan operations, hot partitions
- Elasticsearch: Deep pagination, wildcard prefix
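Several of the SQL red flags above can be screened mechanically. This is a rough illustrative scanner, not the bundled `analyze_query.py` - the regex list and `scan` helper are assumptions for the sketch:

```python
import re

# Rough patterns for a few of the red flags above (illustrative only).
RED_FLAGS = [
    (r"select\s+\*", "SELECT * - may fetch unnecessary columns"),
    (r"like\s+'%", "leading-wildcard LIKE - prevents index use"),
    (r"offset\s+\d{4,}", "large OFFSET - slow pagination"),
    (r"not\s+in\s*\(\s*select", "NOT IN with subquery - beware NULLs"),
]

def scan(sql: str) -> list[str]:
    """Return descriptions of the red flags present in the query text."""
    lowered = sql.lower()
    return [msg for pattern, msg in RED_FLAGS if re.search(pattern, lowered)]

hits = scan("SELECT * FROM orders WHERE email LIKE '%@example.com'")
```

Regex screening only flags candidates; EXPLAIN output is still needed to confirm actual impact.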
## Examples

**Example 1: Missing Index**

Input:
```sql
SELECT * FROM orders WHERE customer_email = 'user@example.com';
```

Output:

**Issue [HIGH]**

Full table scan on `orders` - no index on `customer_email`

**Impact**

Query time grows linearly with table size; causes lock contention

**Solution**

```sql
CREATE INDEX idx_orders_customer_email ON orders(customer_email);
```

**Verification**

```sql
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_email = 'user@example.com';
-- Should show "Index Scan" instead of "Seq Scan"
```
**Example 2: N+1 Query Pattern**

Input:
```python
users = db.query("SELECT * FROM users")
for user in users:
    orders = db.query(f"SELECT * FROM orders WHERE user_id = {user.id}")
```

Output:

**Issue [HIGH]**

N+1 query pattern - executes 1 + N queries instead of 1

**Impact**

100 users = 101 queries; network latency multiplied

**Solution**

```sql
SELECT u.*, o.* FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
```

**Verification**

Check query count before/after; should reduce from N+1 to 1
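The fix can be exercised end to end with an in-memory SQLite database. The schema and rows below are made up for the demo; the point is that one JOIN round trip replaces the per-user loop:

```python
import sqlite3

# Hypothetical demo schema and data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 9.99), (11, 1, 5.00), (12, 2, 3.50);
""")

# One round trip instead of 1 + N:
rows = conn.execute("""
    SELECT u.id, u.name, o.id, o.total
    FROM users u LEFT JOIN orders o ON u.id = o.user_id
    ORDER BY u.id, o.id
""").fetchall()

# Regroup in application code, as an ORM's eager loading would.
orders_by_user = {}
for user_id, name, order_id, total in rows:
    orders_by_user.setdefault(user_id, []).append(order_id)
```

An alternative when a JOIN is awkward is batching with `WHERE user_id IN (...)`, which reduces N+1 to 2 queries.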
**Example 3: When NOT to Optimize**

Input:
```sql
SELECT * FROM config WHERE key = 'app_version';
```

Output:

**No Issues Detected**

This query on a small lookup table (typically <100 rows) is acceptable.
Adding an index would add write overhead without meaningful read benefit.

**When to Reconsider**

- If table grows beyond 10,000 rows
- If query is called >100 times/second
**Example 4: Incomplete Information**

Input:
```sql
SELECT * FROM large_table WHERE status = ?;
```

Output:

**Issue [MEDIUM]**

Preliminary analysis without schema information

**Potential Issues**

- `SELECT *` - may fetch unnecessary columns
- `status` column - if low cardinality (e.g., 5 values), index may not help

**Information Needed**

- Table row count
- Cardinality of `status` column
- Existing indexes: `SHOW INDEX FROM large_table;`
**Example 5: MongoDB COLLSCAN**

Input:
```javascript
db.orders.find({ customer_email: "user@example.com" })
```

Output:

**Issue [HIGH]**

COLLSCAN on `orders` - no index on `customer_email`

**Impact**

Query scans entire collection; O(n) performance

**Solution**

```javascript
db.orders.createIndex({ customer_email: 1 })
```

**Verification**

```javascript
db.orders.find({ customer_email: "..." }).explain("executionStats")
// Should show "stage": "IXSCAN" instead of "COLLSCAN"
```
## Response Format

Structure recommendations as:

**Issue [HIGH/MEDIUM/LOW]**

[Specific problem identified]

**Impact**

[Performance/resource cost]

**Solution**

[Concrete fix with code]

**Verification**

[EXPLAIN command or metric to confirm]
## Confidence Levels

When uncertainty exists, indicate confidence:

- HIGH: Pattern clearly matches known anti-pattern
- MEDIUM: Likely issue, depends on data distribution/schema
- LOW: Potential issue, needs EXPLAIN to confirm
## Response Constraints

- Max 5 recommendations per query - prioritize by impact
- Order by severity: HIGH → MEDIUM → LOW
- One primary fix per issue - mention alternatives briefly
- Code must be copy-pasteable - no placeholders like `<table_name>`
- Verification commands must be complete - include all flags
## Prioritization Matrix
| Issue Type | Impact | Priority |
|---|---|---|
| Missing JOIN index | Query blocks | P0 |
| Full table scan (>100k rows) | Slow response | P0 |
| N+1 pattern | Latency × N | P1 |
| SELECT * | Memory/bandwidth | P1 |
| Missing LIMIT | Resource exhaustion | P1 |
| Large OFFSET | Slow pagination | P2 |
| Suboptimal ORDER BY | Sort overhead | P2 |
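The matrix can drive mechanical ordering of findings. A minimal sketch - the issue identifiers and `prioritize` helper are hypothetical, not part of the bundled scripts:

```python
# Hypothetical issue identifiers keyed to the matrix rows above.
PRIORITY = {
    "missing_join_index": 0,
    "full_table_scan": 0,
    "n_plus_one": 1,
    "select_star": 1,
    "missing_limit": 1,
    "large_offset": 2,
    "suboptimal_order_by": 2,
}

def prioritize(issues: list[str]) -> list[str]:
    """Sort P0 issues first; unrecognized issues sink to the end."""
    return sorted(issues, key=lambda name: PRIORITY.get(name, 99))

ordered = prioritize(["select_star", "large_offset", "full_table_scan"])
```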
## Reference Files

Consult these for detailed patterns (use grep to find specific sections):

- `references/sql-patterns.md` - SQL optimization
  - Grep: `## Index`, `## Query Rewrite`, `## Join`, `## Aggregation`
- `references/nosql-patterns.md` - MongoDB, Redis, DynamoDB, Elasticsearch
  - Grep: `## MongoDB`, `## Redis`, `## DynamoDB`, `## Elasticsearch`
- `references/explain-analysis.md` - Execution plans
  - Grep: `## PostgreSQL`, `## MySQL`, `## MongoDB`
- `references/common-scenarios.md` - Common patterns
  - Grep: `## Slow pagination`, `## N+1`, `## Dashboard`, `## Bulk`
- `references/orm-patterns.md` - ORM fixes
  - Grep: `### Django`, `### SQLAlchemy`, `### ActiveRecord`, `### Prisma`
## Utility Scripts

All scripts use Python standard library only (no pip install required).
Run these for automated analysis:
### Query Analysis

`analyze_query.py` - Detect anti-patterns and score complexity:

```bash
python scripts/analyze_query.py "SELECT * FROM orders WHERE ..."
python scripts/analyze_query.py --file query.sql --json
```

`suggest_index.py` - Generate index recommendations:

```bash
python scripts/suggest_index.py "SELECT ... FROM orders WHERE status = 'pending'"
```

### EXPLAIN & Logs
`parse_explain.py` - Analyze EXPLAIN output:

```bash
python scripts/parse_explain.py --pg explain_output.txt
python scripts/parse_explain.py --mysql explain_output.txt
```

`diff_explain.py` - Compare before/after EXPLAIN outputs:

```bash
python scripts/diff_explain.py --pg before.txt after.txt
python scripts/diff_explain.py --mysql before.txt after.txt --json
```

`parse_slow_log.py` - Analyze slow query logs:

```bash
python scripts/parse_slow_log.py --pg /var/log/postgresql.log --top 20
python scripts/parse_slow_log.py --mysql slow-query.log --json
```

### Schema & Code
`analyze_schema.py` - Find schema optimization opportunities:

```bash
python scripts/analyze_schema.py schema.sql
```

`detect_orm_issues.py` - Find ORM anti-patterns in code:

```bash
python scripts/detect_orm_issues.py app.py
python scripts/detect_orm_issues.py --dir ./models
```

`check_migration.py` - Check migration safety:

```bash
python scripts/check_migration.py --pg migration.sql
python scripts/check_migration.py --mysql migration.sql
```

### Workflow with Scripts
For query optimization:

- `analyze_query.py` → detect issues
- `suggest_index.py` → generate indexes
- User runs EXPLAIN (before)
- User applies optimization
- User runs EXPLAIN (after)
- `diff_explain.py` → verify improvement

For verifying optimization impact:

```bash
# Save before state
psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT ..." > before.txt

# Apply optimization (add index, rewrite query, etc.)

# Save after state
psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT ..." > after.txt

# Compare
python scripts/diff_explain.py --pg before.txt after.txt
```
**For codebase audit:**
1. `detect_orm_issues.py --dir ./` → find N+1 patterns
2. `analyze_schema.py` → check schema
3. Prioritize fixes by severity
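N+1 detection in a codebase can be approximated statically. The sketch below is a simplified stand-in - the real `detect_orm_issues.py` may use different heuristics - flagging any `.query(...)` call nested inside a for-loop:

```python
import ast

# Example source exhibiting the N+1 pattern from Example 2.
SOURCE = """
users = db.query("SELECT * FROM users")
for user in users:
    orders = db.query(f"SELECT * FROM orders WHERE user_id = {user.id}")
"""

def find_queries_in_loops(source: str) -> list[int]:
    """Return line numbers of .query(...) calls inside for-loops."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.For):
            for inner in ast.walk(node):
                if (isinstance(inner, ast.Call)
                        and isinstance(inner.func, ast.Attribute)
                        and inner.func.attr == "query"):
                    hits.append(inner.lineno)
    return hits

flagged = find_queries_in_loops(SOURCE)
```

A heuristic like this produces candidates to review, not confirmed issues - queries inside loops are sometimes intentional (e.g., batched writes).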
**For migration review:**
1. `check_migration.py` → safety check
2. Fix critical issues before deployment
3. Use safe alternatives (CONCURRENTLY, etc.)

## EXPLAIN Commands
Provide appropriate commands:

```sql
-- PostgreSQL
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;

-- MySQL
EXPLAIN ANALYZE SELECT ...;

-- SQLite
EXPLAIN QUERY PLAN SELECT ...;
```

```javascript
// MongoDB
db.collection.find({...}).explain("executionStats")
```

## Index Recommendation Format
When suggesting indexes:

```sql
-- PostgreSQL/MySQL
CREATE INDEX [CONCURRENTLY] idx_table_columns
ON table(col1, col2, col3);
-- Rationale: [why this column order, covering considerations]
```

```javascript
// MongoDB (ESR: Equality, Sort, Range)
db.collection.createIndex({ equality_field: 1, sort_field: -1, range_field: 1 })
```

## Handling Missing Information
When user provides incomplete context:

**No schema provided:**

"To give precise recommendations, I need the table schema. Could you share:
- Column data types
- Existing indexes
- Approximate row counts

Meanwhile, here's what I can infer from the query..."

**No EXPLAIN output:**

"Run this command and share the output:

```sql
EXPLAIN (ANALYZE, BUFFERS) <their query>;
```

This will show actual vs estimated rows and scan types."

**Unknown database type:**

"Which database are you using? (PostgreSQL, MySQL, SQLite, MongoDB, etc.) The optimization strategies differ significantly between them."
## Conditional Workflows

**Analyzing a query:**
- Run `analyze_query.py` first
- Check for anti-patterns
- Suggest fixes with examples

**Interpreting EXPLAIN output:**
- Run `parse_explain.py` with appropriate flag
- Identify warnings
- Correlate with query structure

**Full optimization request:**
- Analyze query → identify issues
- Suggest indexes → generate CREATE statements
- Request EXPLAIN → interpret results
- Provide before/after comparison

**Complex optimization (3+ issues or architectural decisions):**

Use extended reasoning to:
- Map full query execution path
- Identify all optimization opportunities
- Evaluate trade-offs between approaches
- Prioritize by impact and implementation cost
## Script Error Handling

If scripts fail:
- Check Python version (requires 3.7+)
- Verify input format (SQL string, valid file path)
- Fall back to manual analysis using reference files

Common errors:
- `FileNotFoundError` → Check file path, use absolute paths
- `json.JSONDecodeError` → Input not valid JSON, check --pg/--mysql flags
- Empty output → Query may be valid, no issues detected

When scripts are unavailable:
Perform manual analysis using Quick Reference red flags and reference files.
## Unsupported Databases

For databases not explicitly covered (Oracle, SQL Server, CockroachDB, etc.):
- Apply general SQL patterns from `references/sql-patterns.md`
- Note: "Optimization patterns based on PostgreSQL/MySQL; verify syntax for [database]"
- Recommend user consult database-specific documentation for EXPLAIN syntax and index creation