sql-patterns
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseSQL Patterns
SQL 模式
When to Use
适用场景
When the user asks to:
- Write or optimize SQL queries (joins, subqueries, CTEs, window functions)
- Design or review a database schema
- Analyze query performance using EXPLAIN plans
- Choose indexing strategies for a workload
- Fix slow queries or identify anti-patterns
- Implement pagination, deduplication, or analytical queries
当用户要求执行以下操作时使用:
- 编写或优化SQL查询(连接、子查询、CTE、窗口函数)
- 设计或审核数据库 schema
- 使用 EXPLAIN 执行计划分析查询性能
- 为工作负载选择索引策略
- 修复慢查询或识别反模式
- 实现分页、去重或分析类查询
Instructions
使用指南
1. Query Construction
1. 查询构建
Build queries incrementally, starting with the core logic before adding complexity.
Joins - choose the right type for the data relationship:
- — only matching rows in both tables (default, most common)
INNER JOIN - — all rows from left table, NULLs where right has no match
LEFT JOIN - — all rows from right table (rare; rewrite as LEFT JOIN for clarity)
RIGHT JOIN - — all rows from both, NULLs on the unmatched side
FULL OUTER JOIN - — cartesian product (use intentionally for generating combinations)
CROSS JOIN
CTEs (WITH clauses) — prefer CTEs over nested subqueries for readability:
sql
WITH active_users AS (
SELECT user_id, MAX(login_at) AS last_login
FROM logins
WHERE login_at > CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
),
user_orders AS (
SELECT user_id, COUNT(*) AS order_count, SUM(total) AS revenue
FROM orders
GROUP BY user_id
)
SELECT u.name, au.last_login, uo.order_count, uo.revenue
FROM users u
JOIN active_users au USING (user_id)
LEFT JOIN user_orders uo USING (user_id);Window functions — use for rankings, running totals, and row comparisons:
sql
SELECT
department,
employee,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
salary - LAG(salary) OVER (PARTITION BY department ORDER BY salary) AS diff_from_prev,
SUM(salary) OVER (PARTITION BY department) AS dept_total
FROM employees;Aggregations — remember the evaluation order: filters rows, groups them, filters groups, window functions run after .
WHEREGROUP BYHAVINGHAVING增量构建查询,先实现核心逻辑再增加复杂度。
连接(Joins) - 选择适合数据关系的连接类型:
- — 仅返回两张表中匹配的行(默认选项,最常用)
INNER JOIN - — 返回左表的所有行,右表无匹配的位置返回NULL
LEFT JOIN - — 返回右表的所有行(很少使用,为了可读性建议重写为LEFT JOIN)
RIGHT JOIN - — 返回两张表的所有行,未匹配的一侧返回NULL
FULL OUTER JOIN - — 笛卡尔积(仅在有意生成组合场景下使用)
CROSS JOIN
CTE(WITH 子句) — 相比嵌套子查询,优先使用CTE提升可读性:
sql
WITH active_users AS (
SELECT user_id, MAX(login_at) AS last_login
FROM logins
WHERE login_at > CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
),
user_orders AS (
SELECT user_id, COUNT(*) AS order_count, SUM(total) AS revenue
FROM orders
GROUP BY user_id
)
SELECT u.name, au.last_login, uo.order_count, uo.revenue
FROM users u
JOIN active_users au USING (user_id)
LEFT JOIN user_orders uo USING (user_id);窗口函数(Window functions) — 用于排名、滚动求和以及行对比:
sql
SELECT
department,
employee,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
salary - LAG(salary) OVER (PARTITION BY department ORDER BY salary) AS diff_from_prev,
SUM(salary) OVER (PARTITION BY department) AS dept_total
FROM employees;聚合操作 — 记住执行顺序: 过滤行, 分组, 过滤分组,窗口函数在之后执行。
WHEREGROUP BYHAVINGHAVING2. Query Optimization
2. 查询优化
When optimizing, follow this process:
- Read the EXPLAIN plan — look for sequential scans on large tables, nested loops on large result sets, sort operations without indexes
- Check indexes — ensure WHERE, JOIN, and ORDER BY columns are indexed
- Reduce data early — filter in WHERE before joining; join on indexed columns
- Avoid anti-patterns:
- in production queries (fetches unnecessary columns)
SELECT * - Functions on indexed columns in WHERE (defeats the index; use range instead)
WHERE YEAR(created_at) = 2025 - with NULLable columns (use
NOT INinstead)NOT EXISTS - Correlated subqueries that execute per-row (rewrite as JOIN or CTE)
- Missing LIMIT on exploratory queries
- Implicit type conversions that prevent index use
优化时遵循以下流程:
- 阅读EXPLAIN执行计划 — 查找大表的全表扫描、大结果集上的嵌套循环、无索引的排序操作
- 检查索引 — 确保WHERE、JOIN和ORDER BY涉及的列都已建立索引
- 尽早缩减数据量 — 连接前先通过WHERE过滤;使用有索引的列做连接条件
- 避免反模式:
- 生产环境查询使用(会拉取不必要的列)
SELECT * - WHERE子句中对索引列使用函数(会导致索引失效,改用范围查询替代)
WHERE YEAR(created_at) = 2025 - 对可空列使用(改用
NOT IN替代)NOT EXISTS - 每行都会执行的关联子查询(重写为JOIN或CTE)
- 探索性查询缺少LIMIT限制
- 导致索引失效的隐式类型转换
- 生产环境查询使用
3. Indexing Strategy
3. 索引策略
Choose index types based on query patterns:
- B-tree (default) — equality, range, ORDER BY, LIKE 'prefix%'
- Hash — equality only, smaller than B-tree (Postgres-specific)
- GIN — full-text search, JSONB containment, array operations
- GiST — geometric data, range types, nearest-neighbor
- Composite indexes — column order matters; leftmost prefix rule applies
- Partial indexes — for queries that always filter on that condition
WHERE is_active = true - Covering indexes — columns to enable index-only scans
INCLUDE
根据查询模式选择索引类型:
- B-tree(默认) — 等值、范围、ORDER BY、LIKE '前缀%'场景
- Hash — 仅支持等值查询,比B-tree占用空间更小(Postgres特有)
- GIN — 全文搜索、JSONB包含查询、数组操作
- GiST — 几何数据、范围类型、最近邻查询
- 复合索引 — 列顺序很重要,遵循最左前缀原则
- 部分索引 — 针对总是筛选特定条件的查询,例如
WHERE is_active = true - 覆盖索引 — 通过列实现仅索引扫描
INCLUDE
4. Transactions and Concurrency
4. 事务和并发
- Use explicit transactions for multi-statement operations
- Choose the right isolation level: READ COMMITTED (default, good for most), REPEATABLE READ (consistent snapshots), SERIALIZABLE (full isolation, retry on conflict)
- Keep transactions short — avoid user interaction inside a transaction
- Use when reading rows you intend to modify
SELECT ... FOR UPDATE - Handle deadlocks with retry logic, not just longer timeouts
- 多语句操作使用显式事务
- 选择合适的隔离级别:READ COMMITTED(默认,适合大多数场景)、REPEATABLE READ(一致性快照)、SERIALIZABLE(完全隔离,冲突时需重试)
- 保持事务简短 —— 避免在事务内部进行用户交互
- 当你读取的行后续需要修改时使用
SELECT ... FOR UPDATE - 通过重试逻辑处理死锁,而不是仅延长超时时间
5. Schema Design Basics
5. Schema设计基础
Refer to for detailed patterns. Key principles:
references/schema-design.md- Normalize to 3NF by default, denormalize intentionally with a documented reason
- Use surrogate keys (auto-increment or UUID) as primary keys
- Add and
created_attimestamps to all tablesupdated_at - Use foreign keys to enforce referential integrity
- Name constraints explicitly for readable error messages
See for common query recipes.
references/query-patterns.md详细模式参考,核心原则:
references/schema-design.md- 默认规范化到3NF,有意进行反规范化时需要记录原因
- 使用代理键(自增或UUID)作为主键
- 为所有表添加和
created_at时间戳updated_at - 使用外键强制参照完整性
- 显式命名约束以获得可读性更强的错误信息
常见查询示例参考。
references/query-patterns.mdExamples
示例
User: "Write a query to find the top 3 products per category by revenue"
Agent: Uses a CTE with window function partitioned by category, ordered by revenue descending, then filters to . Includes the JOIN to get category and product names, and suggests an index on .
ROW_NUMBER()row_num <= 3(category_id, revenue DESC)User: "This query is slow, can you optimize it?"
Agent: Runs on the query, identifies a sequential scan on a 2M-row table due to a function call on an indexed column in the WHERE clause. Rewrites the condition to use a range comparison, confirms the index is now used, and shows the before/after execution times.
EXPLAIN ANALYZEUser: "How should I index this table for our read patterns?"
Agent: Analyzes the common query patterns (filter by status + date range, search by email, sort by created_at), recommends a composite index , a unique index on , and a partial index for since 80% of queries filter on that. Explains the trade-off with write performance.
(status, created_at)emailWHERE status = 'pending'用户: "编写一个查询,找出每个类别中收入排名前三的产品"
Agent: 使用包含按类别分区、按收入降序排序的窗口函数的CTE,然后筛选的结果。包含获取类别和产品名称的JOIN语句,并建议建立索引。
ROW_NUMBER()row_num <= 3(category_id, revenue DESC)用户: "这个查询很慢,你能优化它吗?"
Agent: 对查询执行,识别出因为WHERE子句中对索引列使用了函数调用,导致200万行的表发生全表扫描。重写条件使用范围比较,确认索引已正常使用,并展示优化前后的执行时间。
EXPLAIN ANALYZE用户: "针对我们的读模式,我该如何为这张表建索引?"
Agent: 分析常见查询模式(按状态+日期范围过滤、按邮箱搜索、按created_at排序),推荐建立复合索引、email上的唯一索引,以及的部分索引,因为80%的查询都会筛选这个条件。同时说明该方案对写入性能的权衡。
(status, created_at)WHERE status = 'pending'