data-sql-optimization
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseSQL Optimization — Comprehensive Reference
SQL优化——综合参考指南
This skill provides actionable checklists, patterns, and templates for transactional (OLTP) SQL optimization: measurement-first triage, EXPLAIN/plan interpretation, balanced indexing (avoiding over-indexing), performance monitoring, schema evolution, migrations, backup/recovery, high availability, and security.
Supported Platforms: PostgreSQL, MySQL, SQL Server, Oracle, SQLite
For OLAP/Analytics: See data-lake-platform (ClickHouse, DuckDB, Doris, StarRocks)
本技能为事务型(OLTP)SQL优化提供可落地的检查清单、模式及模板:基于度量的优先级划分、EXPLAIN/执行计划解读、均衡索引设计(避免过度索引)、性能监控、Schema演进、数据库迁移、备份/恢复、高可用配置及安全防护。
支持的数据库平台: PostgreSQL、MySQL、SQL Server、Oracle、SQLite
面向OLAP/分析型场景: 请查看data-lake-platform(ClickHouse、DuckDB、Doris、StarRocks)
Quick Reference
快速参考
| Task | Tool/Framework | Command | When to Use |
|---|---|---|---|
| Query Performance Analysis | EXPLAIN ANALYZE | | Diagnose slow queries, identify missing indexes |
| Find Slow Queries | pg_stat_statements / slow query log | | Identify performance bottlenecks in production |
| Index Analysis | pg_stat_user_indexes / SHOW INDEX | | Find unused indexes, validate index coverage |
| Schema Migration | Flyway / Liquibase | | Version-controlled database changes |
| Backup & Recovery | pg_dump / mysqldump | | Point-in-time recovery, disaster recovery |
| Replication Setup | Streaming / GTID | Configure postgresql.conf / my.cnf | High availability, read scaling |
| Safe Tuning Loop | Measure -> Explain -> Change -> Verify | Use tuning worksheet template | Reduce latency/cost without regressions |
| 任务 | 工具/框架 | 命令 | 适用场景 |
|---|---|---|---|
| 查询性能分析 | EXPLAIN ANALYZE | | 诊断慢查询,识别缺失的索引 |
| 定位慢查询 | pg_stat_statements / 慢查询日志 | | 识别生产环境中的性能瓶颈 |
| 索引分析 | pg_stat_user_indexes / SHOW INDEX | | 发现未使用的索引,验证索引覆盖情况 |
| Schema迁移 | Flyway / Liquibase | | 版本化控制的数据库变更 |
| 备份与恢复 | pg_dump / mysqldump | | 时点恢复、灾难恢复 |
| 复制配置 | 流复制 / GTID | 配置postgresql.conf / my.cnf | 高可用配置、读扩展 |
| 安全调优循环 | 度量→解读→变更→验证 | 使用调优工作表模板 | 在不引入回归问题的前提下降低延迟/成本 |
Decision Tree: Choosing the Right Approach
决策树:选择合适的优化方案
text
Query performance issue?
├─ Identify slow queries first?
│ ├─ PostgreSQL -> pg_stat_statements (top queries by total_exec_time)
│ └─ MySQL -> Performance Schema / slow query log
│
├─ Analyze execution plan?
│ ├─ PostgreSQL -> EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
│ ├─ MySQL -> EXPLAIN FORMAT=JSON or EXPLAIN ANALYZE
│ └─ SQL Server -> SET STATISTICS IO ON; SET STATISTICS TIME ON;
│
├─ Need indexing strategy?
│ ├─ PostgreSQL -> B-tree (default), GIN (JSONB), GiST (spatial), partial indexes
│ ├─ MySQL -> BTREE (default), FULLTEXT (text search), SPATIAL
│ └─ Check: Table >10k rows AND selectivity <10% AND 10x+ speedup verified
│
├─ Schema changes needed?
│ ├─ New database -> template-schema-design.md
│ ├─ Modify schema -> template-migration.md (Flyway/Liquibase)
│ └─ Large tables (MySQL) -> gh-ost / pt-online-schema-change (avoid locks)
│
├─ High availability setup?
│ ├─ PostgreSQL -> Streaming replication (template-replication-ha.md)
│ └─ MySQL -> GTID-based replication (template-replication-ha.md)
│
├─ Backup/disaster recovery?
│ └─ template-backup-restore.md (pg_dump, mysqldump, PITR)
│
└─ Analytics on large datasets (OLAP)?
└─ See data-lake-platform (ClickHouse, DuckDB, Doris, StarRocks)text
遇到查询性能问题?
├─ 先定位慢查询?
│ ├─ PostgreSQL -> pg_stat_statements(按total_exec_time排序的Top查询)
│ └─ MySQL -> Performance Schema / 慢查询日志
│
├─ 分析执行计划?
│ ├─ PostgreSQL -> EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
│ ├─ MySQL -> EXPLAIN FORMAT=JSON 或 EXPLAIN ANALYZE
│ └─ SQL Server -> SET STATISTICS IO ON; SET STATISTICS TIME ON;
│
├─ 需要索引策略?
│ ├─ PostgreSQL -> B-tree(默认)、GIN(JSONB)、GiST(空间数据)、部分索引
│ ├─ MySQL -> BTREE(默认)、FULLTEXT(文本搜索)、SPATIAL
│ └─ 检查条件:表行数>10k 且 选择性<10% 且 已验证可获得10倍以上性能提升
│
├─ 需要修改Schema?
│ ├─ 新数据库 -> template-schema-design.md
│ ├─ 修改现有Schema -> template-migration.md(基于Flyway/Liquibase)
│ └─ 大表(MySQL)-> gh-ost / pt-online-schema-change(避免锁表)
│
├─ 需要配置高可用?
│ ├─ PostgreSQL -> 流复制(template-replication-ha.md)
│ └─ MySQL -> 基于GTID的复制(template-replication-ha.md)
│
├─ 需要备份/灾难恢复?
│ └─ template-backup-restore.md(pg_dump、mysqldump、时点恢复)
│
└─ 针对大数据集的分析场景(OLAP)?
└─ 请查看data-lake-platform(ClickHouse、DuckDB、Doris、StarRocks)When to Use This Skill
本技能的适用场景
Codex should invoke this skill when users ask for:
当用户提出以下需求时,Codex应调用本技能:
Query Optimization (Modern Approaches)
查询优化(现代方案)
- SQL query performance review and tuning
- EXPLAIN/plan interpretation with optimization suggestions
- Index creation strategies with balanced approach (avoiding over-indexing)
- Troubleshooting slow queries using pg_stat_statements or Performance Schema
- Identifying and remediating SQL anti-patterns with operational fixes
- Query rewrite suggestions or migration from slow to fast patterns
- Statistics maintenance and auto-analyze configuration
- SQL查询性能评审与调优
- 结合优化建议解读EXPLAIN/执行计划
- 采用均衡策略的索引创建方案(避免过度索引)
- 利用pg_stat_statements或Performance Schema排查慢查询
- 识别并修复SQL反模式,提供可落地的解决方案
- 查询重写建议,或从慢查询模式迁移至高效模式
- 统计信息维护与自动分析配置
Database Operations
数据库运维
- Schema design with normalization and performance trade-offs
- Database migrations with version control (Liquibase, Flyway)
- Backup and recovery strategies (point-in-time recovery, automated testing)
- High availability and replication setup (streaming, GTID-based)
- Database security auditing (access controls, encryption, SQL injection prevention)
- Lock analysis and deadlock troubleshooting
- Connection pooling (pgBouncer, Pgpool-II, ProxySQL)
- 兼顾规范化与性能权衡的Schema设计
- 基于版本控制的数据库迁移(Liquibase、Flyway)
- 备份与恢复策略(时点恢复、自动化测试)
- 高可用与复制配置(流复制、基于GTID的复制)
- 数据库安全审计(访问控制、加密、SQL注入防护)
- 锁分析与死锁排查
- 连接池配置(pgBouncer、Pgpool-II、ProxySQL)
Performance Tuning (Modern Standards)
性能调优(现代标准)
- Memory configuration (work_mem, shared_buffers, effective_cache_size)
- Automated monitoring with pg_stat_statements and query pattern analysis
- Index health monitoring (unused index detection, index bloat analysis)
- Vacuum strategy and autovacuum tuning (PostgreSQL)
- InnoDB buffer pool optimization (MySQL)
- Partition pruning improvements (PostgreSQL 18+)
- 内存配置(work_mem、shared_buffers、effective_cache_size)
- 基于pg_stat_statements的自动化监控与查询模式分析
- 索引健康监控(未使用索引检测、索引膨胀分析)
- PostgreSQL的Vacuum策略与自动Vacuum调优
- MySQL的InnoDB缓冲池优化
- PostgreSQL 18+的分区裁剪优化
Resources (Best Practices Guides)
参考资源(最佳实践指南)
Find detailed operational patterns and quick references in:
- SQL Best Practices: references/sql-best-practices.md
- Query Tuning Patterns: references/query-tuning-patterns.md
- Indexing Strategies: references/index-patterns.md
- EXPLAIN/Analysis: references/explain-analysis.md
- SQL Anti-Patterns: references/sql-antipatterns.md
- External Sources: data/sources.json — vendor docs and reference links
- Operational Standards: references/operational-patterns.md — Deep operational checklists, database-specific guidance, and template selection trees
Each file includes:
- Copy-paste ready checklists (e.g., "query review", "index design", "explain review")
- Anti-patterns with operational fixes and alternatives
- Query rewrite and indexing strategies with examples
- Troubleshooting guides (step-by-step)
可在以下文档中获取详细的运维模式与快速参考内容:
- SQL最佳实践:references/sql-best-practices.md
- 查询调优模式:references/query-tuning-patterns.md
- 索引策略:references/index-patterns.md
- EXPLAIN分析:references/explain-analysis.md
- SQL反模式:references/sql-antipatterns.md
- 外部资源:data/sources.json — 厂商文档与参考链接合集
- 运维标准:references/operational-patterns.md — 详细的运维检查清单、数据库专属指南及模板选择决策树
每份文档包含:
- 可直接复制使用的检查清单(如“查询评审”、“索引设计”、“EXPLAIN分析”)
- 附带可落地方案与替代方案的SQL反模式
- 带示例的查询重写与索引策略
- 分步式故障排查指南
Templates (Copy-Paste Ready)
模板(可直接复制使用)
Templates are organized by database technology for precision and clarity:
模板按数据库技术分类,确保精准性与清晰度:
Cross-Platform Templates (All Databases)
跨平台模板(所有数据库通用)
- assets/cross-platform/template-query-tuning.md - Universal query optimization
- assets/cross-platform/template-explain-analysis.md - Execution plan analysis
- assets/cross-platform/template-performance-tuning-worksheet.md - NEW 4-step tuning workflow (Measure -> Explain -> Change -> Verify)
- assets/cross-platform/template-index.md - Index design patterns
- assets/cross-platform/template-slow-query.md - Slow query triage
- assets/cross-platform/template-schema-design.md - Schema modeling
- assets/cross-platform/template-migration.md - Database migrations
- assets/cross-platform/template-backup-restore.md - Backup/DR planning
- assets/cross-platform/template-security-audit.md - Security review
- assets/cross-platform/template-diagnostics.md - Performance diagnostics
- assets/cross-platform/template-lock-analysis.md - Lock troubleshooting
- assets/cross-platform/template-query-tuning.md - 通用查询优化模板
- assets/cross-platform/template-explain-analysis.md - 执行计划分析模板
- assets/cross-platform/template-performance-tuning-worksheet.md - 新增 四步调优工作流(度量→解读→变更→验证)
- assets/cross-platform/template-index.md - 索引设计模式模板
- assets/cross-platform/template-slow-query.md - 慢查询排查模板
- assets/cross-platform/template-schema-design.md - Schema建模模板
- assets/cross-platform/template-migration.md - 数据库迁移模板
- assets/cross-platform/template-backup-restore.md - 备份/灾难恢复规划模板
- assets/cross-platform/template-security-audit.md - 安全审计模板
- assets/cross-platform/template-diagnostics.md - 性能诊断模板
- assets/cross-platform/template-lock-analysis.md - 锁故障排查模板
PostgreSQL Templates
PostgreSQL专属模板
- assets/postgres/template-pg-explain.md - PostgreSQL EXPLAIN analysis
- assets/postgres/template-pg-index.md - PostgreSQL indexing (B-tree, GIN, GiST)
- assets/postgres/template-replication-ha.md - Streaming replication & HA
- assets/postgres/template-pg-explain.md - PostgreSQL EXPLAIN分析模板
- assets/postgres/template-pg-index.md - PostgreSQL索引模板(B-tree、GIN、GiST)
- assets/postgres/template-replication-ha.md - 流复制与高可用模板
MySQL Templates
MySQL专属模板
- assets/mysql/template-mysql-explain.md - MySQL EXPLAIN analysis
- assets/mysql/template-mysql-index.md - MySQL/InnoDB indexing
- assets/mysql/template-replication-ha.md - MySQL replication & HA
- assets/mysql/template-mysql-explain.md - MySQL EXPLAIN分析模板
- assets/mysql/template-mysql-index.md - MySQL/InnoDB索引模板
- assets/mysql/template-replication-ha.md - MySQL复制与高可用模板
Microsoft SQL Server Templates
Microsoft SQL Server专属模板
- assets/mssql/template-mssql-explain.md - SQL Server EXPLAIN/SHOWPLAN analysis
- assets/mssql/template-mssql-index.md - SQL Server indexing and tuning
- assets/mssql/template-mssql-explain.md - SQL Server EXPLAIN/SHOWPLAN分析模板
- assets/mssql/template-mssql-index.md - SQL Server索引与调优模板
Oracle Templates
Oracle专属模板
- assets/oracle/template-oracle-explain.md - Oracle EXPLAIN plan review and tuning
- assets/oracle/template-oracle-explain.md - Oracle EXPLAIN计划评审与调优模板
SQLite Templates
SQLite专属模板
- assets/sqlite/template-sqlite-optimization.md - SQLite optimization and pragma guidance
- assets/sqlite/template-sqlite-optimization.md - SQLite优化与pragma配置指南模板
Related Skills
相关技能
Infrastructure & Operations:
- ../ops-devops-platform/SKILL.md — Infrastructure, backups, monitoring, and incident response
- ../qa-observability/SKILL.md — Performance monitoring, profiling, and metrics
- ../qa-debugging/SKILL.md — Production debugging patterns
Application Integration:
- ../software-backend/SKILL.md — API/database integration and application patterns
- ../software-architecture-design/SKILL.md — System design and data architecture
- ../dev-api-design/SKILL.md — REST API and database interaction patterns
Quality & Security:
- ../qa-resilience/SKILL.md — Resilience, circuit breakers, and failure handling
- ../software-security-appsec/SKILL.md — Database security, auth, SQL injection prevention
- ../qa-testing-strategy/SKILL.md — Database testing strategies
Data Engineering:
- ../ai-ml-data-science/SKILL.md — SQLMesh, dbt, data transformations
- ../ai-mlops/SKILL.md — Data pipelines, ETL, and warehouse loading (dlt)
- ../ai-ml-timeseries/SKILL.md — Time-series databases and forecasting
基础设施与运维:
- ../ops-devops-platform/SKILL.md — 基础设施、备份、监控与事件响应
- ../qa-observability/SKILL.md — 性能监控、性能剖析与指标分析
- ../qa-debugging/SKILL.md — 生产环境故障排查模式
应用集成:
- ../software-backend/SKILL.md — API/数据库集成与应用模式
- ../software-architecture-design/SKILL.md — 系统设计与数据架构
- ../dev-api-design/SKILL.md — REST API与数据库交互模式
质量与安全:
- ../qa-resilience/SKILL.md — 弹性设计、断路器与故障处理
- ../software-security-appsec/SKILL.md — 数据库安全、认证、SQL注入防护
- ../qa-testing-strategy/SKILL.md — 数据库测试策略
数据工程:
- ../ai-ml-data-science/SKILL.md — SQLMesh、dbt、数据转换
- ../ai-mlops/SKILL.md — 数据管道、ETL与数据仓库加载(dlt)
- ../ai-ml-timeseries/SKILL.md — 时序数据库与预测
Navigation
导航
Resources
- references/explain-analysis.md
- references/query-tuning-patterns.md
- references/operational-patterns.md
- references/sql-antipatterns.md
- references/index-patterns.md
- references/sql-best-practices.md
Templates
- assets/cross-platform/template-slow-query.md
- assets/cross-platform/template-backup-restore.md
- assets/cross-platform/template-schema-design.md
- assets/cross-platform/template-explain-analysis.md
- assets/cross-platform/template-performance-tuning-worksheet.md
- assets/cross-platform/template-security-audit.md
- assets/cross-platform/template-diagnostics.md
- assets/cross-platform/template-index.md
- assets/cross-platform/template-migration.md
- assets/cross-platform/template-lock-analysis.md
- assets/cross-platform/template-query-tuning.md
- assets/oracle/template-oracle-explain.md
- assets/sqlite/template-sqlite-optimization.md
- assets/postgres/template-pg-index.md
- assets/postgres/template-replication-ha.md
- assets/postgres/template-pg-explain.md
- assets/mysql/template-mysql-explain.md
- assets/mysql/template-mysql-index.md
- assets/mysql/template-replication-ha.md
- assets/mssql/template-mssql-index.md
- assets/mssql/template-mssql-explain.md
Data
- data/sources.json — Curated external references
参考资源
- references/explain-analysis.md
- references/query-tuning-patterns.md
- references/operational-patterns.md
- references/sql-antipatterns.md
- references/index-patterns.md
- references/sql-best-practices.md
模板
- assets/cross-platform/template-slow-query.md
- assets/cross-platform/template-backup-restore.md
- assets/cross-platform/template-schema-design.md
- assets/cross-platform/template-explain-analysis.md
- assets/cross-platform/template-performance-tuning-worksheet.md
- assets/cross-platform/template-security-audit.md
- assets/cross-platform/template-diagnostics.md
- assets/cross-platform/template-index.md
- assets/cross-platform/template-migration.md
- assets/cross-platform/template-lock-analysis.md
- assets/cross-platform/template-query-tuning.md
- assets/oracle/template-oracle-explain.md
- assets/sqlite/template-sqlite-optimization.md
- assets/postgres/template-pg-index.md
- assets/postgres/template-replication-ha.md
- assets/postgres/template-pg-explain.md
- assets/mysql/template-mysql-explain.md
- assets/mysql/template-mysql-index.md
- assets/mysql/template-replication-ha.md
- assets/mssql/template-mssql-index.md
- assets/mssql/template-mssql-explain.md
数据资源
- data/sources.json — 精选外部参考资源合集
Operational Deep Dives
运维深度解析
See references/operational-patterns.md for:
- End-to-end optimization checklists and anti-pattern fixes
- Database-specific quick references (PostgreSQL, MySQL, SQL Server, Oracle, SQLite)
- Slow query troubleshooting workflow and reliability drills
- Template selection decision tree and platform migration notes
可在references/operational-patterns.md中查看以下内容:
- 端到端的优化检查清单与反模式修复方案
- 数据库专属快速参考(PostgreSQL、MySQL、SQL Server、Oracle、SQLite)
- 慢查询故障排查工作流与可靠性演练
- 模板选择决策树与平台迁移注意事项
Do / Avoid
建议与禁忌
GOOD: Do
推荐做法
- Measure baseline before any optimization
- Change one variable at a time
- Verify results match after query changes
- Update statistics before concluding "needs index"
- Test with production-like data volumes
- Document all optimization decisions
- Include performance tests in CI/CD
- 在进行任何优化前先度量基准性能
- 每次仅变更一个变量
- 查询变更后验证结果一致性
- 在判断“需要添加索引”前先更新统计信息
- 使用类生产数据量进行测试
- 记录所有优化决策
- 在CI/CD中加入性能测试
BAD: Avoid
禁忌做法
- Adding indexes without checking if they'll be used
- Using SELECT * in production queries
- Optimizing for test data (use representative volumes)
- Ignoring write performance impact of indexes
- Skipping EXPLAIN analysis before changes
- Multiple simultaneous changes (can't attribute improvement)
- N+1 query patterns in application code
- 在未确认索引会被使用的情况下添加索引
- 在生产查询中使用SELECT *
- 针对测试数据进行优化(使用具有代表性的数据量)
- 忽略索引对写入性能的影响
- 在变更前跳过EXPLAIN分析
- 同时进行多项变更(无法确定性能提升的来源)
- 在应用代码中使用N+1查询模式
Anti-Patterns Quick Reference
SQL反模式快速参考
| Anti-Pattern | Problem | Fix |
|---|---|---|
| **SELECT *** | Reads unnecessary columns | Explicit column list |
| N+1 queries | Multiplied round trips | JOIN or batch fetch |
| Missing WHERE | Full table scan | Add predicates |
| Function on indexed column | Can't use index | Move function to RHS |
| Implicit type conversion | Index bypass | Match types explicitly |
| LIKE '%prefix' | Leading wildcard = scan | Full-text search |
| Unbounded result set | Memory explosion | Add LIMIT/pagination |
| OR conditions | Index may not be used | UNION or rewrite |
See references/sql-antipatterns.md for detailed fixes.
| 反模式 | 问题 | 修复方案 |
|---|---|---|
| **SELECT *** | 读取不必要的列 | 显式指定需要查询的列 |
| N+1查询 | 大幅增加数据库往返次数 | 使用JOIN或批量查询 |
| 缺失WHERE子句 | 全表扫描 | 添加过滤条件 |
| 在索引列上使用函数 | 无法利用索引 | 将函数移至等式右侧 |
| 隐式类型转换 | 跳过索引 | 显式匹配数据类型 |
| LIKE '%前缀' | 前导通配符导致全表扫描 | 使用全文搜索 |
| 无限制结果集 | 内存占用过高 | 添加LIMIT或分页 |
| OR条件 | 可能无法使用索引 | 使用UNION或重写查询 |
详细修复方案请查看references/sql-antipatterns.md。
OLTP vs OLAP Decision Tree
OLTP与OLAP决策树
text
Is your query for...?
├─ Point lookups (by ID/key)?
│ └─ OLTP database (this skill)
│ - Ensure proper indexes
│ - Use connection pooling
│ - Optimize for low latency
│
├─ Aggregations over recent data (dashboard)?
│ └─ OLTP database (this skill)
│ - Consider materialized views
│ - Index common filter columns
│ - Watch for lock contention
│
├─ Full table scans or historical analysis?
│ └─ OLAP database (data-lake-platform)
│ - ClickHouse, DuckDB, Doris
│ - Columnar storage
│ - Partitioning by date
│
└─ Mixed workload (both)?
└─ Separate OLTP and OLAP
- OLTP for transactions
- Replicate to OLAP for analytics
- Avoid running analytics on primarytext
你的查询用于?
├─ 点查询(按ID/键查询)?
│ └─ 使用OLTP数据库(本技能适用)
│ - 确保配置了合适的索引
│ - 使用连接池
│ - 针对低延迟进行优化
│
├─ 近期数据聚合分析(仪表盘)?
│ └─ 使用OLTP数据库(本技能适用)
│ - 考虑使用物化视图
│ - 为常用过滤列添加索引
│ - 注意锁竞争问题
│
├─ 全表扫描或历史数据分析?
│ └─ 使用OLAP数据库(data-lake-platform)
│ - ClickHouse、DuckDB、Doris
│ - 列式存储
│ - 按日期分区
│
└─ 混合负载(同时包含以上场景)?
└─ 分离OLTP与OLAP系统
- OLTP用于处理事务
- 将数据复制到OLAP系统进行分析
- 避免在主库上运行分析查询Optional: AI/Automation
可选:AI/自动化工具
Note: AI tools assist but require human validation of correctness.
- EXPLAIN summarization — Identify bottlenecks from complex plans
- Query rewrite suggestions — Must verify result equivalence
- Index recommendations — Check selectivity and write impact first
注意:AI工具仅作为辅助,优化结果需人工验证正确性。
- EXPLAIN摘要生成 — 从复杂执行计划中识别瓶颈
- 查询重写建议 — 必须验证结果的等价性
- 索引推荐 — 先检查选择性与写入性能影响
Bounded Claims
局限性说明
- AI cannot determine correct query results
- Automated index suggestions may miss workload context
- Human review required for production changes
- AI无法判断查询结果的正确性
- 自动化索引建议可能忽略业务负载上下文
- 生产环境变更必须经过人工评审
Analytical Databases (OLAP)
分析型数据库(OLAP)
For OLAP databases and data lake infrastructure, see data-lake-platform:
- Query engines: ClickHouse, DuckDB, Apache Doris, StarRocks
- Table formats: Apache Iceberg, Delta Lake, Apache Hudi
- Transformation: SQLMesh, dbt (staging/marts layers)
- Ingestion: dlt, Airbyte (connectors)
- Streaming: Apache Kafka patterns
This skill focuses on transactional database optimization (PostgreSQL, MySQL, SQL Server, Oracle, SQLite). Use data-lake-platform for analytical workloads.
关于OLAP数据库与数据湖基础设施,请查看**data-lake-platform**:
- 查询引擎:ClickHouse、DuckDB、Apache Doris、StarRocks
- 表格式:Apache Iceberg、Delta Lake、Apache Hudi
- 数据转换:SQLMesh、dbt( staging/marts层)
- 数据摄入:dlt、Airbyte(连接器)
- 流处理:Apache Kafka模式
本技能专注于事务型数据库优化(PostgreSQL、MySQL、SQL Server、Oracle、SQLite)。分析型负载请使用data-lake-platform技能。
Related Skills
相关技能
This skill focuses on query optimization within a single database. For related workflows:
SQL Transformation & Analytics Engineering:
-> ai-ml-data-science skill
- SQLMesh templates for building staging/intermediate/marts layers
- Incremental models (FULL, INCREMENTAL_BY_TIME_RANGE, INCREMENTAL_BY_UNIQUE_KEY)
- DAG management and model dependencies
- Unit tests and audits for SQL transformations
Data Ingestion (Loading into Warehouses):
-> ai-mlops skill
- dlt templates for extracting from REST APIs, databases
- Loading to Snowflake, BigQuery, Redshift, Postgres, DuckDB
- Incremental loading patterns (timestamp, ID-based, merge/upsert)
- Database replication (Postgres, MySQL, MongoDB -> warehouse)
Data Lake Infrastructure:
-> data-lake-platform skill
- ClickHouse, DuckDB, Doris, StarRocks query engines
- Iceberg, Delta Lake, Hudi table formats
- Kafka streaming, Dagster/Airflow orchestration
Use Case Decision:
- Query is slow in production -> Use this skill (data-sql-optimization)
- Building feature pipelines in SQL -> Use ai-ml-data-science (SQLMesh)
- Loading data from APIs/DBs to warehouse -> Use ai-mlops (dlt)
- Analytics on large datasets (OLAP) -> Use data-lake-platform
本技能专注于单数据库内的查询优化。相关工作流请参考以下技能:
SQL转换与分析工程:
-> ai-ml-data-science 技能
- 用于构建staging/intermediate/marts层的SQLMesh模板
- 增量模型(FULL、INCREMENTAL_BY_TIME_RANGE、INCREMENTAL_BY_UNIQUE_KEY)
- DAG管理与模型依赖
- SQL转换的单元测试与审计
数据摄入(加载至数据仓库):
-> ai-mlops 技能
- 用于从REST API、数据库提取数据的dlt模板
- 加载至Snowflake、BigQuery、Redshift、Postgres、DuckDB
- 增量加载模式(基于时间戳、ID、合并/更新)
- 数据库复制(Postgres、MySQL、MongoDB → 数据仓库)
数据湖基础设施:
-> data-lake-platform 技能
- ClickHouse、DuckDB、Doris、StarRocks查询引擎
- Iceberg、Delta Lake、Hudi表格式
- Kafka流处理、Dagster/Airflow编排
场景决策:
- 生产环境查询变慢 -> 使用本技能(data-sql-optimization)
- 使用SQL构建特征管道 -> 使用ai-ml-data-science(SQLMesh)
- 从API/数据库加载数据至数据仓库 -> 使用ai-mlops(dlt)
- 大数据集分析(OLAP) -> 使用data-lake-platform
External Resources
外部资源
See data/sources.json for 62+ curated resources including:
Core Documentation:
- RDBMS Documentation: PostgreSQL, MySQL, SQL Server, Oracle, SQLite, DuckDB official docs
- Query Optimization: Use The Index, Luke, SQL Performance Explained, vendor optimization guides
- Schema Design: Database Refactoring (Fowler), normalization guides, data type selection
Modern Optimization (Current):
- PostgreSQL: official release notes and "current" docs for planner/optimizer changes
- MySQL: official reference manual sections for EXPLAIN, optimizer, and Performance Schema
- SQL Server / Oracle: official docs for execution plans, indexing, and concurrency controls
Operations & Infrastructure:
- HA & Replication: Streaming replication, GTID-based replication, failover automation
- Migrations: Liquibase, Flyway version control and deployment patterns
- Backup/Recovery: pgBackRest, Percona XtraBackup, point-in-time recovery
- Monitoring: pg_stat_statements, Performance Schema, EXPLAIN visualizers (Dalibo, depesz)
- Security: OWASP SQL Injection Prevention, Postgres hardening, encryption standards
- Analytical Databases: DuckDB extensions, Parquet specification, columnar storage patterns
Use references/operational-patterns.md and the templates directory for detailed workflows, migration notes, and ready-to-run commands.
data/sources.json中包含62+精选资源,包括:
核心文档:
- 关系型数据库文档:PostgreSQL、MySQL、SQL Server、Oracle、SQLite、DuckDB官方文档
- 查询优化:《Use The Index, Luke》、《SQL Performance Explained》、厂商优化指南
- Schema设计:《Database Refactoring》(Fowler)、规范化指南、数据类型选择
现代优化(最新内容):
- PostgreSQL:官方发布说明与关于查询规划器/优化器变更的“当前”文档
- MySQL:官方参考手册中关于EXPLAIN、优化器与Performance Schema的章节
- SQL Server / Oracle:关于执行计划、索引与并发控制的官方文档
运维与基础设施:
- 高可用与复制:流复制、基于GTID的复制、故障切换自动化
- 迁移:Liquibase、Flyway版本控制与部署模式
- 备份/恢复:pgBackRest、Percona XtraBackup、时点恢复
- 监控:pg_stat_statements、Performance Schema、EXPLAIN可视化工具(Dalibo、depesz)
- 安全:OWASP SQL注入防护、Postgres加固、加密标准
- 分析型数据库:DuckDB扩展、Parquet规范、列式存储模式
详细工作流、迁移注意事项与可直接运行的命令,请查看references/operational-patterns.md与模板目录。