Database Indexing & Query Optimization

Strategies for optimizing database queries through proper indexing and schema design.

Index Types

B-Tree Index

  • Default index type in most databases (MySQL, PostgreSQL)
  • Balanced tree structure that keeps keys in sorted order
  • Good for equality lookups, range queries, and sorting
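
As a minimal sketch of the range and sorting points above, the following uses Python's built-in sqlite3 module, whose ordinary indexes are B-trees. The orders table, its columns, and the index name are hypothetical, and SQLite stands in for MySQL or PostgreSQL, whose plan output looks different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO orders (total, note) VALUES (?, ?)",
    [(i * 10, "n") for i in range(1000)],
)
conn.execute("CREATE INDEX idx_orders_total ON orders(total)")


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Range predicate: the planner can seek to the first key and walk forward.
range_plan = plan("SELECT * FROM orders WHERE total BETWEEN 100 AND 200")

# ORDER BY on the indexed column: rows are read in index order, so no
# separate sort step is needed.
sort_plan = plan("SELECT total FROM orders ORDER BY total")

print(range_plan)
print(sort_plan)
```

Both plans should name idx_orders_total, and the second should contain no temporary sort step.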

Hash Index

  • O(1) average-case lookup for equality comparisons (fast point lookups)
  • Not suitable for range queries or sorting, because hashed keys have no order
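
The trade-off can be illustrated outside a database. This is only a conceptual sketch: a Python dict plays the role of a hash index and a sorted key list plays the role of an ordered (B-tree-like) index. The data and function names are made up for the example.

```python
import bisect

rows = {17: "alice", 42: "bob", 58: "carol", 73: "dave"}


def point_lookup(key):
    """Hash-style access: an exact key resolves in O(1) on average."""
    return rows.get(key)


def range_scan_hash(lo, hi):
    """A hash table keeps no key order, so every key has to be tested."""
    return sorted(k for k in rows if lo <= k <= hi)


# An ordered structure can jump to the first matching key and stop at
# the last one instead of visiting everything.
sorted_keys = sorted(rows)


def range_scan_ordered(lo, hi):
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]


print(point_lookup(42))
print(range_scan_hash(40, 60), range_scan_ordered(40, 60))
```

Both range functions return the same keys; the difference is that the hash version inspects every entry while the ordered version touches only the matching slice.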

Full-Text Index

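  • Optimized for searching words and phrases in text
  • Applies language-specific analysis (such as stemming and stop words)
  • Used with text search queries (for example, MATCH ... AGAINST in MySQL or the @@ operator in PostgreSQL)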
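
Full-text indexes are commonly built as inverted indexes, which map each term to the documents that contain it. The sketch below is a toy version under stated assumptions: the stop-word list is an arbitrary English sample, the analyzer only lowercases and splits (real engines usually also stem), and all names are hypothetical.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "and", "the", "of", "for", "to", "in"}  # assumed sample list


def analyze(text):
    """Lowercase, split on non-letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Indexing strategies for the query planner",
    2: "A guide to query optimization",
    3: "Schema design and normalization",
}
index = build_index(docs)
print(search(index, "query optimization"))
```

Because the query goes through the same analyzer as the documents, "The Query" and "query" match the same entries.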

Spatial Index

  • Built on structures such as R-trees and quadtrees for geographic data
  • Optimized for spatial queries such as containment and nearest-neighbor searches
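
To show why a spatial structure helps, here is a small point quadtree sketch. It is an illustration only, not how any particular database implements its spatial index. The class name, the capacity of 4, and the depth limit of 16 are assumptions chosen for the example.

```python
class Quadtree:
    """Point quadtree over the square [x, x + size) x [y, y + size)."""

    def __init__(self, x, y, size, capacity=4, depth=0):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.depth = depth
        self.points = []
        self.children = None  # four sub-quadrants once this node splits

    def _contains(self, px, py):
        return (self.x <= px < self.x + self.size
                and self.y <= py < self.y + self.size)

    def _insert_child(self, px, py):
        return any(child.insert(px, py) for child in self.children)

    def insert(self, px, py):
        if not self._contains(px, py):
            return False
        if self.children is None:
            # The depth limit stops endless splitting on duplicate points.
            if len(self.points) < self.capacity or self.depth >= 16:
                self.points.append((px, py))
                return True
            self._split()
        return self._insert_child(px, py)

    def _split(self):
        half = self.size / 2
        self.children = [
            Quadtree(self.x + dx, self.y + dy, half, self.capacity, self.depth + 1)
            for dx in (0, half) for dy in (0, half)
        ]
        for px, py in self.points:
            self._insert_child(px, py)
        self.points = []

    def query(self, x1, y1, x2, y2):
        """Return points inside the inclusive box [x1, x2] x [y1, y2]."""
        # Prune quadrants that cannot overlap the search box.
        if (x2 < self.x or x1 >= self.x + self.size
                or y2 < self.y or y1 >= self.y + self.size):
            return []
        found = [(px, py) for px, py in self.points
                 if x1 <= px <= x2 and y1 <= py <= y2]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(x1, y1, x2, y2))
        return found


tree = Quadtree(0, 0, 100)
for p in [(10, 10), (12, 15), (50, 50), (80, 20), (81, 22), (90, 90), (11, 11)]:
    tree.insert(*p)
nearby = sorted(tree.query(5, 5, 20, 20))
print(nearby)
```

The pruning test in query is the point of the structure: whole quadrants that lie outside the search box are skipped without looking at their points.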

Composite Index

  • Multiple columns in one index
  • Column order matters: the index serves queries that filter on a leftmost prefix of its columns
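
The leftmost-prefix rule can be checked with a query plan. This sketch uses Python's sqlite3 module as a stand-in engine. The orders table and index name are hypothetical, and other databases report plans in a different format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "status TEXT, total INTEGER)"
)
conn.execute("CREATE INDEX idx_orders_customer_status ON orders(customer_id, status)")


def plan(sql):
    """Join the detail column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Filters on the leading column, or on both columns, can use the index.
leading = plan("SELECT * FROM orders WHERE customer_id = 7")
both = plan("SELECT * FROM orders WHERE customer_id = 7 AND status = 'paid'")

# A filter on the second column alone skips the leading column, so the
# planner falls back to scanning the table here.
second_only = plan("SELECT * FROM orders WHERE status = 'paid'")

for label, text in [("leading", leading), ("both", both), ("second_only", second_only)]:
    print(label, "->", text)
```

If queries filter on status alone often enough, a separate index on status (or a reordered composite index) is the usual remedy.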

Query Optimization Techniques

EXPLAIN Plans

sql
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 1;
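
For a self-contained way to practice reading plans, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. This is a stand-in for the statement above: unlike EXPLAIN ANALYZE in PostgreSQL, it does not execute the query or report timings. The users table and index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")


def explain(sql):
    """Return the detail column of each plan row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Lookup by primary key: resolved through the table's own key.
by_id = explain("SELECT * FROM users WHERE id = 1")

# Lookup by an unindexed column: the whole table is scanned.
before = explain("SELECT * FROM users WHERE email = 'test@example.com'")

# After adding an index, the same query becomes an index search.
conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = explain("SELECT * FROM users WHERE email = 'test@example.com'")

print(by_id)
print(before)
print(after)
```

The thing to look for in any engine's plan is the same: whether a step scans the whole table or searches through an index.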

Index Selection

  1. Look at columns used in WHERE clauses
  2. Consider columns used in JOIN conditions
  3. Evaluate columns used for sorting (ORDER BY) and grouping (GROUP BY)
  4. Check cardinality (selectivity): columns with many distinct values benefit most
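
The cardinality check in step 4 can be estimated with a simple ratio of distinct values to total rows. The sketch below assumes a hypothetical users table and uses sqlite3 so it runs anywhere; the same COUNT(DISTINCT ...) query works in MySQL and PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users (email, country) VALUES (?, ?)",
    [(f"user{i}@example.com", "US" if i % 2 else "CA") for i in range(100)],
)


def selectivity(table, column):
    """Distinct values divided by total rows; closer to 1.0 is more selective."""
    # Identifiers are interpolated directly, so pass trusted names only.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total if total else 0.0


print("email:", selectivity("users", "email"))
print("country:", selectivity("users", "country"))
```

Here email is unique per row and a strong index candidate, while country has only two values, so an index on it alone would narrow a search very little.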

Avoid Common Mistakes

  • Creating indexes on low-cardinality columns
  • Creating indexes that no query uses
  • Over-indexing (every extra index slows down writes)
  • Not analyzing index usage
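
One rough way to analyze index usage is to run the plans for a known set of queries and see which indexes never appear. The sketch below does that against SQLite. The workload list, table, and index names are assumptions, and a production system would more likely rely on the engine's own usage statistics (for example, the pg_stat_user_indexes view in PostgreSQL).

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country TEXT, name TEXT)"
)
conn.execute("CREATE INDEX idx_users_email ON users(email)")
conn.execute("CREATE INDEX idx_users_country ON users(country)")

# Hypothetical sample of the queries the application actually runs.
workload = [
    "SELECT * FROM users WHERE email = 'test@example.com'",
    "SELECT * FROM users WHERE id = 1",
]


def unused_indexes(queries):
    """Return user-created indexes that no query plan in the workload mentions."""
    all_indexes = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'index' AND sql IS NOT NULL"
        )
    }
    used = set()
    for sql in queries:
        for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
            used.update(
                name for name in all_indexes
                if re.search(rf"\b{re.escape(name)}\b", row[3])
            )
    return sorted(all_indexes - used)


print(unused_indexes(workload))
```

An index reported here is only a candidate for removal: the result is as complete as the workload sample, and an index may also exist to enforce uniqueness.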

Performance Tuning

  1. Analyze queries - Use EXPLAIN
  2. Identify bottlenecks - Profile slow queries
  3. Test thoroughly - Compare before/after metrics
  4. Monitor regularly - Track performance changes
  5. Denormalize carefully - Balance read vs write performance
  6. Archive old data - Keep active data small
  7. Partition tables - Handle large datasets
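
As one illustration of step 6, the sketch below moves old rows into an archive table inside a single transaction. The events and events_archive tables, the ISO date strings, and the cutoff are all hypothetical, and sqlite3 is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.execute(
    "CREATE TABLE events_archive (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (created_at, payload) VALUES (?, ?)",
    [("2020-01-15", "old"), ("2021-06-01", "old"), ("2024-03-10", "recent")],
)


def archive_before(cutoff):
    """Move rows older than cutoff (an ISO date string) into the archive table."""
    with conn:  # one transaction: both statements commit, or neither does
        conn.execute(
            "INSERT INTO events_archive SELECT * FROM events WHERE created_at < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM events WHERE created_at < ?", (cutoff,)
        ).rowcount
    return moved


moved = archive_before("2023-01-01")
print("rows archived:", moved)
```

Running the copy and the delete in one transaction avoids losing or duplicating rows if the job fails halfway. On large tables the same idea is usually applied in smaller batches.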

Schema Design

  • Normalization - Reduce redundancy
  • Appropriate data types - Use INT rather than VARCHAR for numeric IDs
  • Foreign keys - Maintain referential integrity
  • Constraints - Enforce data quality
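
The points above can be seen working together in a small schema. This is a sketch with hypothetical users and orders tables, run through sqlite3. Note the assumption specific to SQLite: foreign key enforcement must be switched on per connection, whereas MySQL (InnoDB) and PostgreSQL enforce it by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    total   INTEGER NOT NULL CHECK (total >= 0)
);
""")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders (user_id, total) VALUES (1, 250)")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Each statement violates one rule declared in the schema.
orphan = rejected("INSERT INTO orders (user_id, total) VALUES (99, 10)")   # no such user
negative = rejected("INSERT INTO orders (user_id, total) VALUES (1, -5)")  # CHECK fails
duplicate = rejected("INSERT INTO users (email) VALUES ('a@example.com')") # UNIQUE fails

print(orphan, negative, duplicate)
```

Declaring these rules in the schema means bad rows are rejected at write time, regardless of which application sends them.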

Tools & Commands

PostgreSQL:
sql
CREATE INDEX idx_users_email ON users(email);
DROP INDEX idx_users_email;
ANALYZE;
MySQL:
sql
-- EXPLAIN ANALYZE requires MySQL 8.0.18 or later
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'test@example.com';
CREATE INDEX idx_email ON users(email);

References

  • PostgreSQL Index Documentation
  • MySQL Performance Tuning
  • Database Query Optimization Principles
  • Use the Index, Luke! (Free online book)