query-expert

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Query Expert

查询专家

Master database queries across SQL and NoSQL systems. Generate optimized queries, analyze performance with EXPLAIN plans, design effective indexes, and troubleshoot slow queries.
精通SQL与NoSQL系统下的数据库查询。生成优化查询语句,借助EXPLAIN计划分析性能,设计高效索引并排查慢查询问题。

What This Skill Does

该技能的功能

Helps you write efficient, performant database queries:
  • Generate Queries - SQL, MongoDB, GraphQL queries
  • Optimize Queries - Performance tuning and refactoring
  • Design Indexes - Index strategies for faster queries
  • Analyze Performance - EXPLAIN plans and query analysis
  • Troubleshoot - Debug slow queries and bottlenecks
  • Best Practices - Query patterns and anti-patterns
帮助你编写高效、高性能的数据库查询:
  • 生成查询语句 - SQL、MongoDB、GraphQL查询
  • 优化查询语句 - 性能调优与重构
  • 设计索引 - 提升查询速度的索引策略
  • 分析性能 - EXPLAIN计划与查询分析
  • 排查问题 - 调试慢查询与性能瓶颈
  • 最佳实践 - 查询模式与反模式

Supported Databases

支持的数据库

SQL Databases

SQL数据库

  • PostgreSQL - Advanced features, CTEs, window functions
  • MySQL/MariaDB - InnoDB optimization, replication
  • SQLite - Embedded database optimization
  • SQL Server - T-SQL, execution plans, DMVs
  • Oracle - PL/SQL, partitioning, hints
  • PostgreSQL - 高级特性、CTE、窗口函数
  • MySQL/MariaDB - InnoDB优化、复制
  • SQLite - 嵌入式数据库优化
  • SQL Server - T-SQL、执行计划、DMV
  • Oracle - PL/SQL、分区、提示

NoSQL Databases

NoSQL数据库

  • MongoDB - Aggregation pipelines, indexes
  • Redis - Key-value queries, Lua scripts
  • Elasticsearch - Full-text search queries
  • Cassandra - CQL, partition keys
  • MongoDB - 聚合管道、索引
  • Redis - 键值查询、Lua脚本
  • Elasticsearch - 全文搜索查询
  • Cassandra - CQL、分区键

Query Languages

查询语言

  • SQL - Standard and vendor-specific
  • MongoDB Query Language - Find, aggregation
  • GraphQL - Efficient data fetching
  • Cypher - Neo4j graph queries
  • SQL - 标准与厂商特定语法
  • MongoDB Query Language - Find、聚合
  • GraphQL - 高效数据获取
  • Cypher - Neo4j图查询

SQL Query Patterns

SQL查询模式

SELECT Queries

SELECT查询

Basic SELECT

基础SELECT

sql
-- ✅ Select only needed columns
SELECT
    user_id,
    email,
    created_at
FROM users
WHERE status = 'active'
    AND created_at > NOW() - INTERVAL '30 days'
ORDER BY created_at DESC
LIMIT 100;

-- ❌ Avoid SELECT *
SELECT * FROM users;  -- Wastes resources
sql
-- ✅ Select only needed columns
SELECT
    user_id,
    email,
    created_at
FROM users
WHERE status = 'active'
    AND created_at > NOW() - INTERVAL '30 days'
ORDER BY created_at DESC
LIMIT 100;

-- ❌ Avoid SELECT *
SELECT * FROM users;  -- Wastes resources

JOINs

JOINs

sql
-- INNER JOIN (most common)
SELECT
    o.order_id,
    o.total,
    c.name AS customer_name,
    c.email
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.created_at >= '2024-01-01';

-- LEFT JOIN (include all left rows)
SELECT
    c.customer_id,
    c.name,
    COUNT(o.order_id) AS order_count,
    COALESCE(SUM(o.total), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

-- Multiple JOINs
SELECT
    o.order_id,
    c.name AS customer_name,
    p.product_name,
    oi.quantity,
    oi.price
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
WHERE o.status = 'completed';
sql
-- INNER JOIN (most common)
SELECT
    o.order_id,
    o.total,
    c.name AS customer_name,
    c.email
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.created_at >= '2024-01-01';

-- LEFT JOIN (include all left rows)
SELECT
    c.customer_id,
    c.name,
    COUNT(o.order_id) AS order_count,
    COALESCE(SUM(o.total), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

-- Multiple JOINs
SELECT
    o.order_id,
    c.name AS customer_name,
    p.product_name,
    oi.quantity,
    oi.price
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
WHERE o.status = 'completed';

Subqueries

子查询

sql
-- Subquery in WHERE
SELECT name, email
FROM customers
WHERE customer_id IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE total > 1000
);

-- Correlated subquery
SELECT
    c.name,
    (SELECT COUNT(*)
     FROM orders o
     WHERE o.customer_id = c.customer_id) AS order_count
FROM customers c;

-- ✅ Better: Use JOIN instead
SELECT
    c.name,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;
sql
-- Subquery in WHERE
SELECT name, email
FROM customers
WHERE customer_id IN (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE total > 1000
);

-- Correlated subquery
SELECT
    c.name,
    (SELECT COUNT(*)
     FROM orders o
     WHERE o.customer_id = c.customer_id) AS order_count
FROM customers c;

-- ✅ Better: Use JOIN instead
SELECT
    c.name,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

Aggregation

聚合

sql
-- GROUP BY with aggregates
SELECT
    category,
    COUNT(*) AS product_count,
    AVG(price) AS avg_price,
    MIN(price) AS min_price,
    MAX(price) AS max_price,
    SUM(stock_quantity) AS total_stock
FROM products
GROUP BY category
HAVING COUNT(*) > 5
ORDER BY avg_price DESC;

-- Multiple GROUP BY columns
SELECT
    DATE_TRUNC('month', created_at) AS month,
    category,
    SUM(total) AS monthly_sales
FROM orders
GROUP BY DATE_TRUNC('month', created_at), category
ORDER BY month DESC, monthly_sales DESC;

-- ROLLUP for subtotals
SELECT
    COALESCE(category, 'TOTAL') AS category,
    COALESCE(brand, 'All Brands') AS brand,
    SUM(sales) AS total_sales
FROM products
GROUP BY ROLLUP(category, brand);
sql
-- GROUP BY with aggregates
SELECT
    category,
    COUNT(*) AS product_count,
    AVG(price) AS avg_price,
    MIN(price) AS min_price,
    MAX(price) AS max_price,
    SUM(stock_quantity) AS total_stock
FROM products
GROUP BY category
HAVING COUNT(*) > 5
ORDER BY avg_price DESC;

-- Multiple GROUP BY columns
SELECT
    DATE_TRUNC('month', created_at) AS month,
    category,
    SUM(total) AS monthly_sales
FROM orders
GROUP BY DATE_TRUNC('month', created_at), category
ORDER BY month DESC, monthly_sales DESC;

-- ROLLUP for subtotals
SELECT
    COALESCE(category, 'TOTAL') AS category,
    COALESCE(brand, 'All Brands') AS brand,
    SUM(sales) AS total_sales
FROM products
GROUP BY ROLLUP(category, brand);

Window Functions (PostgreSQL, SQL Server, MySQL 8+)

窗口函数(PostgreSQL、SQL Server、MySQL 8+)

sql
-- ROW_NUMBER
SELECT
    customer_id,
    order_date,
    total,
    ROW_NUMBER() OVER (
        PARTITION BY customer_id
        ORDER BY order_date DESC
    ) AS order_rank
FROM orders;

-- Running totals
SELECT
    order_date,
    total,
    SUM(total) OVER (
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders;

-- RANK vs DENSE_RANK
SELECT
    product_name,
    sales,
    RANK() OVER (ORDER BY sales DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY sales DESC) AS dense_rank,
    NTILE(4) OVER (ORDER BY sales DESC) AS quartile
FROM products;

-- LAG and LEAD
SELECT
    order_date,
    total,
    LAG(total, 1) OVER (ORDER BY order_date) AS prev_total,
    LEAD(total, 1) OVER (ORDER BY order_date) AS next_total,
    total - LAG(total, 1) OVER (ORDER BY order_date) AS change
FROM orders;
sql
-- ROW_NUMBER
SELECT
    customer_id,
    order_date,
    total,
    ROW_NUMBER() OVER (
        PARTITION BY customer_id
        ORDER BY order_date DESC
    ) AS order_rank
FROM orders;

-- Running totals
SELECT
    order_date,
    total,
    SUM(total) OVER (
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders;

-- RANK vs DENSE_RANK
SELECT
    product_name,
    sales,
    RANK() OVER (ORDER BY sales DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY sales DESC) AS dense_rank,
    NTILE(4) OVER (ORDER BY sales DESC) AS quartile
FROM products;

-- LAG and LEAD
SELECT
    order_date,
    total,
    LAG(total, 1) OVER (ORDER BY order_date) AS prev_total,
    LEAD(total, 1) OVER (ORDER BY order_date) AS next_total,
    total - LAG(total, 1) OVER (ORDER BY order_date) AS change
FROM orders;

CTEs (Common Table Expressions)

CTE(公共表表达式)

sql
-- Simple CTE
WITH active_customers AS (
    SELECT customer_id, name, email
    FROM customers
    WHERE status = 'active'
)
SELECT
    ac.name,
    COUNT(o.order_id) AS order_count
FROM active_customers ac
LEFT JOIN orders o ON ac.customer_id = o.customer_id
GROUP BY ac.customer_id, ac.name;

-- Multiple CTEs
WITH
monthly_sales AS (
    SELECT
        DATE_TRUNC('month', order_date) AS month,
        SUM(total) AS sales
    FROM orders
    GROUP BY DATE_TRUNC('month', order_date)
),
avg_monthly AS (
    SELECT AVG(sales) AS avg_sales
    FROM monthly_sales
)
SELECT
    ms.month,
    ms.sales,
    am.avg_sales,
    ms.sales - am.avg_sales AS variance
FROM monthly_sales ms
CROSS JOIN avg_monthly am
ORDER BY ms.month;

-- Recursive CTE (hierarchies)
WITH RECURSIVE org_tree AS (
    -- Base case
    SELECT
        employee_id,
        name,
        manager_id,
        1 AS level,
        ARRAY[employee_id] AS path
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case
    SELECT
        e.employee_id,
        e.name,
        e.manager_id,
        ot.level + 1,
        ot.path || e.employee_id
    FROM employees e
    INNER JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT * FROM org_tree ORDER BY path;
sql
-- Simple CTE
WITH active_customers AS (
    SELECT customer_id, name, email
    FROM customers
    WHERE status = 'active'
)
SELECT
    ac.name,
    COUNT(o.order_id) AS order_count
FROM active_customers ac
LEFT JOIN orders o ON ac.customer_id = o.customer_id
GROUP BY ac.customer_id, ac.name;

-- Multiple CTEs
WITH
monthly_sales AS (
    SELECT
        DATE_TRUNC('month', order_date) AS month,
        SUM(total) AS sales
    FROM orders
    GROUP BY DATE_TRUNC('month', order_date)
),
avg_monthly AS (
    SELECT AVG(sales) AS avg_sales
    FROM monthly_sales
)
SELECT
    ms.month,
    ms.sales,
    am.avg_sales,
    ms.sales - am.avg_sales AS variance
FROM monthly_sales ms
CROSS JOIN avg_monthly am
ORDER BY ms.month;

-- Recursive CTE (hierarchies)
WITH RECURSIVE org_tree AS (
    -- Base case
    SELECT
        employee_id,
        name,
        manager_id,
        1 AS level,
        ARRAY[employee_id] AS path
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case
    SELECT
        e.employee_id,
        e.name,
        e.manager_id,
        ot.level + 1,
        ot.path || e.employee_id
    FROM employees e
    INNER JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT * FROM org_tree ORDER BY path;

Query Optimization

查询优化

1. Use Indexes Effectively

1. 高效使用索引

sql
-- Create index on frequently queried columns
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);

-- Composite index (order matters!)
CREATE INDEX idx_orders_composite
ON orders(status, customer_id, order_date);

-- ✅ This query uses the index
SELECT * FROM orders
WHERE status = 'pending'
    AND customer_id = 123
    AND order_date > '2024-01-01';

-- ❌ This doesn't use the index (skips first column)
SELECT * FROM orders
WHERE customer_id = 123;

-- Partial/Filtered index (smaller, faster)
CREATE INDEX idx_active_users
ON users(email)
WHERE status = 'active';

-- Covering index (includes all needed columns)
CREATE INDEX idx_users_covering
ON users(email)
INCLUDE (name, created_at);
sql
-- Create index on frequently queried columns
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);

-- Composite index (order matters!)
CREATE INDEX idx_orders_composite
ON orders(status, customer_id, order_date);

-- ✅ This query uses the index
SELECT * FROM orders
WHERE status = 'pending'
    AND customer_id = 123
    AND order_date > '2024-01-01';

-- ❌ This doesn't use the index (skips first column)
SELECT * FROM orders
WHERE customer_id = 123;

-- Partial/Filtered index (smaller, faster)
CREATE INDEX idx_active_users
ON users(email)
WHERE status = 'active';

-- Covering index (includes all needed columns)
CREATE INDEX idx_users_covering
ON users(email)
INCLUDE (name, created_at);

2. Avoid SELECT *

2. 避免使用SELECT *

sql
-- ❌ Bad: Retrieves all columns
SELECT * FROM users;

-- ✅ Good: Select only needed columns
SELECT user_id, email, name FROM users;

-- ✅ Good: More efficient for joins
SELECT
    u.user_id,
    u.email,
    o.order_id,
    o.total
FROM users u
INNER JOIN orders o ON u.user_id = o.user_id;
sql
-- ❌ Bad: Retrieves all columns
SELECT * FROM users;

-- ✅ Good: Select only needed columns
SELECT user_id, email, name FROM users;

-- ✅ Good: More efficient for joins
SELECT
    u.user_id,
    u.email,
    o.order_id,
    o.total
FROM users u
INNER JOIN orders o ON u.user_id = o.user_id;

3. Optimize JOINs

3. 优化JOIN操作

sql
-- ❌ Bad: Filtering after JOIN
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE o.status = 'completed';

-- ✅ Good: Filter before JOIN
SELECT u.name, o.total
FROM users u
INNER JOIN (
    SELECT user_id, total
    FROM orders
    WHERE status = 'completed'
) o ON u.user_id = o.user_id;

-- ✅ Even better: Use WHERE with INNER JOIN
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.user_id = o.user_id
WHERE o.status = 'completed';
sql
-- ❌ Bad: Filtering after JOIN
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE o.status = 'completed';

-- ✅ Good: Filter before JOIN
SELECT u.name, o.total
FROM users u
INNER JOIN (
    SELECT user_id, total
    FROM orders
    WHERE status = 'completed'
) o ON u.user_id = o.user_id;

-- ✅ Even better: Use WHERE with INNER JOIN
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.user_id = o.user_id
WHERE o.status = 'completed';

4. Use EXISTS Instead of IN

4. 使用EXISTS替代IN

sql
-- ❌ Slower: IN with subquery
SELECT name FROM customers
WHERE customer_id IN (
    SELECT customer_id FROM orders WHERE total > 1000
);

-- ✅ Faster: EXISTS
SELECT name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.customer_id
        AND o.total > 1000
);
sql
-- ❌ Slower: IN with subquery
SELECT name FROM customers
WHERE customer_id IN (
    SELECT customer_id FROM orders WHERE total > 1000
);

-- ✅ Faster: EXISTS
SELECT name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.customer_id
        AND o.total > 1000
);

5. Avoid Functions on Indexed Columns

5. 避免在索引列上使用函数

sql
-- ❌ Bad: Function prevents index usage
SELECT * FROM users
WHERE LOWER(email) = 'john@example.com';

-- ✅ Good: Use functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Or use case-insensitive collation
SELECT * FROM users
WHERE email = 'john@example.com' COLLATE utf8_general_ci;
sql
-- ❌ Bad: Function prevents index usage
SELECT * FROM users
WHERE LOWER(email) = 'john@example.com';

-- ✅ Good: Use functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Or use case-insensitive collation
SELECT * FROM users
WHERE email = 'john@example.com' COLLATE utf8_general_ci;

6. Limit Result Sets

6. 限制结果集

sql
-- ✅ Use LIMIT/TOP for pagination
SELECT * FROM orders
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

-- ✅ Use WHERE to reduce rows early
SELECT * FROM orders
WHERE created_at > NOW() - INTERVAL '7 days'
ORDER BY created_at DESC;
sql
-- ✅ Use LIMIT/TOP for pagination
SELECT * FROM orders
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

-- ✅ Use WHERE to reduce rows early
SELECT * FROM orders
WHERE created_at > NOW() - INTERVAL '7 days'
ORDER BY created_at DESC;

7. Batch Operations

7. 批量操作

sql
-- ❌ Bad: Multiple single inserts
INSERT INTO users (name, email) VALUES ('User1', 'user1@example.com');
INSERT INTO users (name, email) VALUES ('User2', 'user2@example.com');

-- ✅ Good: Batch insert
INSERT INTO users (name, email) VALUES
    ('User1', 'user1@example.com'),
    ('User2', 'user2@example.com'),
    ('User3', 'user3@example.com');

-- ✅ Good: Batch update
UPDATE products
SET price = price * 1.1
WHERE category IN ('Electronics', 'Computers');
sql
-- ❌ Bad: Multiple single inserts
INSERT INTO users (name, email) VALUES ('User1', 'user1@example.com');
INSERT INTO users (name, email) VALUES ('User2', 'user2@example.com');

-- ✅ Good: Batch insert
INSERT INTO users (name, email) VALUES
    ('User1', 'user1@example.com'),
    ('User2', 'user2@example.com'),
    ('User3', 'user3@example.com');

-- ✅ Good: Batch update
UPDATE products
SET price = price * 1.1
WHERE category IN ('Electronics', 'Computers');

EXPLAIN Plans

EXPLAIN计划

PostgreSQL

PostgreSQL

sql
-- Simple EXPLAIN
EXPLAIN
SELECT * FROM orders WHERE customer_id = 123;

-- EXPLAIN ANALYZE (actually runs query)
EXPLAIN ANALYZE
SELECT
    c.name,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

-- Look for:
-- - Seq Scan (bad, needs index)
-- - Index Scan (good)
-- - Bitmap Heap Scan (good for multiple rows)
-- - Hash Join vs Nested Loop
-- - High cost numbers
sql
-- Simple EXPLAIN
EXPLAIN
SELECT * FROM orders WHERE customer_id = 123;

-- EXPLAIN ANALYZE (actually runs query)
EXPLAIN ANALYZE
SELECT
    c.name,
    COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

-- Look for:
-- - Seq Scan (bad, needs index)
-- - Index Scan (good)
-- - Bitmap Heap Scan (good for multiple rows)
-- - Hash Join vs Nested Loop
-- - High cost numbers

MySQL

MySQL

sql
-- EXPLAIN
EXPLAIN
SELECT * FROM orders WHERE customer_id = 123;

-- EXPLAIN ANALYZE (MySQL 8.0.18+)
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;

-- Look for:
-- - type: ALL (table scan, bad)
-- - type: index (index scan, good)
-- - type: ref (index lookup, great)
-- - Extra: Using filesort (may need index)
-- - Extra: Using temporary (may need optimization)
sql
-- EXPLAIN
EXPLAIN
SELECT * FROM orders WHERE customer_id = 123;

-- EXPLAIN ANALYZE (MySQL 8.0.18+)
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;

-- Look for:
-- - type: ALL (table scan, bad)
-- - type: index (index scan, good)
-- - type: ref (index lookup, great)
-- - Extra: Using filesort (may need index)
-- - Extra: Using temporary (may need optimization)

Indexing Strategies

索引策略

When to Index

何时创建索引

✅ Index these columns:
  • Primary keys (automatic)
  • Foreign keys
  • Columns in WHERE clauses
  • Columns in JOIN conditions
  • Columns in ORDER BY
  • Columns in GROUP BY
❌ Don't index:
  • Small tables (< 1000 rows)
  • Columns with low cardinality (few distinct values)
  • Frequently updated columns
  • Large text/blob columns
✅ 为以下列创建索引:
  • 主键(自动创建)
  • 外键
  • WHERE子句中的列
  • JOIN条件中的列
  • ORDER BY中的列
  • GROUP BY中的列
❌ 不要为以下列创建索引:
  • 小表(少于1000行)
  • 低基数列(不同值很少)
  • 频繁更新的列
  • 大文本/blob列

Index Types

索引类型

sql
-- B-Tree (default, most common)
CREATE INDEX idx_users_email ON users(email);

-- Hash index (equality only, PostgreSQL)
CREATE INDEX idx_users_email_hash ON users USING HASH(email);

-- GIN (full-text search, arrays, JSONB)
CREATE INDEX idx_posts_content_gin
ON posts USING GIN(to_tsvector('english', content));

-- GiST (geometric, full-text)
CREATE INDEX idx_locations_gist
ON locations USING GIST(coordinates);

-- Partial index (filtered)
CREATE INDEX idx_orders_pending
ON orders(customer_id)
WHERE status = 'pending';

-- Expression index
CREATE INDEX idx_users_email_domain
ON users((email ~~ '%@gmail.com%'));
sql
-- B-Tree (default, most common)
CREATE INDEX idx_users_email ON users(email);

-- Hash index (equality only, PostgreSQL)
CREATE INDEX idx_users_email_hash ON users USING HASH(email);

-- GIN (full-text search, arrays, JSONB)
CREATE INDEX idx_posts_content_gin
ON posts USING GIN(to_tsvector('english', content));

-- GiST (geometric, full-text)
CREATE INDEX idx_locations_gist
ON locations USING GIST(coordinates);

-- Partial index (filtered)
CREATE INDEX idx_orders_pending
ON orders(customer_id)
WHERE status = 'pending';

-- Expression index
CREATE INDEX idx_users_email_domain
ON users((email ~~ '%@gmail.com%'));

Composite Index Order

复合索引的顺序

sql
-- Index column order matters!
CREATE INDEX idx_orders_search
ON orders(status, customer_id, created_at);

-- ✅ Uses index (left-most column)
WHERE status = 'completed'

-- ✅ Uses index (left-most columns)
WHERE status = 'completed' AND customer_id = 123

-- ✅ Uses full index
WHERE status = 'completed'
    AND customer_id = 123
    AND created_at > '2024-01-01'

-- ❌ Doesn't use index (skips first column)
WHERE customer_id = 123

-- ❌ Doesn't use index (skips first column)
WHERE created_at > '2024-01-01'
sql
-- Index column order matters!
CREATE INDEX idx_orders_search
ON orders(status, customer_id, created_at);

-- ✅ Uses index (left-most column)
WHERE status = 'completed'

-- ✅ Uses index (left-most columns)
WHERE status = 'completed' AND customer_id = 123

-- ✅ Uses full index
WHERE status = 'completed'
    AND customer_id = 123
    AND created_at > '2024-01-01'

-- ❌ Doesn't use index (skips first column)
WHERE customer_id = 123

-- ❌ Doesn't use index (skips first column)
WHERE created_at > '2024-01-01'

MongoDB Queries

MongoDB查询

Find Queries

Find查询

javascript
// Basic find
db.users.find({ status: 'active' })

// Find with projection
db.users.find(
    { status: 'active' },
    { name: 1, email: 1, _id: 0 }
)

// Find with operators
db.orders.find({
    total: { $gt: 100, $lt: 1000 },
    status: { $in: ['pending', 'processing'] },
    'customer.city': 'New York'
})

// Find with sort and limit
db.products.find({ category: 'Electronics' })
    .sort({ price: -1 })
    .limit(10)

// Count
db.users.countDocuments({ status: 'active' })
javascript
// Basic find
db.users.find({ status: 'active' })

// Find with projection
db.users.find(
    { status: 'active' },
    { name: 1, email: 1, _id: 0 }
)

// Find with operators
db.orders.find({
    total: { $gt: 100, $lt: 1000 },
    status: { $in: ['pending', 'processing'] },
    'customer.city': 'New York'
})

// Find with sort and limit
db.products.find({ category: 'Electronics' })
    .sort({ price: -1 })
    .limit(10)

// Count
db.users.countDocuments({ status: 'active' })

Aggregation Pipeline

聚合管道

javascript
// Group and count
db.orders.aggregate([
    { $match: { status: 'completed' } },
    { $group: {
        _id: '$customer_id',
        total_orders: { $sum: 1 },
        total_spent: { $sum: '$total' },
        avg_order: { $avg: '$total' }
    }},
    { $sort: { total_spent: -1 } },
    { $limit: 10 }
])

// Lookup (JOIN)
db.orders.aggregate([
    { $lookup: {
        from: 'customers',
        localField: 'customer_id',
        foreignField: '_id',
        as: 'customer'
    }},
    { $unwind: '$customer' },
    { $project: {
        order_id: 1,
        total: 1,
        'customer.name': 1,
        'customer.email': 1
    }}
])

// Complex aggregation
db.sales.aggregate([
    // Filter
    { $match: {
        date: { $gte: ISODate('2024-01-01') }
    }},

    // Add computed fields
    { $addFields: {
        month: { $month: '$date' },
        year: { $year: '$date' }
    }},

    // Group by month
    { $group: {
        _id: { year: '$year', month: '$month' },
        total_sales: { $sum: '$amount' },
        order_count: { $sum: 1 },
        avg_sale: { $avg: '$amount' }
    }},

    // Sort
    { $sort: { '_id.year': 1, '_id.month': 1 } },

    // Reshape
    { $project: {
        _id: 0,
        date: {
            $concat: [
                { $toString: '$_id.year' },
                '-',
                { $toString: '$_id.month' }
            ]
        },
        total_sales: 1,
        order_count: 1,
        avg_sale: { $round: ['$avg_sale', 2] }
    }}
])
javascript
// Group and count
db.orders.aggregate([
    { $match: { status: 'completed' } },
    { $group: {
        _id: '$customer_id',
        total_orders: { $sum: 1 },
        total_spent: { $sum: '$total' },
        avg_order: { $avg: '$total' }
    }},
    { $sort: { total_spent: -1 } },
    { $limit: 10 }
])

// Lookup (JOIN)
db.orders.aggregate([
    { $lookup: {
        from: 'customers',
        localField: 'customer_id',
        foreignField: '_id',
        as: 'customer'
    }},
    { $unwind: '$customer' },
    { $project: {
        order_id: 1,
        total: 1,
        'customer.name': 1,
        'customer.email': 1
    }}
])

// Complex aggregation
db.sales.aggregate([
    // Filter
    { $match: {
        date: { $gte: ISODate('2024-01-01') }
    }},

    // Add computed fields
    { $addFields: {
        month: { $month: '$date' },
        year: { $year: '$date' }
    }},

    // Group by month
    { $group: {
        _id: { year: '$year', month: '$month' },
        total_sales: { $sum: '$amount' },
        order_count: { $sum: 1 },
        avg_sale: { $avg: '$amount' }
    }},

    // Sort
    { $sort: { '_id.year': 1, '_id.month': 1 } },

    // Reshape
    { $project: {
        _id: 0,
        date: {
            $concat: [
                { $toString: '$_id.year' },
                '-',
                { $toString: '$_id.month' }
            ]
        },
        total_sales: 1,
        order_count: 1,
        avg_sale: { $round: ['$avg_sale', 2] }
    }}
])

MongoDB Indexes

MongoDB索引

javascript
// Single field index
db.users.createIndex({ email: 1 })

// Compound index
db.orders.createIndex({ customer_id: 1, created_at: -1 })

// Unique index
db.users.createIndex({ email: 1 }, { unique: true })

// Partial index
db.orders.createIndex(
    { customer_id: 1 },
    { partialFilterExpression: { status: 'active' } }
)

// Text index
db.products.createIndex({ name: 'text', description: 'text' })

// TTL index (auto-delete after time)
db.sessions.createIndex(
    { created_at: 1 },
    { expireAfterSeconds: 3600 }
)

// List indexes
db.users.getIndexes()

// Analyze query performance
db.orders.find({ customer_id: 123 }).explain('executionStats')
javascript
// Single field index
db.users.createIndex({ email: 1 })

// Compound index
db.orders.createIndex({ customer_id: 1, created_at: -1 })

// Unique index
db.users.createIndex({ email: 1 }, { unique: true })

// Partial index
db.orders.createIndex(
    { customer_id: 1 },
    { partialFilterExpression: { status: 'active' } }
)

// Text index
db.products.createIndex({ name: 'text', description: 'text' })

// TTL index (auto-delete after time)
db.sessions.createIndex(
    { created_at: 1 },
    { expireAfterSeconds: 3600 }
)

// List indexes
db.users.getIndexes()

// Analyze query performance
db.orders.find({ customer_id: 123 }).explain('executionStats')

GraphQL Queries

GraphQL查询

graphql
undefined
graphql
undefined

Basic query

Basic query

query { users { id name email } }
query { users { id name email } }

Query with arguments

Query with arguments

query { user(id: "123") { name email orders { id total status } } }
query { user(id: "123") { name email orders { id total status } } }

Query with variables

Query with variables

query GetUser($userId: ID!) { user(id: $userId) { name email orders(limit: 10, status: COMPLETED) { id total createdAt } } }
query GetUser($userId: ID!) { user(id: $userId) { name email orders(limit: 10, status: COMPLETED) { id total createdAt } } }

Fragments (reusable fields)

Fragments (reusable fields)

fragment UserFields on User { id name email createdAt }
query { user(id: "123") { ...UserFields orders { id total } } }
fragment UserFields on User { id name email createdAt }
query { user(id: "123") { ...UserFields orders { id total } } }

Avoid N+1 queries with DataLoader

Avoid N+1 queries with DataLoader

query { orders { id total customer { # Batched by DataLoader name email } } }
undefined
query { orders { id total customer { # Batched by DataLoader name email } } }
undefined

Common Anti-Patterns

常见反模式

❌ N+1 Query Problem

❌ N+1查询问题

sql
-- Bad: N+1 queries
SELECT * FROM customers;  -- 1 query
-- Then for each customer:
SELECT * FROM orders WHERE customer_id = ?;  -- N queries

-- Good: Single JOIN query
SELECT
    c.customer_id,
    c.name,
    o.order_id,
    o.total
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
sql
-- Bad: N+1 queries
SELECT * FROM customers;  -- 1 query
-- Then for each customer:
SELECT * FROM orders WHERE customer_id = ?;  -- N queries

-- Good: Single JOIN query
SELECT
    c.customer_id,
    c.name,
    o.order_id,
    o.total
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

❌ Using OR on Different Columns

❌ 在不同列上使用OR

sql
-- Bad: Can't use indexes effectively
SELECT * FROM products
WHERE name = 'iPhone' OR category = 'Electronics';

-- Good: Use UNION
SELECT * FROM products WHERE name = 'iPhone'
UNION
SELECT * FROM products WHERE category = 'Electronics';
sql
-- Bad: Can't use indexes effectively
SELECT * FROM products
WHERE name = 'iPhone' OR category = 'Electronics';

-- Good: Use UNION
SELECT * FROM products WHERE name = 'iPhone'
UNION
SELECT * FROM products WHERE category = 'Electronics';

❌ Implicit Type Conversion

❌ 隐式类型转换

sql
-- Bad: '123' is string, user_id is integer
SELECT * FROM users WHERE user_id = '123';

-- Good: Use correct type
SELECT * FROM users WHERE user_id = 123;
sql
-- Bad: '123' is string, user_id is integer
SELECT * FROM users WHERE user_id = '123';

-- Good: Use correct type
SELECT * FROM users WHERE user_id = 123;

Query Performance Checklist

查询性能检查清单

  • Select only needed columns (no SELECT *)
  • Add indexes to WHERE/JOIN/ORDER BY columns
  • Use EXPLAIN to analyze query plan
  • Avoid functions on indexed columns
  • Use EXISTS instead of IN for subqueries
  • Batch INSERT/UPDATE operations
  • Use appropriate JOIN types
  • Filter early (WHERE before JOIN)
  • Use LIMIT for large result sets
  • Monitor slow query logs
  • Update statistics regularly
  • Avoid SELECT DISTINCT when possible
  • Use covering indexes when appropriate
  • 仅选择需要的列(不使用SELECT *)
  • 为WHERE/JOIN/ORDER BY列添加索引
  • 使用EXPLAIN分析查询计划
  • 避免在索引列上使用函数
  • 对子查询使用EXISTS替代IN
  • 批量执行INSERT/UPDATE操作
  • 使用合适的JOIN类型
  • 尽早过滤数据(JOIN前使用WHERE)
  • 对大数据集使用LIMIT
  • 监控慢查询日志
  • 定期更新统计信息
  • 尽可能避免使用SELECT DISTINCT
  • �适当使用覆盖索引

Resources

�资源


"Premature optimization is the root of all evil, but slow queries are the root of all frustration."