sql-code-review

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

SQL Code Review

SQL代码审查

Perform a thorough SQL code review of ${selection} (or entire project if no selection) focusing on security, performance, maintainability, and database best practices.
对${selection}(若未选中则针对整个项目)执行全面的SQL代码审查,重点关注安全性、性能、可维护性及数据库最佳实践。

🔒 Security Analysis

🔒 安全分析

SQL Injection Prevention

SQL注入防护

sql
-- ❌ CRITICAL: SQL Injection vulnerability
query = "SELECT * FROM users WHERE id = " + userInput;
query = f"DELETE FROM orders WHERE user_id = {user_id}";

-- ✅ SECURE: Parameterized queries
-- PostgreSQL/MySQL
PREPARE stmt FROM 'SELECT * FROM users WHERE id = ?';
EXECUTE stmt USING @user_id;

-- SQL Server
EXEC sp_executesql N'SELECT * FROM users WHERE id = @id', N'@id INT', @id = @user_id;
sql
-- ❌ CRITICAL: SQL Injection vulnerability
query = "SELECT * FROM users WHERE id = " + userInput;
query = f"DELETE FROM orders WHERE user_id = {user_id}";

-- ✅ SECURE: Parameterized queries
-- PostgreSQL/MySQL
PREPARE stmt FROM 'SELECT * FROM users WHERE id = ?';
EXECUTE stmt USING @user_id;

-- SQL Server
EXEC sp_executesql N'SELECT * FROM users WHERE id = @id', N'@id INT', @id = @user_id;

Access Control & Permissions

访问控制与权限

  • Principle of Least Privilege: Grant minimum required permissions
  • Role-Based Access: Use database roles instead of direct user permissions
  • Schema Security: Proper schema ownership and access controls
  • Function/Procedure Security: Review DEFINER vs INVOKER rights
  • 最小权限原则:仅授予所需的最低权限
  • 基于角色的访问:使用数据库角色而非直接为用户分配权限
  • 架构安全:合理设置架构所有权与访问控制
  • 函数/存储过程安全:审查DEFINER与INVOKER权限

Data Protection

数据保护

  • Sensitive Data Exposure: Avoid SELECT * on tables with sensitive columns
  • Audit Logging: Ensure sensitive operations are logged
  • Data Masking: Use views or functions to mask sensitive data
  • Encryption: Verify encrypted storage for sensitive data
  • 敏感数据暴露:避免在包含敏感列的表上使用SELECT *
  • 审计日志:确保敏感操作已被记录
  • 数据掩码:使用视图或函数对敏感数据进行掩码处理
  • 加密:验证敏感数据是否采用加密存储

⚡ Performance Optimization

⚡ 性能优化

Query Structure Analysis

查询结构分析

sql
-- ❌ BAD: Inefficient query patterns
SELECT DISTINCT u.* 
FROM users u, orders o, products p
WHERE u.id = o.user_id 
AND o.product_id = p.id
AND YEAR(o.order_date) = 2024;

-- ✅ GOOD: Optimized structure
SELECT u.id, u.name, u.email
FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.order_date >= '2024-01-01' 
AND o.order_date < '2025-01-01';
sql
-- ❌ BAD: Inefficient query patterns
SELECT DISTINCT u.* 
FROM users u, orders o, products p
WHERE u.id = o.user_id 
AND o.product_id = p.id
AND YEAR(o.order_date) = 2024;

-- ✅ GOOD: Optimized structure
SELECT u.id, u.name, u.email
FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.order_date >= '2024-01-01' 
AND o.order_date < '2025-01-01';

Index Strategy Review

索引策略审查

  • Missing Indexes: Identify columns that need indexing
  • Over-Indexing: Find unused or redundant indexes
  • Composite Indexes: Multi-column indexes for complex queries
  • Index Maintenance: Check for fragmented or outdated indexes
  • 缺失索引:识别需要添加索引的列
  • 过度索引:找出未使用或冗余的索引
  • 复合索引:为复杂查询创建多列索引
  • 索引维护:检查是否存在碎片化或过时的索引

Join Optimization

连接优化

  • Join Types: Verify appropriate join types (INNER vs LEFT vs EXISTS)
  • Join Order: Optimize for smaller result sets first
  • Cartesian Products: Identify and fix missing join conditions
  • Subquery vs JOIN: Choose the most efficient approach
  • 连接类型:验证是否使用了合适的连接类型(INNER、LEFT、EXISTS等)
  • 连接顺序:优先针对较小的结果集进行连接优化
  • 笛卡尔积:识别并修复缺失的连接条件
  • 子查询vs连接:选择最高效的实现方式

Aggregate and Window Functions

聚合与窗口函数

sql
-- ❌ BAD: Inefficient aggregation
SELECT user_id, 
       (SELECT COUNT(*) FROM orders o2 WHERE o2.user_id = o1.user_id) as order_count
FROM orders o1
GROUP BY user_id;

-- ✅ GOOD: Efficient aggregation
SELECT user_id, COUNT(*) as order_count
FROM orders
GROUP BY user_id;
sql
-- ❌ BAD: Inefficient aggregation
SELECT user_id, 
       (SELECT COUNT(*) FROM orders o2 WHERE o2.user_id = o1.user_id) as order_count
FROM orders o1
GROUP BY user_id;

-- ✅ GOOD: Efficient aggregation
SELECT user_id, COUNT(*) as order_count
FROM orders
GROUP BY user_id;

🛠️ Code Quality & Maintainability

🛠️ 代码质量与可维护性

SQL Style & Formatting

SQL风格与格式化

sql
-- ❌ BAD: Poor formatting and style
select u.id,u.name,o.total from users u left join orders o on u.id=o.user_id where u.status='active' and o.order_date>='2024-01-01';

-- ✅ GOOD: Clean, readable formatting
SELECT u.id,
       u.name,
       o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
  AND o.order_date >= '2024-01-01';
sql
-- ❌ BAD: Poor formatting and style
select u.id,u.name,o.total from users u left join orders o on u.id=o.user_id where u.status='active' and o.order_date>='2024-01-01';

-- ✅ GOOD: Clean, readable formatting
SELECT u.id,
       u.name,
       o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
  AND o.order_date >= '2024-01-01';

Naming Conventions

命名规范

  • Consistent Naming: Tables, columns, constraints follow consistent patterns
  • Descriptive Names: Clear, meaningful names for database objects
  • Reserved Words: Avoid using database reserved words as identifiers
  • Case Sensitivity: Consistent case usage across schema
  • 一致性命名:表、列、约束遵循统一的命名模式
  • 描述性名称:数据库对象使用清晰、有意义的名称
  • 保留字规避:避免使用数据库保留字作为标识符
  • 大小写一致性:整个架构中使用统一的大小写规则

Schema Design Review

架构设计审查

  • Normalization: Appropriate normalization level (avoid over/under-normalization)
  • Data Types: Optimal data type choices for storage and performance
  • Constraints: Proper use of PRIMARY KEY, FOREIGN KEY, CHECK, NOT NULL
  • Default Values: Appropriate default values for columns
  • 规范化:确保采用合适的规范化程度(避免过度或不足规范化)
  • 数据类型:选择最优的数据类型以提升存储与性能
  • 约束:合理使用PRIMARY KEY、FOREIGN KEY、CHECK、NOT NULL约束
  • 默认值:为列设置合适的默认值

🗄️ Database-Specific Best Practices

🗄️ 数据库专属最佳实践

PostgreSQL

PostgreSQL

sql
-- Use JSONB for JSON data
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    data JSONB NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- GIN index for JSONB queries
CREATE INDEX idx_events_data ON events USING gin(data);

-- Array types for multi-value columns
CREATE TABLE tags (
    post_id INT,
    tag_names TEXT[]
);
sql
-- Use JSONB for JSON data
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    data JSONB NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- GIN index for JSONB queries
CREATE INDEX idx_events_data ON events USING gin(data);

-- Array types for multi-value columns
CREATE TABLE tags (
    post_id INT,
    tag_names TEXT[]
);

MySQL

MySQL

sql
-- Use appropriate storage engines
CREATE TABLE sessions (
    id VARCHAR(128) PRIMARY KEY,
    data TEXT,
    expires TIMESTAMP
) ENGINE=InnoDB;

-- Optimize for InnoDB
ALTER TABLE large_table 
ADD INDEX idx_covering (status, created_at, id);
sql
-- Use appropriate storage engines
CREATE TABLE sessions (
    id VARCHAR(128) PRIMARY KEY,
    data TEXT,
    expires TIMESTAMP
) ENGINE=InnoDB;

-- Optimize for InnoDB
ALTER TABLE large_table 
ADD INDEX idx_covering (status, created_at, id);

SQL Server

SQL Server

sql
-- Use appropriate data types
CREATE TABLE products (
    id BIGINT IDENTITY(1,1) PRIMARY KEY,
    name NVARCHAR(255) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    created_at DATETIME2 DEFAULT GETUTCDATE()
);

-- Columnstore indexes for analytics
CREATE COLUMNSTORE INDEX idx_sales_cs ON sales;
sql
-- Use appropriate data types
CREATE TABLE products (
    id BIGINT IDENTITY(1,1) PRIMARY KEY,
    name NVARCHAR(255) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    created_at DATETIME2 DEFAULT GETUTCDATE()
);

-- Columnstore indexes for analytics
CREATE COLUMNSTORE INDEX idx_sales_cs ON sales;

Oracle

Oracle

sql
-- Use sequences for auto-increment
CREATE SEQUENCE user_id_seq START WITH 1 INCREMENT BY 1;

CREATE TABLE users (
    id NUMBER DEFAULT user_id_seq.NEXTVAL PRIMARY KEY,
    name VARCHAR2(255) NOT NULL
);
sql
-- Use sequences for auto-increment
CREATE SEQUENCE user_id_seq START WITH 1 INCREMENT BY 1;

CREATE TABLE users (
    id NUMBER DEFAULT user_id_seq.NEXTVAL PRIMARY KEY,
    name VARCHAR2(255) NOT NULL
);

🧪 Testing & Validation

🧪 测试与验证

Data Integrity Checks

数据完整性检查

sql
-- Verify referential integrity
SELECT o.user_id 
FROM orders o 
LEFT JOIN users u ON o.user_id = u.id 
WHERE u.id IS NULL;

-- Check for data consistency
SELECT COUNT(*) as inconsistent_records
FROM products 
WHERE price < 0 OR stock_quantity < 0;
sql
-- Verify referential integrity
SELECT o.user_id 
FROM orders o 
LEFT JOIN users u ON o.user_id = u.id 
WHERE u.id IS NULL;

-- Check for data consistency
SELECT COUNT(*) as inconsistent_records
FROM products 
WHERE price < 0 OR stock_quantity < 0;

Performance Testing

性能测试

  • Execution Plans: Review query execution plans
  • Load Testing: Test queries with realistic data volumes
  • Stress Testing: Verify performance under concurrent load
  • Regression Testing: Ensure optimizations don't break functionality
  • 执行计划:审查查询执行计划
  • 负载测试:使用真实数据量测试查询
  • 压力测试:验证并发负载下的性能表现
  • 回归测试:确保优化不会破坏现有功能

📊 Common Anti-Patterns

📊 常见反模式

N+1 Query Problem

N+1查询问题

sql
-- ❌ BAD: N+1 queries in application code
for user in users:
    orders = query("SELECT * FROM orders WHERE user_id = ?", user.id)

-- ✅ GOOD: Single optimized query
SELECT u.*, o.*
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
sql
-- ❌ BAD: N+1 queries in application code
for user in users:
    orders = query("SELECT * FROM orders WHERE user_id = ?", user.id)

-- ✅ GOOD: Single optimized query
SELECT u.*, o.*
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;

Overuse of DISTINCT

过度使用DISTINCT

sql
-- ❌ BAD: DISTINCT masking join issues
SELECT DISTINCT u.name 
FROM users u, orders o 
WHERE u.id = o.user_id;

-- ✅ GOOD: Proper join without DISTINCT
SELECT u.name
FROM users u
INNER JOIN orders o ON u.id = o.user_id
GROUP BY u.name;
sql
-- ❌ BAD: DISTINCT masking join issues
SELECT DISTINCT u.name 
FROM users u, orders o 
WHERE u.id = o.user_id;

-- ✅ GOOD: Proper join without DISTINCT
SELECT u.name
FROM users u
INNER JOIN orders o ON u.id = o.user_id
GROUP BY u.name;

Function Misuse in WHERE Clauses

WHERE子句中滥用函数

sql
-- ❌ BAD: Functions prevent index usage
SELECT * FROM orders 
WHERE YEAR(order_date) = 2024;

-- ✅ GOOD: Range conditions use indexes
SELECT * FROM orders 
WHERE order_date >= '2024-01-01' 
  AND order_date < '2025-01-01';
sql
-- ❌ BAD: Functions prevent index usage
SELECT * FROM orders 
WHERE YEAR(order_date) = 2024;

-- ✅ GOOD: Range conditions use indexes
SELECT * FROM orders 
WHERE order_date >= '2024-01-01' 
  AND order_date < '2025-01-01';

📋 SQL Review Checklist

📋 SQL审查检查清单

Security

安全

  • All user inputs are parameterized
  • No dynamic SQL construction with string concatenation
  • Appropriate access controls and permissions
  • Sensitive data is properly protected
  • SQL injection attack vectors are eliminated
  • 所有用户输入均已参数化
  • 未使用字符串拼接构造动态SQL
  • 采用了合适的访问控制与权限设置
  • 敏感数据已得到妥善保护
  • 已消除SQL注入攻击向量

Performance

性能

  • Indexes exist for frequently queried columns
  • No unnecessary SELECT * statements
  • JOINs are optimized and use appropriate types
  • WHERE clauses are selective and use indexes
  • Subqueries are optimized or converted to JOINs
  • 为频繁查询的列创建了索引
  • 无不必要的SELECT *语句
  • 连接已优化且使用了合适的类型
  • WHERE子句具备选择性且使用了索引
  • 子查询已优化或转换为连接

Code Quality

代码质量

  • Consistent naming conventions
  • Proper formatting and indentation
  • Meaningful comments for complex logic
  • Appropriate data types are used
  • Error handling is implemented
  • 遵循一致的命名规范
  • 采用了正确的格式与缩进
  • 复杂逻辑配有有意义的注释
  • 使用了合适的数据类型
  • 已实现错误处理

Schema Design

架构设计

  • Tables are properly normalized
  • Constraints enforce data integrity
  • Indexes support query patterns
  • Foreign key relationships are defined
  • Default values are appropriate
  • 表已进行合理规范化
  • 约束可确保数据完整性
  • 索引可支持查询模式
  • 已定义外键关系
  • 默认值设置合理

🎯 Review Output Format

🎯 审查输出格式

Issue Template

问题模板

undefined
undefined

[PRIORITY] [CATEGORY]: [Brief Description]

[优先级] [类别]: [简要描述]

Location: [Table/View/Procedure name and line number if applicable] Issue: [Detailed explanation of the problem] Security Risk: [If applicable - injection risk, data exposure, etc.] Performance Impact: [Query cost, execution time impact] Recommendation: [Specific fix with code example]
Before:
sql
-- Problematic SQL
After:
sql
-- Improved SQL
Expected Improvement: [Performance gain, security benefit]
undefined
位置: [表/视图/存储过程名称及行号(若适用)] 问题: [对问题的详细说明] 安全风险: [若适用 - 注入风险、数据暴露等] 性能影响: [查询成本、执行时间影响] 建议: [具体修复方案及代码示例]
修复前:
sql
-- 存在问题的SQL
修复后:
sql
-- 优化后的SQL
预期改进: [性能提升、安全收益]
undefined

Summary Assessment

总结评估

  • Security Score: [1-10] - SQL injection protection, access controls
  • Performance Score: [1-10] - Query efficiency, index usage
  • Maintainability Score: [1-10] - Code quality, documentation
  • Schema Quality Score: [1-10] - Design patterns, normalization
  • 安全评分: [1-10] - SQL注入防护、访问控制
  • 性能评分: [1-10] - 查询效率、索引使用
  • 可维护性评分: [1-10] - 代码质量、文档
  • 架构质量评分: [1-10] - 设计模式、规范化

Top 3 Priority Actions

三大优先行动项

  1. [Critical Security Fix]: Address SQL injection vulnerabilities
  2. [Performance Optimization]: Add missing indexes or optimize queries
  3. [Code Quality]: Improve naming conventions and documentation
Focus on providing actionable, database-agnostic recommendations while highlighting platform-specific optimizations and best practices.
  1. [关键安全修复]: 解决SQL注入漏洞
  2. [性能优化]: 添加缺失的索引或优化查询
  3. [代码质量]: 改进命名规范与文档
重点提供可落地的、数据库无关的建议,同时突出平台专属的优化方案与最佳实践。