django-perf-review

Django Performance Review

Review Django code for validated performance issues. Research the codebase to confirm issues before reporting. Report only what you can prove.

Review Approach

  1. Research first - Trace data flow, check for existing optimizations, verify data volume
  2. Validate before reporting - Pattern matching is not validation
  3. Zero findings is acceptable - Don't manufacture issues to appear thorough
  4. Severity must match impact - If you catch yourself writing "minor" in a CRITICAL finding, it's not critical. Downgrade or skip it.

Impact Categories

Issues are organized by impact. Focus on CRITICAL and HIGH - these cause real problems at scale.

| Priority | Category | Impact |
| --- | --- | --- |
| 1 | N+1 Queries | CRITICAL - Multiplies with data, causes timeouts |
| 2 | Unbounded Querysets | CRITICAL - Memory exhaustion, OOM kills |
| 3 | Missing Indexes | HIGH - Full table scans on large tables |
| 4 | Write Loops | HIGH - Lock contention, slow requests |
| 5 | Inefficient Patterns | LOW - Rarely worth reporting |

Priority 1: N+1 Queries (CRITICAL)

Impact: Each N+1 adds `O(n)` database round trips. 100 rows = 100 extra queries. 10,000 rows = timeout.

Rule: Prefetch related data accessed in loops

Validate by tracing: View → Queryset → Template/Serializer → Loop access

python
# PROBLEM: N+1 - each iteration queries profile
def user_list(request):
    users = User.objects.all()
    return render(request, 'users.html', {'users': users})

# Template:
# {% for user in users %}
#     {{ user.profile.bio }}  ← triggers query per user
# {% endfor %}

# SOLUTION: Prefetch in view
def user_list(request):
    users = User.objects.select_related('profile')
    return render(request, 'users.html', {'users': users})
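
The round-trip arithmetic above can be reproduced outside Django. The sketch below is a hedged illustration, not Django's implementation: it assumes hypothetical `users` and `profiles` tables, uses the standard-library `sqlite3` module, and counts `SELECT` statements with a trace callback for the query-per-row loop versus a single JOIN, which is roughly the SQL shape `select_related` produces.

```python
import sqlite3


def count_selects(run):
    """Run `run(conn)` on a small in-memory database and count SELECT statements."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE profiles (id INTEGER PRIMARY KEY, user_id INTEGER UNIQUE, bio TEXT);
        """
    )
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                     [(i, f"user{i}") for i in range(1, 101)])
    conn.executemany("INSERT INTO profiles (user_id, bio) VALUES (?, ?)",
                     [(i, f"bio{i}") for i in range(1, 101)])
    conn.commit()

    selects = []

    def trace(sql):
        # Only count SELECTs so transaction bookkeeping is ignored.
        if sql.lstrip().upper().startswith("SELECT"):
            selects.append(sql)

    conn.set_trace_callback(trace)
    result = run(conn)
    conn.set_trace_callback(None)
    conn.close()
    return result, len(selects)


def n_plus_one(conn):
    """One query for the users, then one more query per user."""
    bios = []
    users = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    for user_id, _name in users:
        row = conn.execute(
            "SELECT bio FROM profiles WHERE user_id = ?", (user_id,)
        ).fetchone()
        bios.append(row[0])
    return bios


def joined(conn):
    """A single JOIN that returns the same data in one round trip."""
    rows = conn.execute(
        "SELECT p.bio FROM users u JOIN profiles p ON p.user_id = u.id ORDER BY u.id"
    ).fetchall()
    return [bio for (bio,) in rows]


slow_bios, slow_queries = count_selects(n_plus_one)
fast_bios, fast_queries = count_selects(joined)
print(slow_queries, fast_queries)
```

Both functions return the same bios; only the number of statements differs, and the loop's count grows with the number of users.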

Rule: Prefetch in serializers, not just views

DRF serializers accessing related fields cause N+1 if the queryset isn't optimized.

python
from django.db.models import Count
from rest_framework import serializers, viewsets

# PROBLEM: SerializerMethodField queries per object
class UserSerializer(serializers.ModelSerializer):
    order_count = serializers.SerializerMethodField()

    def get_order_count(self, obj):
        return obj.orders.count()  # ← query per user

# SOLUTION: Annotate in viewset, access in serializer
class UserViewSet(viewsets.ModelViewSet):
    def get_queryset(self):
        return User.objects.annotate(order_count=Count('orders'))

class UserSerializer(serializers.ModelSerializer):
    order_count = serializers.IntegerField(read_only=True)

Rule: Model properties that query are dangerous in loops

python
from datetime import timedelta
from django.utils import timezone

# PROBLEM: Property triggers query when accessed
class User(models.Model):
    @property
    def recent_orders(self):
        last_week = timezone.now() - timedelta(days=7)
        return self.orders.filter(created__gte=last_week)[:5]

# Used in template loop = N+1

# SOLUTION: Use Prefetch with custom queryset, or annotate

Validation Checklist for N+1

  • Traced data flow from view to template/serializer
  • Confirmed related field is accessed inside a loop
  • Searched codebase for existing select_related/prefetch_related
  • Verified table has significant row count (1000+)
  • Confirmed this is a hot path (not admin, not rare action)

Priority 2: Unbounded Querysets (CRITICAL)

Impact: Loading entire tables exhausts memory. Large tables cause OOM kills and worker restarts.

Rule: Always paginate list endpoints

python
# PROBLEM: No pagination - loads all rows
class UserListView(ListView):
    model = User
    template_name = 'users.html'

# SOLUTION: Add pagination
class UserListView(ListView):
    model = User
    template_name = 'users.html'
    paginate_by = 25
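
Setting `paginate_by` makes Django's paginator slice the queryset, which typically becomes `LIMIT`/`OFFSET` in SQL. The following is a minimal sketch of that bounded fetch, assuming a hypothetical `users` table and using `sqlite3` rather than Django itself; `fetch_page` is an illustrative helper, not a Django API.

```python
import sqlite3


def fetch_page(conn, page, per_page=25):
    """Return one page of user names; pages are 1-indexed."""
    if page < 1:
        raise ValueError("page must be >= 1")
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT name FROM users ORDER BY id LIMIT ? OFFSET ?", (per_page, offset)
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i}",) for i in range(1, 61)])

first_page = fetch_page(conn, 1)  # at most 25 rows, however large the table is
last_page = fetch_page(conn, 3)   # the final, partial page
print(len(first_page), len(last_page))
```

The point is that each request touches at most `per_page` rows, so memory use no longer depends on table size.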

Rule: Use iterator() for large batch processing

python
# PROBLEM: Loads all objects into memory at once
for user in User.objects.all():
    process(user)

# SOLUTION: Stream with iterator()
for user in User.objects.iterator(chunk_size=1000):
    process(user)
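
The idea behind `iterator(chunk_size=...)` is to hold only one chunk of rows at a time instead of the whole result set. As a rough analogy only (this is not Django's code), the sketch below streams rows from a hypothetical `users` table with `sqlite3`'s `fetchmany`; `iter_rows` is an illustrative helper name.

```python
import sqlite3


def iter_rows(cursor, chunk_size=1000):
    """Yield rows one at a time while fetching at most `chunk_size` rows per batch."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            return
        yield from chunk


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i}",) for i in range(5000)])

processed = 0
cursor = conn.execute("SELECT id, name FROM users ORDER BY id")
for _user_id, _name in iter_rows(cursor, chunk_size=1000):
    processed += 1  # stand-in for process(user)
print(processed)
```

Compared with `cursor.fetchall()`, the loop never materializes more than one chunk in Python at once.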

Rule: Never call list() on unbounded querysets

python
# PROBLEM: Forces full evaluation into memory
all_users = list(User.objects.all())

# SOLUTION: Keep as queryset, slice if needed
users = User.objects.all()[:100]

Validation Checklist for Unbounded Querysets

  • Table is large (10k+ rows) or will grow unbounded
  • No pagination class, paginate_by, or slicing
  • This runs on user-facing request (not background job with chunking)

Priority 3: Missing Indexes (HIGH)

Impact: Full table scans. Negligible on small tables, catastrophic on large ones.

Rule: Index fields used in WHERE clauses on large tables

python
# PROBLEM: Filtering on unindexed field
User.objects.filter(email=email)  # full scan if no index

class User(models.Model):
    email = models.EmailField()  # ← no db_index

# SOLUTION: Add index
class User(models.Model):
    email = models.EmailField(db_index=True)
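
One way to confirm a suspected full scan is to ask the database for its query plan. The sketch below is an assumption-laden illustration using SQLite's `EXPLAIN QUERY PLAN` on a hypothetical `users` table; other databases expose different `EXPLAIN` output, and the index name `idx_users_email` is made up for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN detail strings joined into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # the detail text is the last column


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

lookup = "SELECT name FROM users WHERE email = ?"
plan_before = query_plan(conn, lookup, ("a@example.com",))

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, ("a@example.com",))

print(plan_before)
print(plan_after)
```

Before the index the plan reports a scan of the table; afterwards it reports a search using `idx_users_email`. The exact wording varies between SQLite versions.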

Rule: Index fields used in ORDER BY on large tables

python
# PROBLEM: Sorting requires full scan without index
Order.objects.order_by('-created')

# SOLUTION: Index the sort field
class Order(models.Model):
    created = models.DateTimeField(db_index=True)

Rule: Use composite indexes for common query patterns

python
class Order(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    status = models.CharField(max_length=20)
    created = models.DateTimeField()

    class Meta:
        indexes = [
            models.Index(fields=['user', 'status']),  # for filter(user=x, status=y)
            models.Index(fields=['status', '-created']),  # for filter(status=x).order_by('-created')
        ]
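
To see why the `['status', '-created']` index matches `filter(status=x).order_by('-created')`, it can help to compare query plans. This is a hedged sketch in SQLite with a hypothetical `orders` table and index name; it only illustrates the general effect, and other databases may plan the query differently.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's EXPLAIN QUERY PLAN detail strings joined into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT, created TEXT)"
)
recent_open = "SELECT id FROM orders WHERE status = 'open' ORDER BY created DESC"

plan_before = query_plan(conn, recent_open)

# Composite index: equality column first, then the sort column.
conn.execute("CREATE INDEX idx_orders_status_created ON orders (status, created DESC)")
plan_after = query_plan(conn, recent_open)

print(plan_before)
print(plan_after)
```

Without the index SQLite scans the table and builds a temporary B-tree to sort; with it, rows come out of the index already filtered and ordered.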

Validation Checklist for Missing Indexes

  • Table has 10k+ rows
  • Field is used in filter() or order_by() on hot path
  • Checked model - no db_index=True or Meta.indexes entry
  • Not a foreign key (already indexed automatically)

Priority 4: Write Loops (HIGH)

Impact: N database writes instead of 1. Lock contention. Slow requests.

Rule: Use bulk_create instead of create() in loops

python
# PROBLEM: N inserts, N round trips
for item in items:
    Model.objects.create(name=item['name'])

# SOLUTION: Single bulk insert
Model.objects.bulk_create([
    Model(name=item['name'])
    for item in items
])
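
The difference between N inserts and one bulk insert can be counted directly. The sketch below is an illustration under stated assumptions: a hypothetical `item` table, `sqlite3` instead of Django, and a multi-row `INSERT` standing in for what a bulk insert does conceptually. It is not the SQL Django necessarily emits.

```python
import sqlite3


def count_inserts(insert_fn, items):
    """Run `insert_fn` and return (number of INSERT statements, rows stored)."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    statements = []
    conn.set_trace_callback(statements.append)
    insert_fn(conn, items)
    conn.set_trace_callback(None)
    stored = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
    conn.close()
    inserts = sum(1 for s in statements if s.lstrip().upper().startswith("INSERT"))
    return inserts, stored


def insert_in_loop(conn, items):
    """One INSERT statement per item."""
    for item in items:
        conn.execute("INSERT INTO item (name) VALUES (?)", (item["name"],))


def insert_bulk(conn, items):
    """One multi-row INSERT for all items."""
    if not items:
        return
    placeholders = ", ".join("(?)" for _ in items)
    conn.execute(
        f"INSERT INTO item (name) VALUES {placeholders}",
        [item["name"] for item in items],
    )


items = [{"name": f"item{i}"} for i in range(50)]
loop_statements, loop_rows = count_inserts(insert_in_loop, items)
bulk_statements, bulk_rows = count_inserts(insert_bulk, items)
print(loop_statements, bulk_statements)
```

Both approaches store the same 50 rows; the loop needs 50 statements and the bulk version needs 1.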

Rule: Use update() or bulk_update instead of save() in loops

python
# PROBLEM: N updates
for obj in queryset:
    obj.status = 'done'
    obj.save()

# SOLUTION A: Single UPDATE statement (same value for all)
queryset.update(status='done')

# SOLUTION B: bulk_update (different values)
for obj in objects:
    obj.status = compute_status(obj)
Model.objects.bulk_update(objects, ['status'], batch_size=500)
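
Solution B works because different per-row values can still be written with one statement per batch. The sketch below shows one possible shape, a `CASE` expression keyed on `id`, against a hypothetical `task` table in `sqlite3`. Treat it as an illustration of batching, not as the exact SQL `bulk_update` generates; the helper name and the smaller `batch_size` are assumptions chosen to stay within SQLite's bound-parameter limits.

```python
import sqlite3


def bulk_update_status(conn, new_status_by_id, batch_size=100):
    """Write a different status per row using one UPDATE per batch.

    Returns the number of UPDATE statements issued.
    """
    ids = list(new_status_by_id)
    statements = 0
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        when_clauses = " ".join("WHEN ? THEN ?" for _ in batch)
        in_clause = ", ".join("?" for _ in batch)
        params = []
        for row_id in batch:
            params.extend([row_id, new_status_by_id[row_id]])
        params.extend(batch)  # values for the IN (...) list
        conn.execute(
            f"UPDATE task SET status = CASE id {when_clauses} END "
            f"WHERE id IN ({in_clause})",
            params,
        )
        statements += 1
    return statements


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO task (id, status) VALUES (?, 'pending')",
                 [(i,) for i in range(1, 251)])

new_status = {i: ("done" if i % 2 == 0 else "failed") for i in range(1, 251)}
update_statements = bulk_update_status(conn, new_status, batch_size=100)
done_count = conn.execute("SELECT COUNT(*) FROM task WHERE status = 'done'").fetchone()[0]
print(update_statements, done_count)
```

Here 250 rows are updated in 3 statements instead of 250; the batch size trades statement count against statement size.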

Rule: Use delete() on queryset, not in loops

python
# PROBLEM: N deletes
for obj in queryset:
    obj.delete()

# SOLUTION: Single DELETE
queryset.delete()

Validation Checklist for Write Loops

  • Loop iterates over 100+ items (or unbounded)
  • Each iteration calls create(), save(), or delete()
  • This runs on user-facing request (not one-time migration script)

Priority 5: Inefficient Patterns (LOW)

Rarely worth reporting. Include only as minor notes if you're already reporting real issues.

Pattern: count() vs exists()

python
# Slightly suboptimal
if queryset.count() > 0:
    do_thing()

# Marginally better
if queryset.exists():
    do_thing()

**Usually skip** - difference is <1ms in most cases.

Pattern: len(queryset) vs count()

python
# Fetches all rows to count
if len(queryset) > 0:  # bad if queryset not yet evaluated
    do_thing()

# Single COUNT query
if queryset.count() > 0:
    do_thing()

**Only flag** if queryset is large and not already evaluated.

Pattern: get() in small loops

python
# N queries, but if N is small (< 20), often fine
for id in ids:
    obj = Model.objects.get(id=id)

**Only flag** if loop is large or this is in a very hot path.

---

Validation Requirements

Before reporting ANY issue:
  1. Trace the data flow - Follow queryset from creation to consumption
  2. Search for existing optimizations - Grep for select_related, prefetch_related, pagination
  3. Verify data volume - Check if table is actually large
  4. Confirm hot path - Trace call sites, verify this runs frequently
  5. Rule out mitigations - Check for caching, rate limiting
If you cannot validate all steps, do not report.
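
The all-or-nothing rule above can be expressed as a small gate. This is only a sketch: the `Finding` class, its field names, and the step identifiers are hypothetical and simply mirror the five steps listed.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the five validation steps listed above.
VALIDATION_STEPS = (
    "data_flow_traced",
    "existing_optimizations_searched",
    "data_volume_verified",
    "hot_path_confirmed",
    "mitigations_ruled_out",
)


@dataclass
class Finding:
    title: str
    severity: str
    validated: frozenset = frozenset()

    def missing_steps(self):
        """Return the validation steps not yet completed, in checklist order."""
        return [step for step in VALIDATION_STEPS if step not in self.validated]

    def should_report(self):
        """A finding is reportable only when every step has been validated."""
        return not self.missing_steps()


partial = Finding(
    "N+1 in UserListView",
    "CRITICAL",
    frozenset({"data_flow_traced", "hot_path_confirmed"}),
)
complete = Finding("N+1 in UserListView", "CRITICAL", frozenset(VALIDATION_STEPS))

print(partial.should_report(), partial.missing_steps())
print(complete.should_report())
```

A partially validated finding is dropped rather than downgraded, matching the rule that unvalidated issues are not reported.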

Output Format

Django Performance Review: [File/Component Name]

Summary

Validated issues: X (Y Critical, Z High)

Findings

[PERF-001] N+1 Query in UserListView (CRITICAL)

Location: `views.py:45`
Issue: Related field `profile` accessed in template loop without prefetch.
Validation:
  • Traced: UserListView → users queryset → user_list.html → `{{ user.profile.bio }}` in loop
  • Searched codebase: no select_related('profile') found
  • User table: 50k+ rows (verified in admin)
  • Hot path: linked from homepage navigation
Evidence:
python
def get_queryset(self):
    return User.objects.filter(active=True)  # no select_related
Fix:
python
def get_queryset(self):
    return User.objects.filter(active=True).select_related('profile')

If no issues found: "No performance issues identified after reviewing [files] and validating [what you checked]."

**Before submitting, sanity check each finding:**
- Does the severity match the actual impact? ("Minor inefficiency" ≠ CRITICAL)
- Is this a real performance issue or just a style preference?
- Would fixing this measurably improve performance?

If the answer to any is "no" - remove the finding.

---

What NOT to Report

  • Test files
  • Admin-only views
  • Management commands
  • Migration files
  • One-time scripts
  • Code behind disabled feature flags
  • Tables with <1000 rows that won't grow
  • Patterns in cold paths (rarely executed code)
  • Micro-optimizations (exists vs count, only/defer without evidence)

False Positives to Avoid

**Queryset variable assignment is not an issue:**
```python
# This is FINE - no performance difference
projects_qs = Project.objects.filter(org=org)
projects = list(projects_qs)

# vs this - identical performance
projects = list(Project.objects.filter(org=org))
```
Querysets are lazy. Assigning to a variable doesn't execute anything.

**Single query patterns are not N+1:**
```python
# This is ONE query, not N+1
projects = list(Project.objects.filter(org=org))
```
N+1 requires a loop that triggers additional queries. A single `list()` call is fine.

**Missing select_related on single object fetch is not N+1:**
```python
# This is 2 queries, not N+1 - report as LOW at most
state = AutofixState.objects.filter(pr_id=pr_id).first()
project_id = state.request.project_id  # second query
```
N+1 requires a loop. A single object doing 2 queries instead of 1 can be reported as LOW if relevant, but never as CRITICAL/HIGH.

**Style preferences are not performance issues:**
If your only suggestion is "combine these two lines" or "rename this variable" - that's style, not performance. Don't report it.