django-perf-review
Django Performance Review
Review Django code for validated performance issues. Research the codebase to confirm issues before reporting. Report only what you can prove.
Review Approach
- Research first - Trace data flow, check for existing optimizations, verify data volume
- Validate before reporting - Pattern matching is not validation
- Zero findings is acceptable - Don't manufacture issues to appear thorough
- Severity must match impact - If you catch yourself writing "minor" in a CRITICAL finding, it's not critical. Downgrade or skip it.
Impact Categories
Issues are organized by impact. Focus on CRITICAL and HIGH - these cause real problems at scale.
| Priority | Category | Impact |
|---|---|---|
| 1 | N+1 Queries | CRITICAL - Multiplies with data, causes timeouts |
| 2 | Unbounded Querysets | CRITICAL - Memory exhaustion, OOM kills |
| 3 | Missing Indexes | HIGH - Full table scans on large tables |
| 4 | Write Loops | HIGH - Lock contention, slow requests |
| 5 | Inefficient Patterns | LOW - Rarely worth reporting |
Priority 1: N+1 Queries (CRITICAL)
Impact: Each N+1 adds O(n) database round trips. 100 rows = 100 extra queries. 10,000 rows = timeout.
Rule: Prefetch related data accessed in loops
Validate by tracing: View → Queryset → Template/Serializer → Loop access
```python
# PROBLEM: N+1 - each iteration queries profile
def user_list(request):
    users = User.objects.all()
    return render(request, 'users.html', {'users': users})

# Template:
# {% for user in users %}
#     {{ user.profile.bio }}  ← triggers query per user
# {% endfor %}

# SOLUTION: Prefetch in view
def user_list(request):
    users = User.objects.select_related('profile')
    return render(request, 'users.html', {'users': users})
```
Rule: Prefetch in serializers, not just views
DRF serializers accessing related fields cause N+1 if queryset isn't optimized.
```python
from django.db.models import Count
from rest_framework import serializers, viewsets

# PROBLEM: SerializerMethodField queries per object
class UserSerializer(serializers.ModelSerializer):
    order_count = serializers.SerializerMethodField()

    def get_order_count(self, obj):
        return obj.orders.count()  # ← query per user

# SOLUTION: Annotate in viewset, access in serializer
class UserViewSet(viewsets.ModelViewSet):
    def get_queryset(self):
        return User.objects.annotate(order_count=Count('orders'))

class UserSerializer(serializers.ModelSerializer):
    order_count = serializers.IntegerField(read_only=True)
```
Rule: Model properties that query are dangerous in loops
```python
from datetime import timedelta
from django.utils import timezone

# PROBLEM: Property triggers query when accessed
class User(models.Model):
    @property
    def recent_orders(self):
        last_week = timezone.now() - timedelta(days=7)
        return self.orders.filter(created__gte=last_week)[:5]

# Used in template loop = N+1

# SOLUTION: Use Prefetch with custom queryset, or annotate
```
Validation Checklist for N+1
- Traced data flow from view to template/serializer
- Confirmed related field is accessed inside a loop
- Searched codebase for existing select_related/prefetch_related
- Verified table has significant row count (1000+)
- Confirmed this is a hot path (not admin, not rare action)
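The "100 rows = 100 extra queries" arithmetic can be checked by counting statements rather than guessing. The following is a minimal sketch that uses the standard-library `sqlite3` module instead of the Django ORM so it runs without a project; the `user` and `profile` tables and the helper names `build_db`, `count_selects`, `per_row_lookup`, and `joined_lookup` are hypothetical and exist only for illustration.

```python
import sqlite3


def build_db(n_users):
    """Create an in-memory database with one profile row per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE profile (user_id INTEGER PRIMARY KEY, bio TEXT)")
    conn.executemany(
        "INSERT INTO user VALUES (?, ?)",
        [(i, f"user{i}") for i in range(n_users)],
    )
    conn.executemany(
        "INSERT INTO profile VALUES (?, ?)",
        [(i, f"bio{i}") for i in range(n_users)],
    )
    conn.commit()
    return conn


def count_selects(conn, work):
    """Run work(conn) and return how many SELECT statements it issued."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        work(conn)
    finally:
        conn.set_trace_callback(None)
    return sum(1 for sql in seen if sql.lstrip().upper().startswith("SELECT"))


def per_row_lookup(conn):
    # Roughly what happens when a loop touches user.profile.bio
    # without select_related: one extra query per user.
    bios = []
    for (user_id,) in conn.execute("SELECT id FROM user").fetchall():
        row = conn.execute(
            "SELECT bio FROM profile WHERE user_id = ?", (user_id,)
        ).fetchone()
        bios.append(row[0])
    return bios


def joined_lookup(conn):
    # Roughly what select_related('profile') does: a single JOIN.
    return conn.execute(
        "SELECT u.name, p.bio FROM user u JOIN profile p ON p.user_id = u.id"
    ).fetchall()


conn = build_db(100)
n_plus_one = count_selects(conn, per_row_lookup)  # 1 list query + 100 lookups
single = count_selects(conn, joined_lookup)       # 1 joined query
print(n_plus_one, single)
```

In a Django codebase the same measurement is usually taken with the ORM's own query-capturing test utilities; the point of the sketch is only that the query count grows with the row count in the first function and stays constant in the second.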
Priority 2: Unbounded Querysets (CRITICAL)
Impact: Loading entire tables exhausts memory. Large tables cause OOM kills and worker restarts.
Rule: Always paginate list endpoints
```python
# PROBLEM: No pagination - loads all rows
class UserListView(ListView):
    model = User
    template_name = 'users.html'

# SOLUTION: Add pagination
class UserListView(ListView):
    model = User
    template_name = 'users.html'
    paginate_by = 25
```
Rule: Use iterator() for large batch processing
```python
# PROBLEM: Loads all objects into memory at once
for user in User.objects.all():
    process(user)

# SOLUTION: Stream with iterator()
for user in User.objects.iterator(chunk_size=1000):
    process(user)
```
Rule: Never call list() on unbounded querysets
```python
# PROBLEM: Forces full evaluation into memory
all_users = list(User.objects.all())

# SOLUTION: Keep as queryset, slice if needed
users = User.objects.all()[:100]
```
Validation Checklist for Unbounded Querysets
- Table is large (10k+ rows) or will grow unbounded
- No pagination class, paginate_by, or slicing
- This runs on user-facing request (not background job with chunking)
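The memory argument behind these rules is that only one chunk of rows needs to be resident at a time. Below is a small sketch of that idea using the standard-library `sqlite3` cursor and `fetchmany` rather than Django's `iterator()`; the `event` table and the `iter_chunks` helper are hypothetical names chosen for the example.

```python
import sqlite3


def iter_chunks(conn, sql, chunk_size=1000):
    """Yield lists of at most chunk_size rows instead of one giant list."""
    cursor = conn.execute(sql)
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield chunk


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [(i, f"payload{i}") for i in range(2500)],
)
conn.commit()

chunk_sizes = []
total = 0
for chunk in iter_chunks(conn, "SELECT id, payload FROM event", chunk_size=1000):
    chunk_sizes.append(len(chunk))  # never more than chunk_size rows held at once
    total += len(chunk)

print(chunk_sizes, total)
```

The contrast is with `fetchall()` (or `list(queryset)`), where all 2500 rows would be materialized in a single list before processing starts.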
Priority 3: Missing Indexes (HIGH)
Impact: Full table scans. Negligible on small tables, catastrophic on large ones.
Rule: Index fields used in WHERE clauses on large tables
```python
# PROBLEM: Filtering on unindexed field
User.objects.filter(email=email)  # full scan if no index

class User(models.Model):
    email = models.EmailField()  # ← no db_index

# SOLUTION: Add index
class User(models.Model):
    email = models.EmailField(db_index=True)
```
Rule: Index fields used in ORDER BY on large tables
```python
# PROBLEM: Sorting requires full scan without index
Order.objects.order_by('-created')

# SOLUTION: Index the sort field
class Order(models.Model):
    created = models.DateTimeField(db_index=True)
```
Rule: Use composite indexes for common query patterns
```python
class Order(models.Model):
    # on_delete is a required argument for ForeignKey
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    status = models.CharField(max_length=20)
    created = models.DateTimeField()

    class Meta:
        indexes = [
            models.Index(fields=['user', 'status']),      # for filter(user=x, status=y)
            models.Index(fields=['status', '-created']),  # for filter(status=x).order_by('-created')
        ]
```
Validation Checklist for Missing Indexes
- Table has 10k+ rows
- Field is used in filter() or order_by() on hot path
- Checked model - no db_index=True or Meta.indexes entry
- Not a foreign key (already indexed automatically)
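Reading the model is one check; asking the database for its query plan is a stronger one. Django exposes this through `QuerySet.explain()`. The sketch below shows what the before/after plans look like, using the standard-library `sqlite3` module and `EXPLAIN QUERY PLAN` instead of Django so it runs standalone. The table, the index name `idx_user_email`, and the `query_plan` helper are hypothetical, and the exact plan wording varies by SQLite version and by database backend.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the textual description of the step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

lookup = "SELECT * FROM user WHERE email = ?"

# Without an index the planner has to scan the whole table.
before = query_plan(conn, lookup, ("a@example.com",))

# Equivalent in spirit to adding db_index=True and running the migration.
conn.execute("CREATE INDEX idx_user_email ON user (email)")
after = query_plan(conn, lookup, ("a@example.com",))

print("before:", before)
print("after: ", after)
```

If the plan already shows an index search before any change is made, the "missing index" finding is not validated and should be dropped.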
Priority 4: Write Loops (HIGH)
Impact: N database writes instead of 1. Lock contention. Slow requests.
Rule: Use bulk_create instead of create() in loops
```python
# PROBLEM: N inserts, N round trips
for item in items:
    Model.objects.create(name=item['name'])

# SOLUTION: Single bulk insert
Model.objects.bulk_create([
    Model(name=item['name']) for item in items
])
```
Rule: Use update() or bulk_update instead of save() in loops
```python
# PROBLEM: N updates
for obj in queryset:
    obj.status = 'done'
    obj.save()

# SOLUTION A: Single UPDATE statement (same value for all)
queryset.update(status='done')

# SOLUTION B: bulk_update (different values)
for obj in objects:
    obj.status = compute_status(obj)
Model.objects.bulk_update(objects, ['status'], batch_size=500)
```
Rule: Use delete() on queryset, not in loops
```python
# PROBLEM: N deletes
for obj in queryset:
    obj.delete()

# SOLUTION: Single DELETE
queryset.delete()
```
Validation Checklist for Write Loops
- Loop iterates over 100+ items (or unbounded)
- Each iteration calls create(), save(), or delete()
- This runs on user-facing request (not one-time migration script)
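The "N database writes instead of 1" claim can be made concrete by counting the statements each approach sends. This is a rough sketch with the standard-library `sqlite3` module, not the Django ORM; the `task` table and the helpers `make_db`, `count_updates`, `update_in_loop`, and `update_in_bulk` are hypothetical names for the example.

```python
import sqlite3


def make_db(n_rows):
    """Create an in-memory table with n_rows pending tasks."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO task VALUES (?, 'pending')",
        [(i,) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def count_updates(conn, work):
    """Run work(conn), commit, and return how many UPDATE statements ran."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        work(conn)
        conn.commit()
    finally:
        conn.set_trace_callback(None)
    return sum(1 for sql in seen if sql.lstrip().upper().startswith("UPDATE"))


def update_in_loop(conn):
    # Analogous to calling obj.save() for every object in a queryset.
    ids = [row[0] for row in conn.execute("SELECT id FROM task")]
    for task_id in ids:
        conn.execute("UPDATE task SET status = 'done' WHERE id = ?", (task_id,))


def update_in_bulk(conn):
    # Analogous to queryset.update(status='done').
    conn.execute("UPDATE task SET status = 'done'")


loop_statements = count_updates(make_db(200), update_in_loop)
bulk_conn = make_db(200)
bulk_statements = count_updates(bulk_conn, update_in_bulk)
remaining = bulk_conn.execute(
    "SELECT COUNT(*) FROM task WHERE status != 'done'"
).fetchone()[0]
print(loop_statements, bulk_statements, remaining)
```

Both approaches leave every row marked done; the difference is 200 statements versus one, which is what the checklist's "100+ items" threshold is guarding against.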
Priority 5: Inefficient Patterns (LOW)
Rarely worth reporting. Include only as minor notes if you're already reporting real issues.
Pattern: count() vs exists()
```python
# Slightly suboptimal
if queryset.count() > 0:
    do_thing()

# Marginally better
if queryset.exists():
    do_thing()
```
**Usually skip** - difference is <1ms in most cases.
Pattern: len(queryset) vs count()
```python
# Fetches all rows to count
if len(queryset) > 0:  # bad if queryset not yet evaluated
    do_thing()

# Single COUNT query
if queryset.count() > 0:
    do_thing()
```
**Only flag** if queryset is large and not already evaluated.
Pattern: get() in small loops
```python
# N queries, but if N is small (< 20), often fine
for id in ids:
    obj = Model.objects.get(id=id)
```
**Only flag** if loop is large or this is in a very hot path.

---

Validation Requirements
Before reporting ANY issue:
- Trace the data flow - Follow queryset from creation to consumption
- Search for existing optimizations - Grep for select_related, prefetch_related, pagination
- Verify data volume - Check if table is actually large
- Confirm hot path - Trace call sites, verify this runs frequently
- Rule out mitigations - Check for caching, rate limiting
If you cannot validate all steps, do not report.
Output Format
````markdown
## Django Performance Review: [File/Component Name]

### Summary
Validated issues: X (Y Critical, Z High)

### Findings

#### [PERF-001] N+1 Query in UserListView (CRITICAL)
**Location:** `views.py:45`

**Issue:** Related field `profile` accessed in template loop without prefetch.

**Validation:**
- Traced: UserListView → users queryset → user_list.html → `{{ user.profile.bio }}` in loop
- Searched codebase: no select_related('profile') found
- User table: 50k+ rows (verified in admin)
- Hot path: linked from homepage navigation

**Evidence:**
```python
def get_queryset(self):
    return User.objects.filter(active=True)  # no select_related
```

**Fix:**
```python
def get_queryset(self):
    return User.objects.filter(active=True).select_related('profile')
```
````
If no issues found: "No performance issues identified after reviewing [files] and validating [what you checked]."

**Before submitting, sanity check each finding:**
- Does the severity match the actual impact? ("Minor inefficiency" ≠ CRITICAL)
- Is this a real performance issue or just a style preference?
- Would fixing this measurably improve performance?

If the answer to any is "no" - remove the finding.

---

What NOT to Report
- Test files
- Admin-only views
- Management commands
- Migration files
- One-time scripts
- Code behind disabled feature flags
- Tables with <1000 rows that won't grow
- Patterns in cold paths (rarely executed code)
- Micro-optimizations (exists vs count, only/defer without evidence)
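Some of the exclusions above can be recognized from the file path alone, which makes a cheap pre-filter before any deeper tracing. The sketch below is one hypothetical way to do that; the function name `is_out_of_scope` is invented for this example, and the path patterns assume the usual Django project layout (`migrations/`, `management/commands/`, `tests/`, `admin.py`). Items such as feature flags, cold paths, and table size cannot be judged from a path and still need manual research.

```python
from pathlib import PurePosixPath


def is_out_of_scope(path):
    """Return True when a file path matches a 'do not report' category.

    Covers only the path-detectable categories: migrations, tests,
    management commands, and admin modules (assumed conventional names).
    """
    parts = PurePosixPath(path).parts
    name = parts[-1] if parts else ""

    if "migrations" in parts:
        return True
    if "tests" in parts or name == "tests.py" or name.startswith("test_"):
        return True
    # management/commands/<command>.py
    for first, second in zip(parts, parts[1:]):
        if first == "management" and second == "commands":
            return True
    if name == "admin.py":
        return True
    return False


skipped = [
    p
    for p in [
        "shop/migrations/0002_add_index.py",
        "shop/tests/test_views.py",
        "shop/management/commands/backfill.py",
        "shop/admin.py",
        "shop/views.py",
    ]
    if is_out_of_scope(p)
]
print(skipped)
```

Only `shop/views.py` survives the filter here; anything that survives still has to pass the full validation steps before it becomes a finding.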
False Positives to Avoid
**Queryset variable assignment is not an issue:**
```python
# This is FINE - no performance difference
projects_qs = Project.objects.filter(org=org)
projects = list(projects_qs)

# vs this - identical performance
projects = list(Project.objects.filter(org=org))
```
Querysets are lazy. Assigning to a variable doesn't execute anything.

**Single query patterns are not N+1:**
```python
# This is ONE query, not N+1
projects = list(Project.objects.filter(org=org))
```
N+1 requires a loop that triggers additional queries. A single `list()` call is fine.

**Missing select_related on single object fetch is not N+1:**
```python
# This is 2 queries, not N+1 - report as LOW at most
state = AutofixState.objects.filter(pr_id=pr_id).first()
project_id = state.request.project_id  # second query
```
N+1 requires a loop. A single object doing 2 queries instead of 1 can be reported as LOW if relevant, but never as CRITICAL/HIGH.

**Style preferences are not performance issues:**
If your only suggestion is "combine these two lines" or "rename this variable" - that's style, not performance. Don't report it.