ln-651-query-efficiency-auditor
Query Efficiency Auditor (L3 Worker)
Specialized worker auditing database query patterns for redundancy, inefficiency, and misuse.
Purpose & Scope
- Worker in ln-650 coordinator pipeline - invoked by ln-650-persistence-performance-auditor
- Audit query efficiency (Priority: HIGH)
- Check redundant fetches, batch operation misuse, caching scope problems
- Return structured findings with severity, location, effort, recommendations
- Calculate compliance score (X/10) for Query Efficiency category
Inputs (from Coordinator)
MANDATORY READ: Load `shared/references/task_delegation_pattern.md#audit-coordinator--worker-contract` for contextStore structure.
Receives `contextStore` with: `tech_stack`, `best_practices`, `db_config` (database type, ORM settings), `codebase_root`.
Domain-aware: Supports `domain_mode` + `current_domain`.
Workflow
1. Parse context from contextStore
   - Extract tech_stack, best_practices, db_config
   - Determine scan_path (same logic as ln-624)
2. Scan codebase for violations
   - All Grep/Glob patterns use scan_path
   - Trace call chains for redundant fetches (requires reading caller + callee)
3. Collect findings with severity, location, effort, recommendation
4. Calculate score using penalty algorithm
5. Return JSON result to coordinator
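The workflow above can be sketched as a minimal worker skeleton. The `contextStore` keys and the domain-path convention below are assumptions inferred from the contract described in this document, not the real ln-624 implementation; the score is a placeholder for the formula in `shared/references/audit_scoring.md`.

```python
# Minimal sketch of the worker flow; contextStore keys and the
# domain-path convention are assumptions, not the real contract.
from typing import Any


def determine_scan_path(ctx: dict[str, Any]) -> str:
    # Domain-aware: narrow the scan to the current domain when domain_mode
    # is active, otherwise fall back to the codebase root.
    if ctx.get("domain_mode") and ctx.get("current_domain"):
        return f'{ctx["codebase_root"]}/{ctx["current_domain"]}'
    return ctx["codebase_root"]


def run_audit(ctx: dict[str, Any]) -> dict[str, Any]:
    scan_path = determine_scan_path(ctx)
    findings: list[dict[str, Any]] = []  # populated by the six checks below
    # ... scan scan_path, trace call chains, collect findings ...
    score = 10  # placeholder; real penalty formula lives in audit_scoring.md
    return {
        "category": "Query Efficiency",
        "score": score,
        "total_issues": len(findings),
        "findings": findings,
    }
```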
Audit Rules (Priority: HIGH)
1. Redundant Entity Fetch
What: Same entity fetched from DB twice in a call chain
Detection:
- Find function A that calls `repo.get(id)` or `session.get(Model, id)`, then passes `id` (not the object) to function B
- Function B also calls `repo.get(id)` or `session.get(Model, id)` for the same entity
- Common pattern: `acquire_next_pending()` returns job, but `_process_job(job_id)` re-fetches it
Detection patterns (Python/SQLAlchemy):
- Grep for `repo.*get_by_id|session\.get\(|session\.query.*filter.*id` in service/handler files
- Trace: if a function receives `entity_id: int/UUID` and internally does `repo.get(entity_id)`, check if the caller already has the entity object
- Check `expire_on_commit` setting: if `False`, objects remain valid after commit
Severity:
- HIGH: Redundant fetch in API request handler (adds latency per request)
- MEDIUM: Redundant fetch in background job (less critical)
Recommendation: Pass entity object instead of ID, or remove second fetch when `expire_on_commit=False`
Effort: S (change signature to accept object instead of ID)
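A minimal illustration of the pattern this rule flags, using a hypothetical in-memory repository (not SQLAlchemy) so the double fetch is countable; `JobRepo` and the function names are illustrative:

```python
# Hypothetical repo; each get() stands in for a DB round-trip.
class JobRepo:
    def __init__(self):
        self.jobs = {1: {"id": 1, "status": "pending"}}
        self.fetch_count = 0

    def get(self, job_id):
        self.fetch_count += 1
        return self.jobs[job_id]


def process_by_id(repo, job_id):
    # Flagged: re-fetches an entity the caller already held
    job = repo.get(job_id)
    job["status"] = "done"


def process(job):
    # Recommended: accept the entity object itself
    job["status"] = "done"


repo = JobRepo()
job = repo.get(1)            # acquire_next_pending() analogue
process_by_id(repo, job["id"])   # second fetch: redundant (2 round-trips)

repo2 = JobRepo()
job2 = repo2.get(1)
process(job2)                # fixed: single round-trip
```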
2. N-UPDATE/DELETE Loop
What: Loop of individual UPDATE/DELETE operations instead of a single batch query
Detection:
- Pattern: `for item in items: await repo.update(item.id, ...)` or `for item in items: await repo.delete(item.id)`
- Pattern: `for item in items: session.execute(update(Model).where(...))`
Detection patterns:
- Grep for `for .* in .*:` followed within 1-3 lines by `repo\.(update|delete|reset|save|mark_)`
- Grep for `for .* in .*:` followed within 1-3 lines by `session\.execute\(.*update\(`
Severity:
- HIGH: Loop over >10 items (N separate round-trips to DB)
- MEDIUM: Loop over <=10 items
Recommendation: Replace with a single `UPDATE ... WHERE id IN (...)` or `session.execute(update(Model).where(Model.id.in_(ids)))`
Effort: M (rewrite query + test)
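The flagged loop versus the recommended single batch UPDATE can be sketched with stdlib `sqlite3`; table and column names are illustrative:

```python
# Batch UPDATE ... WHERE id IN (...) vs. an N-UPDATE loop, using sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO jobs VALUES (?, ?)",
                 [(i, "pending") for i in range(1, 21)])

ids = list(range(1, 21))

# Flagged pattern: N separate round-trips
# for job_id in ids:
#     conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))

# Recommended: one UPDATE ... WHERE id IN (...)
placeholders = ",".join("?" * len(ids))
conn.execute(
    f"UPDATE jobs SET status = 'done' WHERE id IN ({placeholders})", ids
)

done = conn.execute(
    "SELECT COUNT(*) FROM jobs WHERE status = 'done'"
).fetchone()[0]
```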
3. Unnecessary Resolve
What: Re-resolving a value from DB when it is already available in the caller's scope
Detection:
- Method receives `profile_id` and resolves `engine` from it, but caller already determined `engine`
- Method receives `lang_code` and looks up `dialect_id`, but caller already has both `lang` and `dialect`
- Pattern: function receives `X_id`, does `get(X_id)`, extracts `.field`, when caller already has `field`
Severity:
- MEDIUM: Extra DB query per invocation, especially in high-frequency paths
Recommendation: Split method into two variants: `with_known_value(value, ...)` and `resolving_value(id, ...)`; or pass resolved value directly
Effort: S-M (refactor signature, update callers)
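The two-variant split can be sketched as follows; `ProfileRepo`, `Synthesizer`, and the engine lookup are hypothetical stand-ins for the `profile_id`/`engine` case described above:

```python
# Hypothetical repo; query_count counts DB round-trips.
class ProfileRepo:
    def __init__(self):
        self.profiles = {7: {"id": 7, "engine": "neural-v2"}}
        self.query_count = 0

    def get(self, profile_id):
        self.query_count += 1
        return self.profiles[profile_id]


class Synthesizer:
    def __init__(self, repo):
        self.repo = repo

    def speak_resolving_engine(self, profile_id, text):
        # Variant for callers that only hold the ID (costs one query)
        engine = self.repo.get(profile_id)["engine"]
        return self.speak_with_known_engine(engine, text)

    def speak_with_known_engine(self, engine, text):
        # Variant for callers that already resolved the engine (no query)
        return f"[{engine}] {text}"


repo = ProfileRepo()
synth = Synthesizer(repo)
profile = repo.get(7)  # caller already resolved the profile
out = synth.speak_with_known_engine(profile["engine"], "hi")  # no re-resolve
```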
4. Over-Fetching
What: Loading full ORM model when only a few fields are needed
Detection:
- `session.query(Model)` or `select(Model)` without `.options(load_only(...))` for models with >10 columns
- Especially in list/search endpoints that return many rows
- Pattern: loading full entity but only using 2-3 fields
Severity:
- MEDIUM: Large models (>15 columns) in list endpoints
- LOW: Small models (<10 columns) or single-entity endpoints
Recommendation: Use `load_only()`, `defer()`, or raw `select(Model.col1, Model.col2)` for list queries
Effort: S (add load_only to query)
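A minimal stdlib `sqlite3` sketch of the same idea: selecting only the columns a list endpoint renders, analogous to SQLAlchemy's `load_only()`; the schema is illustrative:

```python
# SELECT * (over-fetching) vs. selecting only the needed columns.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE users (
    id INTEGER PRIMARY KEY, name TEXT, email TEXT,
    bio TEXT, avatar BLOB, settings TEXT)""")
conn.execute(
    "INSERT INTO users VALUES (1, 'Ada', 'ada@example.com', 'long bio', x'00', '{}')"
)

# Flagged: SELECT * pulls every column, including large bio/avatar payloads
wide_row = conn.execute("SELECT * FROM users").fetchone()

# Recommended: fetch only what the list view actually renders
narrow_row = conn.execute("SELECT id, name FROM users").fetchone()
```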
5. Missing Bulk Operations
What: Sequential INSERT/DELETE/UPDATE instead of bulk operations
Detection:
- `for item in items: session.add(item)` instead of `session.add_all(items)`
- `for item in items: session.delete(item)` instead of bulk delete
- Pattern: loop with a single `INSERT` per iteration
Severity:
- MEDIUM: Any sequential add/delete in loop (missed batch optimization)
Recommendation: Use `session.add_all()`, `session.execute(insert(Model).values(list_of_dicts))`, `bulk_save_objects()`
Effort: S (replace loop with bulk call)
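With stdlib `sqlite3`, `executemany` plays the role of `session.add_all()` / a bulk `INSERT ... VALUES`; the schema is illustrative:

```python
# Bulk insert via executemany instead of one INSERT per loop iteration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT)")

rows = [(i, "click") for i in range(1, 101)]

# Flagged pattern: one INSERT per iteration
# for row in rows:
#     conn.execute("INSERT INTO events VALUES (?, ?)", row)

# Recommended: a single bulk call
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```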
6. Wrong Caching Scope
What: Request-scoped cache for data that rarely changes (should be app-scoped)
Detection:
- Service registered as request-scoped (e.g., via FastAPI `Depends()`) with internal cache (`_cache` dict, `_loaded` flag)
- Cache populated by expensive query (JOINs, aggregations) on each request
- Data TTL >> request duration (e.g., engine configurations, language lists, feature flags)
Detection patterns:
- Find classes with `_cache`, `_loaded`, `_initialized` attributes
- Check if class is created per-request (via DI registration scope)
- Compare: data change frequency vs cache lifetime
Severity:
- HIGH: Expensive query (JOINs, subqueries) cached only per-request
- MEDIUM: Simple query cached per-request
Recommendation: Move cache to app-scoped service (singleton), add TTL-based invalidation, or use CacheService with configurable TTL
Effort: M (change DI scope, add TTL logic)
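A minimal sketch of the recommended app-scoped cache with TTL-based invalidation; the loader and TTL value are illustrative, and real code would register the instance once at app startup (singleton scope) rather than per request:

```python
# App-scoped cache: the expensive loader runs at most once per TTL window,
# instead of once per request.
import time


class AppScopedCache:
    def __init__(self, loader, ttl_seconds=300.0):
        self._loader = loader      # stands in for the expensive query
        self._ttl = ttl_seconds
        self._value = None
        self._loaded_at = 0.0

    def get(self):
        now = time.monotonic()
        if self._value is None or now - self._loaded_at > self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value


calls = {"n": 0}


def load_engine_configs():
    # Stands in for a JOIN-heavy configuration query
    calls["n"] += 1
    return {"engine": "neural-v2"}


cache = AppScopedCache(load_engine_configs, ttl_seconds=300.0)
a = cache.get()  # first access: loads from "DB"
b = cache.get()  # subsequent accesses within TTL: served from cache
```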
Scoring Algorithm
See `shared/references/audit_scoring.md` for the unified formula and score interpretation.
Output Format
Return JSON to coordinator:
```json
{
  "category": "Query Efficiency",
  "score": 6,
  "total_issues": 8,
  "critical": 0,
  "high": 3,
  "medium": 4,
  "low": 1,
  "findings": [
    {
      "severity": "HIGH",
      "location": "app/infrastructure/messaging/job_processor.py:434",
      "issue": "Redundant entity fetch: job re-fetched by ID after acquire_next_pending already returned it",
      "principle": "Query Efficiency / DRY Data Access",
      "recommendation": "Pass job object to _process_job instead of job_id",
      "effort": "S"
    }
  ]
}
```
Critical Rules
- Do not auto-fix: Report only
- Trace call chains: Rules 1 and 3 require reading both caller and callee
- ORM-aware: Check `expire_on_commit`, `autoflush`, session scope before flagging redundant fetches
- Context-aware: Small datasets or infrequent operations may justify simpler code
- Exclude tests: Do not flag test fixtures or setup code
Definition of Done
- contextStore parsed (tech_stack, db_config, ORM settings)
- scan_path determined (domain path or codebase root)
- All 6 checks completed:
- redundant fetch, N-UPDATE loop, unnecessary resolve, over-fetching, bulk ops, caching scope
- Findings collected with severity, location, effort, recommendation
- Score calculated
- JSON returned to coordinator
Version: 1.0.0
Last Updated: 2026-02-04