slop-detector
AI Slop Detection
AI slop is identified by patterns of usage rather than individual words. While a single "delve" might be acceptable, its proximity to markers like "tapestry" or "embark" signals generated text. We analyze the density of these markers per 100 words, their clustering, and whether the overall tone fits the document type.
Execution Workflow
Start by identifying target files and classifying them as technical docs, narrative prose, or code comments. This allows for context-aware scoring during analysis.
Vocabulary and Phrase Detection
Load: `@modules/vocabulary-patterns.md`

We categorize markers into three tiers based on confidence. Tier 1 words appear far more often in AI text than in human writing and include "delve," "multifaceted," and "leveraging." Tier 2 covers context-dependent transitions like "moreover" or "subsequently," while Tier 3 identifies vapid phrases such as "In today's fast-paced world" or "cannot be overstated."

Tier 1: High-Confidence Markers (Score: 3 each)
| Word | Context | Human Alternative |
|---|---|---|
| delve | "delve into" | explore, examine, look at |
| tapestry | "rich tapestry" | mix, combination, variety |
| realm | "in the realm of" | in, within, regarding |
| embark | "embark on a journey" | start, begin |
| beacon | "a beacon of" | example, model |
| spearheaded | formal attribution | led, started |
| multifaceted | describing complexity | complex, varied |
| comprehensive | describing scope | thorough, complete |
| pivotal | importance marker | key, important |
| nuanced | sophistication signal | subtle, detailed |
| meticulous/meticulously | care marker | careful, detailed |
| intricate | complexity marker | detailed, complex |
| showcasing | display verb | showing, displaying |
| leveraging | business jargon | using |
| streamline | optimization verb | simplify, improve |
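
The Tier 1 lookup above can be sketched as a whole-word scan that reports the line number and a suggested replacement. This is a minimal sketch, not the detector's actual implementation: `TIER1_ALTERNATIVES` and `find_tier1` are hypothetical names, the dictionary holds only a subset of the table, and matching is exact (inflections such as "delves" are not caught).

```python
import re

# Illustrative subset of the Tier 1 table (assumption: the full list would be
# loaded from the vocabulary module rather than hard-coded).
TIER1_ALTERNATIVES = {
    "delve": "explore",
    "tapestry": "variety",
    "realm": "in",
    "embark": "start",
    "multifaceted": "complex",
    "leveraging": "using",
}


def find_tier1(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, word, suggestion) for each whole-word Tier 1 hit."""
    pattern = re.compile(
        r"\b(" + "|".join(TIER1_ALTERNATIVES) + r")\b", re.IGNORECASE
    )
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            word = match.group(1).lower()
            hits.append((line_number, word, TIER1_ALTERNATIVES[word]))
    return hits


sample = "We delve into the realm of caching.\nLeveraging a queue helps."
for hit in find_tier1(sample):
    print(hit)
```

Each hit counts 3 points toward the density score, matching the `tier1_count * 3` term in the formula.
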
Tier 2: Medium-Confidence Markers (Score: 2 each)
Common but context-dependent:
| Category | Words |
|---|---|
| Transition overuse | moreover, furthermore, indeed, notably, subsequently |
| Intensity clustering | significantly, substantially, fundamentally, profoundly |
| Hedging stacks | potentially, typically, often, might, perhaps |
| Action inflation | revolutionize, transform, unlock, unleash, elevate |
| Empty emphasis | crucial, vital, essential, paramount |
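
Because Tier 2 words are context-dependent, one way to read "clustering" and "stacks" is to flag a sentence only when several words from the same category appear together. The sketch below takes that reading; the threshold of 2, the two-category subset, and the names `TIER2` and `stacked_categories` are all assumptions for illustration.

```python
import re

# Two of the five Tier 2 categories from the table, for brevity.
TIER2 = {
    "transition": {"moreover", "furthermore", "indeed", "notably", "subsequently"},
    "hedging": {"potentially", "typically", "often", "might", "perhaps"},
}


def stacked_categories(sentence: str, threshold: int = 2) -> dict[str, int]:
    """Return categories whose word count in the sentence meets the threshold."""
    words = re.findall(r"[a-z]+", sentence.lower())
    counts = {
        name: sum(word in vocab for word in words) for name, vocab in TIER2.items()
    }
    return {name: count for name, count in counts.items() if count >= threshold}


print(stacked_categories("This might perhaps often help."))
print(stacked_categories("Moreover, it works."))
```

A single "moreover" passes untouched, while three hedges in one sentence are reported as a stack.
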
Tier 3: Phrase Patterns (Score: 2-4 each)
| Phrase | Score | Issue |
|---|---|---|
| "In today's fast-paced world" | 4 | Vapid opener |
| "It's worth noting that" | 3 | Filler |
| "At its core" | 2 | Positional crutch |
| "Cannot be overstated" | 3 | Empty emphasis |
| "A testament to" | 3 | Attribution cliche |
| "Navigate the complexities" | 4 | Business speak |
| "Unlock the potential" | 4 | Marketing speak |
| "Treasure trove of" | 3 | Overused metaphor |
| "Game changer" | 3 | Buzzword |
| "Look no further" | 4 | Sales pitch |
| "Nestled in the heart of" | 4 | Travel writing cliche |
| "Embark on a journey" | 4 | Melodrama |
| "Ever-evolving landscape" | 4 | Tech cliche |
| "Hustle and bustle" | 3 | Filler |
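
Phrase patterns carry individual scores, so a scan needs to return both the number of hits and their summed score. A minimal sketch, assuming case-insensitive substring matching is good enough; `PHRASE_SCORES` and `score_phrases` are hypothetical names and only four rows of the table are included.

```python
# Subset of the Tier 3 table: phrase (lowercased) -> score.
PHRASE_SCORES = {
    "in today's fast-paced world": 4,
    "it's worth noting that": 3,
    "at its core": 2,
    "cannot be overstated": 3,
}


def score_phrases(text: str) -> tuple[int, int]:
    """Return (phrase_count, total_score) over all known phrases."""
    lowered = text.lower()
    phrase_count = 0
    total_score = 0
    for phrase, score in PHRASE_SCORES.items():
        occurrences = lowered.count(phrase)
        phrase_count += occurrences
        total_score += occurrences * score
    return phrase_count, total_score


sample = "In today's fast-paced world, the value of tests cannot be overstated."
print(score_phrases(sample))
```

Note that text using typographic apostrophes would need normalizing before this comparison.
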
Step 3: Structural Pattern Detection
Load: `@modules/structural-patterns.md`

Em Dash Overuse
Count em dashes (—) per 1000 words:
- 0-2: Normal human range
- 3-5: Elevated, review usage
- 6+: Strong AI signal
```bash
# Count em dashes in file
grep -o '—' file.md | wc -l
```
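
The raw count still has to be normalized per 1000 words before it can be compared with the bands listed earlier. The following is a sketch under two assumptions: words are whitespace-separated tokens, and fractional densities between the listed bands fall into the lower band. `em_dash_density` and `classify_em_dash_density` are hypothetical helper names.

```python
def em_dash_density(text: str) -> float:
    """Em dashes per 1000 whitespace-separated words."""
    word_count = len(text.split())
    if word_count == 0:
        return 0.0
    return text.count("\u2014") * 1000 / word_count


def classify_em_dash_density(density: float) -> str:
    """Map a density onto the three bands (boundary handling is an assumption)."""
    if density < 3:
        return "normal"
    if density < 6:
        return "elevated"
    return "strong AI signal"


# 500 words containing 2 em dashes
sample = "word\u2014word " * 2 + "word " * 498
density = em_dash_density(sample)
print(density, classify_em_dash_density(density))
```
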
Tricolon Detection
AI loves groups of three with alliteration:
- "fast, efficient, and reliable"
- "clear, concise, and compelling"
- "robust, reliable, and resilient"
Pattern: `adjective, adjective, and adjective` with similar sounds.
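
A rough way to find these groups is a regular expression for "X, Y, and Z" followed by a check on initial letters. This sketch does not verify that the words are adjectives, and it treats two or more shared initials as alliteration; both are simplifying assumptions, and `find_tricolons` is a hypothetical name.

```python
import re

# Three single words separated by commas, with an optional Oxford comma.
TRICOLON = re.compile(r"\b(\w+), (\w+),? and (\w+)\b")


def find_tricolons(text: str) -> list[tuple[str, bool]]:
    """Return (matched phrase, alliterative?) for each 'X, Y, and Z' group."""
    results = []
    for match in TRICOLON.finditer(text):
        initials = [word[0].lower() for word in match.groups()]
        # Alliterative when at least two of the three words share a first letter
        alliterative = len(set(initials)) < 3
        results.append((match.group(0), alliterative))
    return results


print(find_tricolons("The API is clear, concise, and compelling."))
print(find_tricolons("The service is fast, efficient, and reliable."))
```

The second example is still reported as a tricolon but not as alliterative, so a caller can weight the two cases differently.
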
List-to-Prose Ratio
Count bullet points vs paragraph sentences:
- >60% bullets: AI tendency
- Emoji-led bullets: Strong AI signal in technical docs
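
The text does not say what the 60% is measured against, so the sketch below assumes the ratio is bullets divided by bullets plus prose sentences. `bullet_ratio` is a hypothetical helper, and emoji-led bullets are not detected here.

```python
import re

# Markdown bullet or numbered-list marker at the start of a line.
BULLET = re.compile(r"^\s*([-*+]|\d+\.)\s+")


def bullet_ratio(markdown: str) -> float:
    """Bullets / (bullets + prose sentences); the denominator is an assumption."""
    bullets = 0
    sentences = 0
    for line in markdown.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if BULLET.match(line):
            bullets += 1
        else:
            # Count sentence-ending punctuation; a line with none counts as one
            sentences += max(1, len(re.findall(r"[.!?](?:\s|$)", stripped)))
    total = bullets + sentences
    return bullets / total if total else 0.0


sample = "Intro sentence. Another one.\n- a\n- b\n- c"
print(bullet_ratio(sample))
```
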
Sentence Length Uniformity
Measure standard deviation of sentence lengths:
- Low variance (SD < 5 words): AI monotony
- High variance (SD > 10 words): Human variation
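
The measurement can be sketched with the standard library. Assumptions: sentences are split on terminal punctuation followed by whitespace, length is counted in whitespace-separated words, and the sample standard deviation is used (the text does not say which variant is intended). `sentence_length_sd` is a hypothetical name.

```python
import re
import statistics


def sentence_length_sd(text: str) -> float:
    """Sample standard deviation of sentence lengths, measured in words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(sentence.split()) for sentence in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)


uniform = "One two three four. Five six seven eight. Nine ten eleven twelve."
varied = "Short. This sentence has quite a few more words in it than the first."
print(sentence_length_sd(uniform))
print(sentence_length_sd(varied))
```

Values between 5 and 10 are not classified by the thresholds above, so a caller would have to treat them as inconclusive.
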
Paragraph Symmetry
AI produces "blocky" text with uniform paragraph lengths. Check if paragraphs cluster around the same word count.
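
No metric or threshold is given for this check. One possible measure is the coefficient of variation of paragraph word counts, where values near zero mean the paragraphs are nearly the same length. The sketch assumes paragraphs are separated by blank lines; `paragraph_length_cv` is a hypothetical name and any cutoff is left to the caller.

```python
import statistics


def paragraph_length_cv(text: str) -> float:
    """Coefficient of variation (population SD / mean) of paragraph word counts."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    lengths = [len(p.split()) for p in paragraphs]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


uniform = "\n\n".join(" ".join(["word"] * 50) for _ in range(3))
varied = " ".join(["word"] * 10) + "\n\n" + " ".join(["word"] * 90)
print(paragraph_length_cv(uniform))
print(paragraph_length_cv(varied))
```
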
Step 4: Sycophantic Pattern Detection
Especially relevant for conversational or instructional content:
| Phrase | Issue |
|---|---|
| "I'd be happy to" | Servile opener |
| "Great question!" | Empty validation |
| "Absolutely!" | Over-agreement |
| "That's a wonderful point" | Flattery |
| "I'm glad you asked" | Filler |
| "You're absolutely right" | Sycophancy |
These phrases add no information and signal generated content.
Step 5: Calculate Slop Density Score
`slop_score = (tier1_count * 3 + tier2_count * 2 + phrase_count * avg_phrase_score) / word_count * 100`

| Score | Rating | Action |
|---|---|---|
| 0-1.0 | Clean | No action needed |
| 1.0-2.5 | Light | Spot remediation |
| 2.5-5.0 | Moderate | Section rewrite recommended |
| 5.0+ | Heavy | Full document review |
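
A worked example of the formula and rating table may help. In this sketch the per-phrase scores are summed directly, which equals `phrase_count * avg_phrase_score`. The table's ranges share their boundary values (1.0, 2.5, 5.0), so assigning a boundary score to the higher band is an assumption, as are the function names.

```python
def slop_score(
    tier1_count: int, tier2_count: int, phrase_scores: list[int], word_count: int
) -> float:
    """Weighted marker points per 100 words."""
    if word_count == 0:
        return 0.0
    phrase_total = sum(phrase_scores)  # same as phrase_count * avg_phrase_score
    weighted = tier1_count * 3 + tier2_count * 2 + phrase_total
    return weighted * 100 / word_count


def rating(score: float) -> str:
    """Map a score to the rating table (boundaries go to the higher band)."""
    if score < 1.0:
        return "Clean"
    if score < 2.5:
        return "Light"
    if score < 5.0:
        return "Moderate"
    return "Heavy"


# 2 Tier 1 words, 3 Tier 2 words, two phrases scored 4 and 3, in 500 words:
# (2*3 + 3*2 + 7) * 100 / 500 = 3.8
score = slop_score(2, 3, [4, 3], 500)
print(score, rating(score))
```
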
Step 6: Generate Report
Output format:

```markdown
Slop Detection Report: [filename]

Overall Score: X.X / 10 (Rating)
Word Count: N words
Markers Found: N total

High-Confidence Markers
- Line 23: "delve into" -> consider: "explore"
- Line 45: "rich tapestry" -> consider: "variety"

Structural Issues
- Em dash density: 8/1000 words (HIGH)
- Bullet ratio: 72% (ELEVATED)
- Sentence length SD: 3.2 words (LOW VARIANCE)

Phrase Patterns
- Line 12: "In today's fast-paced world" (vapid opener)
- Line 89: "cannot be overstated" (empty emphasis)

Recommendations
- Replace [specific word] with [alternative]
- Convert bullet list at lines 34-56 to prose
- Vary sentence structure in paragraphs 3-5
```

Module Reference
- See `modules/fiction-patterns.md` for narrative-specific slop markers
- See `modules/remediation-strategies.md` for fix recommendations
Integration with Remediation
After detection, invoke `Skill(scribe:doc-generator)` with the `--remediate` flag to apply fixes, or manually edit using the report as a guide.

Exit Criteria
- All target files scanned
- Density scores calculated
- Report generated with actionable recommendations
- High-severity items flagged for immediate attention