Humanize: AI Pattern Detection and Removal
Remove AI-generated writing patterns from text. Produce natural, human-sounding output that preserves meaning.
This is not a generic rewriter. It targets specific, documented AI-writing patterns catalogued by Wikipedia's WikiProject AI Cleanup from thousands of observed instances.
Workflow
Five phases. Each phase has a clear input, transformation, and output. Do not skip phases.
Phase 1: Detection Scan
Read the input text. Load `references/detection-patterns.md`. Scan for two categories of signals:
A. Lexical patterns (the 24 catalogued AI-writing patterns):
| Category | Patterns | Priority |
|---|---|---|
| Content inflation | Significance puffing, notability claims, superficial -ing analyses, promotional language, vague attributions, formulaic challenges sections | HIGH — loudest AI tells |
| Vocabulary | AI-frequency words, copula avoidance, filler phrases, excessive hedging | HIGH — statistically detectable |
| Structure | Rule of three, negative parallelisms, elegant variation, false ranges, inline-header lists | MEDIUM — structural fingerprints |
| Style | Em dash overuse, boldface overuse, title case headings, emoji decoration, curly quotes | MEDIUM — formatting tells |
| Communication | Chatbot artifacts, knowledge-cutoff disclaimers, sycophantic tone, generic conclusions | LOW — obvious, usually caught by author |
B. Statistical regularity signals (see `references/statistical-signals.md`):
| Signal | What to look for |
|---|---|
| Sentence length uniformity | Sentences clustering within a narrow word-count range |
| Low clause density variation | Every sentence has the same number of clauses |
| Flat information density | Every sentence carries roughly the same amount of detail |
| High-frequency phrase templates | Stock collocations and common bigrams/trigrams dominating the text |
| Excessive transition markers | Formal connectives appearing more than 8 per 1,000 words |
| Structural symmetry | Paragraphs and sentences following balanced, mirror-like patterns |
| Uniform inter-sentence cohesion | Every sentence tightly follows the previous with no topic shifts or digressions |
| Generic function word usage | Connectors and prepositions used in textbook-standard distribution with no personal tendencies |
Output a detection report using the detection report template (see Output Format).
Instance severity rating:
| Severity | Criteria |
|---|---|
| HIGH | 3+ patterns co-occurring in a single paragraph, or any paragraph saturated with AI vocabulary (5+ signal words) |
| MEDIUM | 1-2 patterns in a paragraph, or a statistical signal present across 3+ consecutive sentences |
| LOW | Isolated single instance of any pattern, or a borderline statistical signal |
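The statistical half of this scan can be approximated mechanically. The sketch below is an illustration, not part of the skill's reference files: it computes two of the signals above, sentence-length uniformity and transition-marker frequency per 1,000 words. The transition list and the naive sentence splitter are placeholder assumptions; real detection would use the catalogued lists and a proper tokenizer.

```python
import re
import statistics

# Illustrative subset of formal connectives; the real list lives in the reference files.
TRANSITIONS = {"moreover", "furthermore", "additionally", "consequently",
               "therefore", "thus", "however", "nevertheless"}

def sentence_lengths(text):
    # Naive split on terminal punctuation; good enough for a first-pass scan.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def length_uniformity(text):
    """Standard deviation of sentence lengths; a low SD flags uniformity."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return None
    return statistics.stdev(lengths)

def transitions_per_1000(text):
    """Transition-marker rate per 1,000 words; above 8 is a flag."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in TRANSITIONS)
    return 1000 * hits / len(words)
```

A text where every sentence opens with a connective and runs the same length trips both checks at once, which is exactly the HIGH-severity co-occurrence case.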
Phase 2: Structural Rewrite
Transform document structure to break AI-typical organization:
- Convert uniform paragraph lengths to varied blocks
- Merge or split sentences to break rhythmic uniformity
- Reorder clauses where meaning permits
- Convert formulaic list structures to narrative where appropriate
- Remove tripartite constructions unless the content genuinely has three parts
Do not change factual content. Do not add information. Do not remove cited sources, data, or technical terms.
Phase 3: Vocabulary and Style Pass
Apply pattern-specific rewrites from the detection report:
- Replace AI-frequency vocabulary with natural alternatives
- Restore simple copulas (is/are/has) where the text uses elaborate substitutes
- Remove filler phrases and excessive hedging
- Cut promotional language and significance inflation
- Replace vague attributions with specific ones (or remove if no source exists)
Load the appropriate style profile from `references/style-guide.md` based on the target domain. Apply domain-specific voice calibration.
Phase 4: Entropy and Variation
Human writing has burstiness — irregular rhythm, varied sentence lengths, uneven information density. AI text is statistically smooth. This phase breaks that smoothness.
Load `references/statistical-signals.md` for target ranges. Apply:
- Sentence length variance: mix short declarative with longer explanatory. Target visible variance across any 5-sentence window.
- Clause density variation: alternate simple sentences (one clause) with compound/complex (2-3 clauses). Do not settle on a uniform clause count.
- Information density variation: let some sentences carry heavy detail while others are light — a summary statement, a reaction, a pivot. Uniform density reads as generated.
- Phrase template breaking: replace stock collocations with specific phrasings. "Play a role in" -> name the specific action. "In terms of" -> delete or restructure.
- Inter-sentence cohesion variation: not every sentence should tightly follow the previous. Allow small topic expansions, brief asides, or contextual jumps that a thinking human would make.
- Function word personalization: vary connector usage. Use "but" in one place, "still" in another, nothing in a third. Do not default to the same conjunction pattern throughout.
- Paragraph length variance: mix single-sentence paragraphs with 4-5 sentence blocks.
- Controlled imperfection: fragments at impact positions, parenthetical asides, concessive turns. Sparingly — seasoning, not structure.
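The 5-sentence-window target above can be verified directly: slide a window over the rewrite and check the spread of sentence lengths inside each one. A minimal sketch, assuming a placeholder minimum spread of 6 words (the actual target ranges come from `references/statistical-signals.md`):

```python
import re

def window_spreads(text, window=5):
    """Return (max - min) sentence length inside each sliding window."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return [max(lengths[i:i + window]) - min(lengths[i:i + window])
            for i in range(len(lengths) - window + 1)]

def passes_variance_target(text, window=5, min_spread=6):
    """True when every window shows visible burstiness."""
    spreads = window_spreads(text, window)
    return bool(spreads) and all(s >= min_spread for s in spreads)
```

A rewrite that alternates one-word fragments with long explanatory sentences passes; five sentences of identical length fail, no matter how varied the vocabulary is.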
Phase 5: Validation and Output
Two checks before delivering:
Semantic check: Compare rewrite against original. Every factual claim, data point, argument, and technical term in the original must be present in the rewrite. If anything was lost, restore it.
Self-audit: Ask internally: "What still sounds AI-generated about this text?" If residual patterns remain, fix them. One pass only — do not loop indefinitely.
Output the final text followed by a brief changes summary.
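The semantic check lends itself to a mechanical first pass before the claim-by-claim comparison: extract the "must survive" tokens (numbers, percentages, URLs) from the original and confirm each still appears in the rewrite. A rough sketch; the token pattern is illustrative and does not replace the full comparison of arguments and technical terms:

```python
import re

# Hard tokens that must survive a rewrite: URLs, then numbers with
# optional thousands separators, decimals, or a trailing percent sign.
HARD_TOKENS = re.compile(r"https?://\S+|\d+(?:[.,]\d+)*%?")

def lost_tokens(original, rewrite):
    """Tokens present in the original but missing from the rewrite."""
    return [t for t in HARD_TOKENS.findall(original) if t not in rewrite]
```

An empty result does not prove semantic fidelity, but a non-empty one pinpoints exactly which data point Phase 2 or 3 dropped and must be restored.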
Output Format
Full Rewrite / Targeted Fix / Style Shift
[Humanized text]
---
Changes: [2-4 bullet summary of what was changed and why]
Patterns detected: [list of pattern numbers/names found]
Domain: [detected or specified domain]

For short texts (under 100 words), skip the changes summary unless the user requests it.
Detection Only
Detection Report
Domain: [detected or specified]
Overall severity: [HIGH / MEDIUM / LOW]
Patterns found: [count]
Findings
| Location | Pattern | Severity | Evidence |
|---|---|---|---|
| Para 1 | #7 AI vocabulary | HIGH | "delve", "intricate", "pivotal" in same sentence |
| Para 2 | #8 Copula avoidance | MEDIUM | "serves as" instead of "is" |
| Para 1-4 | Sentence length uniformity | MEDIUM | All sentences 18-22 words, SD < 3 |
| ... | ... | ... | ... |
Statistical Signals
| Signal | Status | Detail |
|---|---|---|
| Sentence length variance | FLAG | SD ~3 words (human typical: 7-15) |
| Transition frequency | OK | 5 per 1,000 words |
| ... | ... | ... |
Summary
[1-2 sentences: overall assessment and highest-priority patterns to fix first]
Reference Files
| File | Purpose | Load When |
|---|---|---|
| `references/detection-patterns.md` | 24 AI-writing patterns with examples | Always (Phase 1) |
| `references/statistical-signals.md` | 12 statistical regularity signals with target ranges | Phase 1 (scan) and Phase 4 (targets) |
| `references/style-guide.md` | Domain-specific voice profiles and calibration rules | Phase 3 (match to domain) |
| | Structural rewrite strategies and entropy techniques | Phase 2 and Phase 4 |
| | Before/after pairs for academic writing | When domain is academic |
| | Before/after pairs for blog/casual writing | When domain is blog or social |
| | Before/after pairs for professional/business writing | When domain is professional |
Domain Detection
If the user does not specify a domain, infer from:
- Vocabulary density and jargon type
- Citation patterns
- Sentence complexity
- Register (formal/informal markers)
Default to professional if ambiguous.
Supported domains: `academic`, `technical`, `blog`, `social`, `professional`, `marketing`
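The inference cues above can be sketched as a simple marker scorer. The marker lists here are illustrative assumptions, not the skill's actual heuristics; real calibration lives in `references/style-guide.md`:

```python
# Hypothetical marker lists keyed by domain; a real profile would be richer.
DOMAIN_MARKERS = {
    "academic": ["et al", "hypothesis", "methodology", "(19", "(20"],
    "technical": ["api", "function", "config", "runtime"],
    "blog": ["i think", "honestly", "you might"],
    "marketing": ["unlock", "boost", "solution", "customers"],
}

def infer_domain(text):
    """Score each domain by marker hits; default to 'professional' on no signal or a tie."""
    lowered = text.lower()
    scores = {d: sum(lowered.count(m) for m in ms) for d, ms in DOMAIN_MARKERS.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    if top == 0 or sum(1 for v in scores.values() if v == top) > 1:
        return "professional"
    return best
```

The tie-and-zero fallback mirrors the rule above: when the signal is ambiguous, the professional profile wins.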
Behavioral Constraints
- Never fabricate. Do not add facts, citations, quotes, statistics, or claims not in the original.
- Never remove data. Numbers, dates, names, URLs, and cited sources must survive the rewrite.
- Preserve argument structure. If the original makes points A, B, C in that order with that logic, the rewrite must preserve the logical flow.
- Do not over-humanize. Some text is meant to be neutral and informational. A technical specification does not need personality. Match the appropriate register.
- Respect code blocks and structured data. Do not humanize code, tables, JSON, YAML, or any structured/machine-readable content. Pass these through unchanged.
- One pass through the pipeline. Do not run the 5-phase pipeline recursively. If the output still has tells after Phase 5, note them in the changes summary rather than looping.
Scope Modes
| Mode | Trigger | Behavior |
|---|---|---|
| Full rewrite | "humanize this", "rewrite naturally" | Run all 5 phases |
| Detection only | "check for AI patterns", "does this sound AI" | Run Phase 1 only, output detection report |
| Targeted fix | "fix the AI-sounding parts", "just clean up the obvious stuff" | Run Phase 1, then apply fixes only to HIGH-priority patterns |
| Style shift | "make this more casual/academic/professional" | Run Phases 3-4 with specified domain profile |
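Trigger matching in the table above amounts to keyword lookup over the request, with full rewrite as the fallback. A sketch under that assumption; the keyword lists are paraphrases of the trigger column, not an exhaustive grammar:

```python
# Ordered so that more specific triggers are checked before the generic fallback.
MODE_KEYWORDS = [
    ("detection_only", ["check for ai", "sound ai", "does this sound"]),
    ("targeted_fix", ["fix the ai", "obvious stuff", "clean up"]),
    ("style_shift", ["more casual", "more academic", "more professional"]),
    ("full_rewrite", ["humanize", "rewrite naturally"]),
]

def select_mode(request):
    """First matching mode wins; anything unmatched gets the full 5-phase rewrite."""
    lowered = request.lower()
    for mode, keys in MODE_KEYWORDS:
        if any(k in lowered for k in keys):
            return mode
    return "full_rewrite"
```

Ordering matters: "does this sound AI" must resolve to detection before the generic "humanize" catch-all is ever consulted.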
Error Handling
| Problem | Cause | Resolution |
|---|---|---|
| Input under 20 words | Insufficient signal for pattern detection | Report: "Text too short for reliable pattern detection." Apply vocabulary fixes only (Phase 3) if obvious patterns are present. Skip statistical signal analysis. |
| Input is entirely code/structured data | No prose to humanize | Report: "Input is structured data — no humanization applicable." Return input unchanged. |
| Mixed human + AI text | Partial AI generation or human-edited AI output | Run Phase 1 on full text. Flag only paragraphs/sections with detected patterns. Apply Phases 2-4 selectively to flagged sections. Leave clean sections untouched. |
| Domain ambiguous after detection | Input mixes registers (e.g., academic citations in a blog post) | Default to professional. Note the ambiguity in the output: "Domain defaulted to professional — specify if another profile is preferred." |
| Semantic drift detected in Phase 5 | Rewrite altered meaning during structural/vocabulary changes | Restore the drifted factual claim from the original. Do not re-run the full pipeline. Note the restoration in the changes summary. |
| Input contains fabricated citations | Original text has hallucinated sources | Not detectable — this skill humanizes style, not factual accuracy. Pass through unchanged. Note in limitations if the user asks about accuracy. |
| All patterns are LOW severity | Text is mostly human-written with minor tells | In targeted fix mode, report findings but recommend no changes. In full rewrite mode, apply light-touch fixes only — do not over-edit clean text. |
Integration Point
Other writing skills can import `references/detection-patterns.md` as a pattern library for their own anti-pattern sweeps. The detection patterns are the shared asset; the pipeline is this skill's domain.
Limitations
- Cannot verify factual accuracy of the original text. Garbage in, humanized garbage out.
- Effectiveness depends on input length. Very short texts (under 20 words) have insufficient signal for pattern detection.
- Style profiles are guidelines, not voice cloning. The output will sound natural but will not match a specific author's voice without additional calibration.
- Does not interact with external AI-detection APIs. Assessment is heuristic, not benchmark-verified.