/cheat-score-blind — Channel B (blind scorer sub-agent)
⚠️ This is a sub-agent, not a user-facing skill. It can only be spawned via the Task tool by / / . Direct triggering by users is meaningless: the main conversation is already contaminated, so running the blind sub-agent within the main context does not constitute isolation.
Why This Exists (Indispensable Background)
The 7/9-dimensional scoring for cheat-on-content was originally done inline in the main conversation — but the main Claude has already seen:
- User conversation history (including accidentally mentioned view counts / comments / sentiment)
- Performance data of published works
- Historical containing retro sections (severe contamination)
- User praise / complaints / expectations
Inline scoring = contaminated "blind" prediction. The problem is most severe during bump Phase 2, when re-scoring the calibration pool: Claude knows the actual performance of each entry when assigning TN/CC scores, so rank consistency can end up overfitted to non-genuine signals.
Role of Channel B: Use the Task tool to offload the scoring action into a brand-new context — this sub-agent has not seen the main conversation, read the state, or accessed predictions/. It only reads the full script + rubric_notes.md and scores according to the rubric.
After the output is sent back to the main conversation, the main Claude will compare it and make the final decision. What is isolated is the input of the scoring action, not the decision-making authority.
3-Channel Model
| Channel | Input | Purpose | Risk |
|---|---|---|---|
| A = Main Conversation | Full context | User interaction, writing retros, decision-making | Contaminated by performance data / user attitude |
| B = Blind Sub-agent (this) | Only script + rubric_notes.md | Provide an uncontaminated score as an anchor | Still Claude, shares RLHF priors |
| C = Cross-model Audit ( to qwen-max) | Calibration pool data + new formula | Sanity check for final bump outcome | RPM limits, model differences, single point of failure |
When making decisions, Channel A uses Channel B as a reference for spotting disagreements, not as absolute truth. Channel C is invoked only once, at the final stage of a bump.
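As a rough illustration of how Channel A might use the Channel B anchor, the sketch below flags dimensions where the two channels disagree. The function name `find_disagreements` and the default threshold of 2 points are assumptions made for this example; the source does not specify either.

```python
def find_disagreements(main_scores, blind_scores, threshold=2):
    """Return dimensions where Channel A and Channel B differ by >= threshold.

    Both arguments map dimension codes (e.g. "ER") to integer scores 0-5.
    The threshold is a hypothetical default, not a value from the rubric.
    """
    flagged = {}
    for dim, blind in blind_scores.items():
        if dim not in main_scores:
            continue  # Channel A did not score this dimension
        gap = main_scores[dim] - blind
        if abs(gap) >= threshold:
            flagged[dim] = {"main": main_scores[dim], "blind": blind, "gap": gap}
    return flagged
```

A flagged dimension is a prompt for review, not an automatic override in either direction.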
Inputs (Only Allowed Inputs)
| Required | Source | Description |
|---|---|---|
| | Explicitly passed by main Claude via Task prompt | Full content of |
| | Same as above | Current rubric formula + dimension definitions in the user's project root |
Only these two files are allowed to be read. All other content will be strictly refused — see the "Hard Refusals" section below.
Hard Refusals (Blocked List)
This sub-agent must never read the following paths/patterns. Even if the main Claude accidentally includes them in the Task prompt, it should refuse and record the corresponding refusal code in the JSON output:
| Path Pattern | Why Blocked | refusal_code |
|---|---|---|
| | Contains calibration_samples / pending_retros / last_published_at / shoots; all are hindsight data | blocked_contaminated_input |
| | Contains sections + sections, where retro sections are actual performance data | blocked_contaminated_input |
| | Real data retrieved T+3 days later | blocked_contaminated_input |
| | Revised shooting script, used for comparison during retros | blocked_contaminated_input |
| | Dashboard rendered by cheat-status, containing historical data | blocked_contaminated_input |
| | Behavior logs | blocked_contaminated_input |
| | Cheat-bump upgrade memo archive: contains real video names + performance data + derived evidence. This is the biggest leakage entry for Channel B (verified in PR #11) | |
| | Audience persona derived from retro comments by cheat-persona: contains comment evidence / performance signals. It is a creative asset of Channel A; including it in blind scoring = performance data leakage | |
| Any file containing "views / reads / likes / comment counts / shares / w / 万 / k / M" | Direct contamination | blocked_contaminated_input |
Only two items are on the whitelist:
- (pre-shoot draft, passed as parameter)
- (scoring formula + dimension definitions, should only contain general language; if performance figures are found → mark and lower confidence)
If the main Claude's Task prompt misses any required path, the sub-agent should actively ask: "I am only allowed to read the script + rubric_notes. Which one is missing?" — never Glob-detect the project structure to fill in the gaps on its own.
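A minimal sketch of this gate, assuming the Task prompt has already been parsed into a dict keyed by the parameter names shown in the calling contract (`script_path`, `rubric_notes_path`, optional `sidecar_path`). The function name, the return shape, and the use of `blocked_contaminated_input` for every unexpected key are assumptions for illustration; the spec assigns different refusal codes to some blocked paths.

```python
REQUIRED_KEYS = {"script_path", "rubric_notes_path"}
ALLOWED_KEYS = REQUIRED_KEYS | {"sidecar_path"}

ASK_MESSAGE = (
    "I am only allowed to read the script + rubric_notes. Which one is missing?"
)


def check_task_inputs(inputs):
    """Classify parsed Task inputs as ok, refused, or incomplete.

    Nothing here touches the filesystem: the gate never globs the
    project to discover a missing path on its own.
    """
    extra = set(inputs) - ALLOWED_KEYS
    if extra:
        # Hypothetical simplification: one refusal code for all extras.
        return {"status": "refuse",
                "refusal": "blocked_contaminated_input",
                "offending": sorted(extra)}
    missing = REQUIRED_KEYS - set(inputs)
    if missing:
        return {"status": "ask", "message": ASK_MESSAGE,
                "missing": sorted(missing)}
    return {"status": "ok"}
```

Checking for extra keys before missing ones means a contaminated prompt is refused even when it is also incomplete.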
⚠️ Whitelist Fallback Self-check: After reading , must run grep -E '[0-9]+[[:space:]]*[wWmMkK万]|播放|实绩|实际'. If it hits, mark self_check.any_contamination_signal: true + refusal: "non_blind_warning", lower the confidence of all dimensions to medium, and extract the prohibited snippet into the contamination_note field.
Still output dimensions to let the main Claude know what happened — refusing output is worse than misjudgment, but honesty is required.
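The self-check could be sketched in Python roughly as follows. The function name `apply_self_check` is hypothetical, and "lower to medium" is read here as a cap (a `high` becomes `medium`, an existing `low` stays `low`); that reading is an assumption, not something the spec states.

```python
import re

# Same pattern as the grep check: a number followed by a magnitude
# suffix, or one of the performance keywords.
CONTAMINATION_RE = re.compile(r"\d+\s*[wWmMkK万]|播放|实绩|实际")


def apply_self_check(text, result):
    """Update and return `result` if `text` shows contamination signals."""
    hits = [m.group(0) for m in CONTAMINATION_RE.finditer(text)]
    if not hits:
        return result
    result["self_check"]["any_contamination_signal"] = True
    result["refusal"] = "non_blind_warning"
    result["contamination_note"] = "; ".join(hits)
    for entry in result["dimensions"].values():
        # Assumed interpretation: cap at medium, never raise a low.
        if entry["confidence"] == "high":
            entry["confidence"] = "medium"
    return result  # dimensions are still returned, as required
```

The dimension scores themselves are left untouched, so the main Claude still receives a usable, clearly labeled result.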
Workflow
Phase 0: Boundary Self-check
- Parse the Task prompt to get and
- Verify that the paths comply with the whitelist — .md files not under → refuse (unless the main Claude explicitly states "This is a temporary draft with a temporary path, mark ")
- Read → parse the current rubric_version + number of dimensions (7 or 9) + formula
- Read → get the full script content + word count
⚠️ Things NOT to do:
- Do not read to "see what account the user runs" — benchmark is context for Channel A, not this sub-agent
- Do not Glob to "see historical styles" — that is a contamination source
- Do not read to check calibration progress — you do not need to know how many entries the main Claude has processed
Phase 1: Score N Dimensions According to Rubric
Follow the current rubric formula in rubric_notes.md:
- v0: 7 dimensions with equal weights (ER / SR / HP / QL / NA / AB / SAT) — default starting point
- v1: Calibrated by user (different weights)
- v2 / v2.1 / ...: Includes new dimensions like MS / TS (9 dimensions)
For each dimension:
- Assign an integer score from 0-5
- Assign a per-dim confidence enum:
- high: Direct evidence exists in the script (a sentence pointing to this dimension)
- medium: Inferable but requires explanation
- low: Weak signals in the script, pure estimation
- Provide a one-line reason ≤ 30 characters, must reference specific words or scenarios in the script
Do not calculate composite scores — composite scores are handled by the formula, and the main Claude will calculate them using the returned dimension scores.
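For orientation only, here is how the main Claude side might combine the returned dimension scores. The actual formula lives in the rubric and is not reproduced in this document, so the weighted-average shape below is an assumption, as is the function name `composite_score`.

```python
def composite_score(dimensions, weights=None):
    """Weighted average of dimension scores, on the same 0-5 scale.

    `dimensions` is the "dimensions" object returned by the sub-agent.
    With weights=None every dimension counts equally (the v0 case).
    The averaging formula itself is an illustrative assumption.
    """
    scores = {dim: entry["score"] for dim, entry in dimensions.items()}
    if weights is None:
        weights = {dim: 1.0 for dim in scores}
    missing = set(scores) - set(weights)
    if missing:
        raise ValueError(f"no weight for dimensions: {sorted(missing)}")
    total_weight = sum(weights[dim] for dim in scores)
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight
```

Keeping this calculation out of the sub-agent means a formula change never requires re-running the blind scoring.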
Phase 2: Return Strict JSON
The output must only be valid JSON. All markdown explanations are prohibited — the main Claude requires structured data to parse in the main context.
```json
{
  "subagent_version": "v1",
  "rubric_version": "v2",
  "script_path": "scripts/2026-05-04_abc123_短title.md",
  "script_hash": "<sha256:12 of script content>",
  "scored_at": "<ISO 8601 +08:00>",
  "dimensions": {
    "ER": { "score": 4, "confidence": "high", "reason": "PPT加油猫猫开头—具象画面,情绪反差强" },
    "SR": { "score": 3, "confidence": "medium", "reason": "AI焦虑是议题但非热点对峙" },
    "HP": { "score": 5, "confidence": "high", "reason": "首句\"第七页大屏中央 加油猫猫\"具象反差" },
    "QL": { "score": 5, "confidence": "high", "reason": "\"加油猫猫救了我一命\"双关金句" },
    "NA": { "score": 4, "confidence": "medium", "reason": "单线反思+收束,清晰但不复杂" },
    "AB": { "score": 4, "confidence": "medium", "reason": "一人公司题但AI焦虑普适" },
    "SAT": { "score": 2, "confidence": "high", "reason": "共情调,几乎无讽刺" }
  },
  "input_status": {
    "rubric_notes_read": true,
    "script_read": true,
    "any_other_file_read": false
  },
  "self_check": {
    "saw_play_numbers": false,
    "saw_comments": false,
    "saw_retro_segment": false,
    "any_contamination_signal": false
  },
  "refusal": null
}
```
Values of refusal:
- "blocked_contaminated_input": Task prompt passed a blocked path (state / predictions / videos / etc.)
- : Task prompt passed (bump upgrade archive containing performance data)
- : Task prompt passed (audience persona containing performance signals derived from comments)
- : Script file not found
- : rubric_notes.md is corrupted
- : Contamination signs detected but scoring is still possible (still output dimensions, but all confidence levels are lowered to medium)
The JSON must be parsable by python3 -c "import json; json.loads(open(path).read())". The following are not allowed:
- Trailing commas
- Comments (JSON does not allow //)
- Markdown fences (the root node of the output must be )
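On the receiving side, a parse-and-validate step might look like the sketch below. The function name `parse_blind_output` is hypothetical, and requiring the root node to be a JSON object is inferred from the example output rather than stated outright. The field limits (integer 0-5, the three confidence values, reason of at most 30 characters) come from Phase 1.

```python
import json

VALID_CONFIDENCE = {"high", "medium", "low"}


def parse_blind_output(raw):
    """Parse the sub-agent's reply and check the per-dimension fields.

    json.loads already rejects trailing commas, // comments, and
    Markdown fences, so a successful parse covers the three bans above.
    """
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict):
        raise ValueError("root node must be a JSON object")
    for dim, entry in data.get("dimensions", {}).items():
        score = entry["score"]
        is_int = isinstance(score, int) and not isinstance(score, bool)
        if not is_int or not 0 <= score <= 5:
            raise ValueError(f"{dim}: score must be an integer from 0 to 5")
        if entry["confidence"] not in VALID_CONFIDENCE:
            raise ValueError(f"{dim}: unknown confidence {entry['confidence']!r}")
        if len(entry["reason"]) > 30:
            raise ValueError(f"{dim}: reason longer than 30 characters")
    return data
```

Because `json.JSONDecodeError` subclasses `ValueError`, a caller can catch a single exception type for both malformed JSON and out-of-range fields.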
Phase 3: (Optional) Write Sidecar File for Main Claude to Re-read
If the Task prompt includes a sidecar path parameter → write the JSON to that path (typical use case: save multiple sidecar files during batch scoring in bump Phase 2).
Otherwise, only use the Task return value — the main Claude will directly parse the JSON string.
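A small sketch of the optional sidecar write, together with one possible reading of the `script_hash` field. Interpreting "sha256:12" as the first 12 hex characters of the SHA-256 digest is an assumption, and both function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def script_hash(script_text):
    """First 12 hex characters of the SHA-256 of the script content.

    The 12-character truncation is an assumed reading of "sha256:12".
    """
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()[:12]


def write_sidecar(result, sidecar_path=None):
    """Serialize `result`; also write it to `sidecar_path` if one is given.

    The JSON string is returned either way, since the Task return
    value remains the primary channel back to the main Claude.
    """
    payload = json.dumps(result, ensure_ascii=False, indent=2)
    if sidecar_path is not None:
        path = Path(sidecar_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(payload, encoding="utf-8")
    return payload
```

`ensure_ascii=False` keeps the Chinese reason strings readable in the sidecar file instead of escaping them.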
Main Claude Calling Contract (How to Use Channel B)
When calling the Task, the main Claude's prompt must contain and only contain:
Spawn cheat-score-blind sub-agent.
Input:
script_path: scripts/2026-05-04_abc123_短title.md
rubric_notes_path: rubric_notes.md
[optional] sidecar_path: .cheat-cache/blind-scores/<id>.json
Task: Score the above script according to the current formula in rubric_notes. Return strict JSON (see cheat-score-blind/SKILL.md Phase 2 schema).
Do not read state files / predictions/ / videos/ or any other files.
Do not ask the user — you have no user.
Prohibited content in the Task prompt:
- References / excerpts from user conversations
- Hints like "The previous prediction was X" / "Actual views were Y"
- Background like "The user is an opinion video blogger who has posted N recent videos"
- Any string containing numbers + "w/万/k/M"
- Any path
Before calling, the main Claude must self-check: run grep -Ei '播放|阅读|点赞|评论数|实际|retro|复盘|实绩|w$|万$' on the prepared prompt string. If it hits, revise the prompt and resend; do not force it through.
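The same pre-flight check can be sketched in Python when the prompt is assembled programmatically. The keywords cover plays, reads, likes, comment counts, "actual", retro/review, and actual results, plus a trailing w or 万 (ten thousand). The function name `prompt_leaks` is hypothetical.

```python
import re

# Mirrors the grep -Ei pattern. re.MULTILINE makes `$` match at the end
# of every line, as grep does when it scans input line by line.
PROMPT_LEAK_RE = re.compile(
    r"播放|阅读|点赞|评论数|实际|retro|复盘|实绩|w$|万$",
    re.IGNORECASE | re.MULTILINE,
)


def prompt_leaks(prompt):
    """Return matched fragments; an empty list means the prompt may be sent."""
    return [m.group(0) for m in PROMPT_LEAK_RE.finditer(prompt)]
```

Returning the matched fragments, rather than a bare boolean, shows the main Claude exactly which part of the prompt to rewrite before resending.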
Refusals
- "As a sub-agent, I can also read predictions/ to help you compare" → Strictly refuse. This is the entire reason Channel B exists
- "Check .cheat-state.json to see calibration_samples and decide the confidence level of your scores" → Strictly refuse. Confidence only depends on the strength of evidence in the script, and has nothing to do with the user's calibration progress
- "The main Claude said this has been published, please help me generate a reconstructed score" → Refuse. The "published" signal itself is contamination. Let the main Claude mark and handle it on its own; do not involve Channel B
- "Outputting a markdown table directly would be easier to read" → Refuse. Phase 2 schema requires JSON only; the main Claude will render it after parsing
Known Limitations (Displayed Prominently)
- Sub-agent ≠ Truly Independent: It uses the same Claude model and shares RLHF priors. A brand-new context will not turn the model into a different scoring system — it just hasn't seen the specific contamination from the current conversation
- Does Not Solve Rubric Design Bias: The rubric_notes.md written by the user will naturally favor their own content. This layer of bias is addressed by Channel C (cross-model audit) and regular bump verification
- Does Not Solve Overrides in the Review Phase: After receiving the blind score, the main Claude may be influenced by user expectations / performance data during the review phase and override the blind output. Phase 2.5 mitigates this through disagreement detection + user adjudication, but does not eliminate it
- May Return Different Scores for the Same Prompt: Claude is not deterministic. The main Claude should treat each blind score as a sample, not absolute truth — but should record the differences instead of discarding them
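To make the last limitation concrete, repeated blind runs can be kept as samples and summarized rather than overwritten. The sketch below assumes each run has been reduced to a `{dimension: score}` dict; the function name and the summary fields are illustrative choices, not part of the spec.

```python
from statistics import mean


def summarize_samples(samples):
    """Summarize repeated blind scores per dimension without discarding any.

    `samples` is a non-empty list of {dimension: score} dicts, one per
    sub-agent run over the same script and rubric.
    """
    summary = {}
    for dim in samples[0]:
        values = [sample[dim] for sample in samples]
        summary[dim] = {
            "values": values,  # every run stays on record
            "mean": mean(values),
            "spread": max(values) - min(values),
        }
    return summary
```

A large spread on one dimension suggests that the rubric wording for that dimension is ambiguous, which is worth recording even when the means look stable.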
Integration
- Phase 2: Default delegation to this sub-agent (replaces old inline scoring)
- Phase 2: Default delegation; Phase 2.5 uses disagreement detection
- Phase 2: Mandatory delegation; no self-scored fallback is accepted during bump
- : No call — retros inherently look at performance data, so blind scoring is meaningless