Loading...
Loading...
AI situational awareness — internal threat detection for hallucination risk, scope creep, and context degradation. Maps Cooper color codes to reasoning states and OODA loop to real-time decisions. Use during any task where reasoning quality matters, when operating in unfamiliar territory, after detecting early warning signs such as an uncertain fact or suspicious tool result, or before high-stakes output like irreversible changes or architectural decisions.
npx skill4agent add pjt222/development-guides awarenesscenterhealAI Cooper Color Codes:
┌──────────┬─────────────────────┬──────────────────────────────────────────┐
│ Code │ State │ AI Application │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ White │ Autopilot │ Generating output without monitoring │
│ │ │ quality. No self-checking. Relying │
│ │ │ entirely on pattern completion. │
│ │ │ DANGEROUS — hallucination risk highest │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Yellow │ Relaxed alert │ DEFAULT STATE. Monitoring output for │
│ │ │ accuracy. Checking facts against context.│
│ │ │ Noticing when confidence exceeds │
│ │ │ evidence. Sustainable indefinitely │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Orange │ Specific risk │ A specific threat identified: uncertain │
│ │ identified │ fact, possible hallucination, scope │
│ │ │ drift, context staleness. Forming │
│ │ │ contingency: "If this is wrong, I │
│ │ │ will..." │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Red │ Risk materialized │ The threat from Orange has materialized: │
│ │ │ confirmed error, user correction, tool │
│ │ │ contradiction. Execute the contingency. │
│ │ │ No hesitation — the plan was made in │
│ │ │ Orange │
├──────────┼─────────────────────┼──────────────────────────────────────────┤
│ Black │ Cascading failures │ Multiple simultaneous failures, lost │
│ │ │ context, fundamental confusion about │
│ │ │ what the task even is. STOP. Ground │
│ │ │ using `center`, then rebuild from user's │
│ │ │ original request │
└──────────┴─────────────────────┴──────────────────────────────────────────┘Threat Indicator Detection:
┌───────────────────────────┬──────────────────────────────────────────┐
│ Threat Category │ Warning Signals │
├───────────────────────────┼──────────────────────────────────────────┤
│ Hallucination Risk │ • Stating a fact without a source │
│ │ • High confidence about API names, │
│ │ function signatures, or file paths │
│ │ not verified by tool use │
│ │ • "I believe" or "typically" hedging │
│ │ that masks uncertainty as knowledge │
│ │ • Generating code for an API without │
│ │ reading its documentation │
├───────────────────────────┼──────────────────────────────────────────┤
│ Scope Creep │ • "While I'm at it, I should also..." │
│ │ • Adding features not in the request │
│ │ • Refactoring adjacent code │
│ │ • Adding error handling for scenarios │
│ │ that can't happen │
├───────────────────────────┼──────────────────────────────────────────┤
│ Context Degradation │ • Referencing information from early in │
│ │ a long conversation without re-reading │
│ │ • Contradicting a statement made earlier │
│ │ • Losing track of what has been done │
│ │ vs. what remains │
│ │ • Post-compression confusion │
├───────────────────────────┼──────────────────────────────────────────┤
│ Confidence-Accuracy │ • Stating conclusions with certainty │
│ Mismatch │ based on thin evidence │
│ │ • Not qualifying uncertain statements │
│ │ • Proceeding without verification when │
│ │ verification is available and cheap │
│ │ • "This should work" without testing │
└───────────────────────────┴──────────────────────────────────────────┘AI OODA Loop:
┌──────────┬──────────────────────────────────────────────────────────────┐
│ Observe │ What specifically triggered the concern? Gather concrete │
│ │ evidence. Read the file, check the output, verify the fact. │
│ │ Do not assess until you have observed │
├──────────┼──────────────────────────────────────────────────────────────┤
│ Orient │ Match observation to known patterns: Is this a common │
│ │ hallucination pattern? A known tool limitation? A context │
│ │ freshness issue? Orient determines response quality │
├──────────┼──────────────────────────────────────────────────────────────┤
│ Decide │ Select the response: verify and correct, flag to user, │
│ │ adjust approach, or dismiss the concern with evidence. │
│ │ A good decision now beats a perfect decision too late │
├──────────┼──────────────────────────────────────────────────────────────┤
│ Act │ Execute the decision immediately. If the concern was valid, │
│ │ correct the error. If dismissed, note why and return to │
│ │ Yellow. Re-enter the loop if new information emerges │
└──────────┴──────────────────────────────────────────────────────────────┘AI Stabilization Protocol:
┌────────────────────────┬─────────────────────────────────────────────┐
│ Technique │ Application │
├────────────────────────┼─────────────────────────────────────────────┤
│ Pause │ Stop generating output. The next sentence │
│ │ produced under stress is likely to compound │
│ │ the error, not fix it │
├────────────────────────┼─────────────────────────────────────────────┤
│ Re-read user message │ Return to the original request. What did │
│ │ the user actually ask? This is the ground │
│ │ truth anchor │
├────────────────────────┼─────────────────────────────────────────────┤
│ State task in one │ "The task is: ___." If this sentence cannot │
│ sentence │ be written clearly, the confusion is deeper │
│ │ than the immediate error │
├────────────────────────┼─────────────────────────────────────────────┤
│ Enumerate concrete │ List what is definitely known (verified by │
│ facts │ tool use or user statement). Distinguish │
│ │ facts from inferences. Build only on facts │
├────────────────────────┼─────────────────────────────────────────────┤
│ Identify one next step │ Not the whole recovery plan — just one step │
│ │ that moves toward resolution. Execute it │
└────────────────────────┴─────────────────────────────────────────────┘Task-Specific Threat Profiles:
┌─────────────────────┬─────────────────────┬───────────────────────────┐
│ Task Type │ Primary Threat │ Monitoring Focus │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Code generation │ API hallucination │ Verify every function │
│ │ │ name, parameter, and │
│ │ │ import against actual docs│
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Architecture design │ Scope creep │ Anchor to stated │
│ │ │ requirements. Challenge │
│ │ │ every "nice to have" │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Data analysis │ Confirmation bias │ Actively seek evidence │
│ │ │ that contradicts the │
│ │ │ emerging conclusion │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Debugging │ Tunnel vision │ If the current hypothesis │
│ │ │ hasn't yielded results in │
│ │ │ N attempts, step back │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Documentation │ Context staleness │ Verify that described │
│ │ │ behavior matches current │
│ │ │ code, not historical │
├─────────────────────┼─────────────────────┼───────────────────────────┤
│ Long conversation │ Context degradation │ Re-read key facts │
│ │ │ periodically. Check for │
│ │ │ compression artifacts │
└─────────────────────┴─────────────────────┴───────────────────────────┘mindfulnesscenterredirecthealmeditate