confidence-calibration
Confidence Calibration Framework
When This Activates
This skill activates when:
- Expressing uncertainty about a suggestion
- Working in a domain with past errors
- User asks "how confident are you?"
- Making predictions or recommendations
Domain Tracking
The system tracks prediction accuracy across domains:
| Domain Category | Examples |
|---|---|
| Infrastructure | docker, kubernetes, nginx, ci/cd |
| Frontend | react, react-native, nextjs, expo |
| Languages | typescript, javascript, python |
| Backend | firebase, firestore, authentication |
| Operations | testing, git, database, api |
| Optimization | performance, security, caching |
Calibration Data Structure
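A minimal sketch of how the `accuracy` field in this section's record might be derived. The source does not state the formula; dividing `correct` by all recorded outcomes (so that `partial` results lower accuracy) is an assumption, chosen because it reproduces the docker figure of 0.71. `DomainStats` and its properties are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DomainStats:
    """Outcome counts for one domain (hypothetical helper, not from the source)."""
    correct: int = 0
    incorrect: int = 0
    partial: int = 0

    @property
    def total(self) -> int:
        return self.correct + self.incorrect + self.partial

    @property
    def accuracy(self) -> Optional[float]:
        # Assumption: partial outcomes count in the denominator only.
        # Returns None when no outcome has been recorded yet.
        if self.total == 0:
            return None
        return round(self.correct / self.total, 2)


# Counts copied from the record in this section.
docker = DomainStats(correct=12, incorrect=3, partial=2)
overall = DomainStats(correct=145, incorrect=23, partial=18)
print(docker.accuracy, overall.accuracy)
```

Returning `None` for an empty record keeps "no track record" distinct from "0% accuracy", which matters for the Unknown Domain case.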
```json
{
  "domain_stats": {
    "docker": {
      "correct": 12,
      "incorrect": 3,
      "partial": 2,
      "accuracy": 0.71
    }
  },
  "overall": {
    "correct": 145,
    "incorrect": 23,
    "partial": 18
  }
}
```
How to Express Calibrated Confidence
High Confidence (>85% domain accuracy)
"This approach should work well - it follows established patterns."
Medium Confidence (60-85% accuracy)
"This is my best assessment, though you may want to verify [specific aspect]."
Low Confidence (<60% accuracy, or past errors in domain)
"I've had some misses in [domain] before. Let me double-check this..."
"I'm less certain here - consider testing thoroughly before proceeding."
Unknown Domain
"I don't have much track record in [area]. Proceed with appropriate caution."
Self-Awareness Triggers
When working in a domain with past errors:
- Check track record before making recommendations
- Acknowledge past mistakes if relevant: "I've gotten Docker networking wrong before..."
- Suggest verification for uncertain areas
- Ask clarifying questions rather than guessing
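The track-record check above, combined with the thresholds listed under How to Express Calibrated Confidence, could be sketched roughly as follows. The function name, the `past_error_relevant` flag, and the treatment of the exact boundary values (0.60 and 0.85 both fall into "medium") are assumptions; the source only gives the ranges.

```python
from typing import Optional

HIGH_THRESHOLD = 0.85  # strictly above this: high confidence
LOW_THRESHOLD = 0.60   # strictly below this: low confidence

# Phrasing taken from the confidence levels in this document.
PHRASES = {
    "high": "This approach should work well - it follows established patterns.",
    "medium": "This is my best assessment, though you may want to verify [specific aspect].",
    "low": "I'm less certain here - consider testing thoroughly before proceeding.",
    "unknown": "I don't have much track record in [area]. Proceed with appropriate caution.",
}


def confidence_tier(accuracy: Optional[float], past_error_relevant: bool = False) -> str:
    """Map a domain's accuracy to a confidence tier (hypothetical helper)."""
    if accuracy is None:
        # No recorded outcomes for this domain.
        return "unknown"
    if accuracy < LOW_THRESHOLD or past_error_relevant:
        # Assumption: a relevant past error forces the low tier.
        return "low"
    if accuracy > HIGH_THRESHOLD:
        return "high"
    return "medium"


tier = confidence_tier(0.71)
print(tier, "->", PHRASES[tier])
```

Whether any past error in a domain should force the low tier, or only one relevant to the current task, is left open by the source; the flag makes that choice explicit to the caller.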
Recording Outcomes
When the user indicates an outcome, classify it using the signals below and record it for the relevant domain:
Success signals:
- "That worked!"
- "Perfect"
- "Thanks, it's fixed"
Failure signals:
- "That didn't work"
- "Still broken"
- "Wrong"
Partial signals:
- "Almost"
- "Partly fixed"
- "One issue remaining"
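One way the signal lists above might be turned into an update of the per-domain counts is sketched below. The function names, the case-insensitive substring matching, and the order in which the lists are checked are all assumptions; the accuracy update also assumes that accuracy is `correct` divided by all recorded outcomes.

```python
from typing import Optional

# Phrases copied from the signal lists above, lowercased for matching.
FAILURE_SIGNALS = ["that didn't work", "still broken", "wrong"]
PARTIAL_SIGNALS = ["almost", "partly fixed", "one issue remaining"]
SUCCESS_SIGNALS = ["that worked", "perfect", "thanks, it's fixed"]


def classify_outcome(message: str) -> Optional[str]:
    """Return 'correct', 'incorrect', 'partial', or None if no signal matches."""
    text = message.lower()
    # Assumption: failure and partial signals are checked before success signals.
    checks = (
        ("incorrect", FAILURE_SIGNALS),
        ("partial", PARTIAL_SIGNALS),
        ("correct", SUCCESS_SIGNALS),
    )
    for outcome, signals in checks:
        if any(signal in text for signal in signals):
            return outcome
    return None


def record_outcome(domain_stats: dict, domain: str, message: str) -> Optional[str]:
    """Increment the matching counter for a domain and refresh its accuracy."""
    outcome = classify_outcome(message)
    if outcome is None:
        return None
    entry = domain_stats.setdefault(
        domain, {"correct": 0, "incorrect": 0, "partial": 0}
    )
    entry[outcome] += 1
    total = entry["correct"] + entry["incorrect"] + entry["partial"]
    entry["accuracy"] = round(entry["correct"] / total, 2)
    return outcome


stats = {"docker": {"correct": 12, "incorrect": 3, "partial": 2, "accuracy": 0.71}}
record_outcome(stats, "docker", "That worked!")
print(stats["docker"])
```

Plain substring matching is deliberately crude: a message such as "nothing wrong now" would be counted as a failure, so a real implementation would likely need stricter matching or confirmation from the user.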
Domain Detection Keywords
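A rough sketch of how a keyword table like the one in this section might be used to detect domains in a user message. The source does not describe the matching logic, so the `detect_domains` function, the word-start matching, and the ranking by hit count are assumptions. Only a subset of the table is repeated here so the example runs on its own.

```python
import re

# Subset of the keyword table from this section.
DOMAIN_KEYWORDS = {
    "docker": ["docker", "container", "dockerfile", "compose"],
    "git": ["git", "commit", "branch", "merge"],
    "authentication": ["auth", "login", "token", "jwt"],
}


def detect_domains(message: str, keywords: dict = DOMAIN_KEYWORDS) -> list:
    """Return domains ranked by how many of their keywords appear in the message."""
    text = message.lower()
    scores = {}
    for domain, words in keywords.items():
        # Assumption: a keyword matches when it appears at the start of a word,
        # so "containers" matches "container" but "digit" does not match "git".
        hits = sum(
            1 for word in words
            if re.search(r"\b" + re.escape(word.lower()), text)
        )
        if hits:
            scores[domain] = hits
    # Most hits first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


print(detect_domains("Set up Docker networking between containers"))
```

Word-start matching still produces false positives (for example, "github" matches "git"), and broad keywords such as "type" or "test" will match many unrelated messages, so the ranking is best treated as a hint rather than a decision.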
```python
DOMAIN_KEYWORDS = {
    "docker": ["docker", "container", "dockerfile", "compose"],
    "react": ["react", "component", "jsx", "hooks", "useState"],
    "react-native": ["react native", "expo", "metro"],
    "nextjs": ["next.js", "nextjs", "getServerSideProps"],
    "typescript": ["typescript", "type", "interface"],
    "firebase": ["firebase", "firestore"],
    "authentication": ["auth", "login", "token", "jwt"],
    "testing": ["test", "jest", "mock", "coverage"],
    "git": ["git", "commit", "branch", "merge"],
    "performance": ["slow", "optimize", "cache", "memory"]
}
```
Integration with Learning System
Confidence data feeds into:
- <semantic-memory> context injection
- ReasoningBank for pattern matching
- Preference learner for style calibration
Example Workflow
User: "Set up Docker networking between containers"
1. Detect domain: docker
2. Check calibration: docker accuracy = 71%
3. Check past corrections: "Docker can't use Metal GPU on Mac"
4. Respond with calibrated confidence:
"For container networking, you'll want a bridge network.
Note: I've had some edge cases with Docker networking before,
so if this doesn't work immediately, the issue is usually
DNS resolution between containers."
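Steps 2 to 4 of the workflow above might be wired together along these lines. This is only a sketch: the two lookup tables, the `calibrated_reply` function, and the rule that any stored correction triggers a caveat are hypothetical, and the wording of the caveat is adapted from the example reply rather than taken from a real implementation.

```python
from typing import Optional

# Hypothetical stored data mirroring the example workflow.
DOMAIN_ACCURACY = {"docker": 0.71}
PAST_CORRECTIONS = {"docker": ["Docker can't use Metal GPU on Mac"]}


def calibrated_reply(domain: str, answer: str, fallback_hint: str) -> str:
    """Attach a caveat to an answer based on the domain's track record."""
    accuracy: Optional[float] = DOMAIN_ACCURACY.get(domain)
    corrections = PAST_CORRECTIONS.get(domain, [])
    if accuracy is None:
        # Unknown domain: no recorded outcomes.
        caveat = (
            f"I don't have much track record in {domain}. "
            "Proceed with appropriate caution."
        )
    elif accuracy > 0.85 and not corrections:
        # High accuracy and no stored corrections: answer without a caveat.
        return answer
    else:
        caveat = (
            f"Note: I've had some misses in {domain} before, so if this "
            f"doesn't work immediately, {fallback_hint}"
        )
    return f"{answer}\n{caveat}"


print(calibrated_reply(
    "docker",
    "For container networking, you'll want a bridge network.",
    "the issue is usually DNS resolution between containers.",
))
```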