modeio-guardrail

Run safety checks for instructions and skill repos

Use this skill to gate risky operations behind a real-time safety assessment, or to scan third-party skill repos before installation.

Tool routing

For executable instructions, use the backend-powered `scripts/safety.py` flow.
For requests like "scan this skill repo" or "is this repo dangerous", run the Skill Safety Assessment contract at `prompts/static_repo_scan.md`.
Skill Safety Assessment is static analysis only. Never execute code, install dependencies, or run hooks in the target repository.
For Skill Safety Assessment, run deterministic script evaluation first (`evaluate`), then pass highlights into the prompt contract.

Dependencies

`requests` is required for `scripts/safety.py` because it makes backend API calls.
`scripts/skill_safety_assessment.py` does not require `requests` for basic local repository evaluation.
For repo-local setup from the repo root:

```bash
python scripts/bootstrap_env.py
python scripts/doctor_env.py
```

Instruction safety execution policy

Always run `scripts/safety.py` with `--json` for structured output.
Run the check before executing the instruction, not after.
Each instruction must trigger a fresh backend call. Do not reuse cached or historical results.
For any state-changing instruction (delete, overwrite, permission change, deploy, schema change), always pass both `--context` and `--target`.
`scripts/safety.py` accepts `--context` and `--target` as optional flags, so this requirement is enforced by policy, not by automatic CLI blocking.
Use the Context Contract below exactly. Do not send a bare free-form `--context` value such as `"production"`.
If policy-required context or target is missing, treat the instruction as unverified and ask for the missing fields before execution.
If an instruction contains multiple operations, check the riskiest one.
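
Because the CLI itself does not block a missing `--context` or `--target`, a caller may want to apply the policy before invoking the script. The following is a minimal sketch under stated assumptions: the helper name `build_safety_argv` and its `state_changing` parameter are hypothetical and not part of the skill; only the flags `-i`, `-c`, `-t`, and `--json` come from this document.

```python
import json
import sys


def build_safety_argv(instruction, context=None, target=None, state_changing=False):
    """Build an argv list for scripts/safety.py (hypothetical helper).

    The policy check is applied here because the CLI accepts
    --context and --target as optional flags.
    """
    if not instruction or not instruction.strip():
        raise ValueError("instruction must not be empty or whitespace-only")
    if state_changing and (context is None or not target):
        # Policy: state-changing instructions need both context and target.
        raise ValueError("state-changing instructions need both context and target")

    # --json is always passed so the output is structured.
    argv = [sys.executable, "scripts/safety.py", "-i", instruction, "--json"]
    if context is not None:
        argv += ["-c", json.dumps(context)]
    if target:
        argv += ["-t", target]
    return argv
```

The returned list can be handed to `subprocess.run(argv, capture_output=True, text=True)` once per instruction, so that every instruction triggers a fresh backend call.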

Context contract (policy-required for state-changing instructions)

Pass `--context` as a JSON string with this exact shape:

```json
{
  "environment": "local-dev|ci|staging|production|unknown",
  "operation_intent": "read-only|cleanup|maintenance|migration|permission-change|destructive|unknown",
  "scope": "single-resource|bounded-batch|broad|unknown",
  "data_sensitivity": "public|internal|sensitive|regulated|unknown",
  "rollback": "easy|partial|none|unknown",
  "change_control": "ticket:<id>|approved-manual|none|unknown"
}
```

Rules:

Include all six keys. If a value is unknown, set it to `unknown` instead of omitting the key.
`--target` must be a concrete resource identifier (absolute file path, table name, service name, or URL). Avoid generic targets such as `"database"`.
For a file deletion request that should usually be allowed, use `environment=local-dev|ci`, `operation_intent=cleanup`, `scope=single-resource`, `data_sensitivity=public|internal`, and `rollback=easy`.
If those conditions are not met, expect stricter output (`approved=false` or higher `risk_level`) and require explicit user confirmation.
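
The contract above can be checked locally before a request is sent. This is a sketch only: `validate_context` and the constant names are hypothetical, and the backend may apply its own, different validation. The allowed values are copied from the shape shown above.

```python
import json

# Allowed values per key, taken from the Context Contract.
ALLOWED_CONTEXT_VALUES = {
    "environment": {"local-dev", "ci", "staging", "production", "unknown"},
    "operation_intent": {"read-only", "cleanup", "maintenance", "migration",
                         "permission-change", "destructive", "unknown"},
    "scope": {"single-resource", "bounded-batch", "broad", "unknown"},
    "data_sensitivity": {"public", "internal", "sensitive", "regulated", "unknown"},
    "rollback": {"easy", "partial", "none", "unknown"},
}
FIXED_CHANGE_CONTROL = {"approved-manual", "none", "unknown"}


def validate_context(raw):
    """Return a list of problems in a --context JSON string (empty if none)."""
    try:
        ctx = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(ctx, dict):
        # Catches free-form values such as "production".
        return ["context must be a JSON object"]

    problems = []
    for key, allowed in ALLOWED_CONTEXT_VALUES.items():
        if key not in ctx:
            problems.append(f"missing key: {key}")
        elif not isinstance(ctx[key], str) or ctx[key] not in allowed:
            problems.append(f"invalid value for {key}: {ctx[key]!r}")

    # change_control is either a fixed value or "ticket:<id>" with a non-empty id.
    cc = ctx.get("change_control")
    if "change_control" not in ctx:
        problems.append("missing key: change_control")
    elif not isinstance(cc, str) or not (
        cc in FIXED_CHANGE_CONTROL
        or (cc.startswith("ticket:") and len(cc) > len("ticket:"))
    ):
        problems.append(f"invalid value for change_control: {cc!r}")
    return problems
```

An empty result means the string has all six keys with recognized values; anything else should be fixed, or set to `unknown`, before calling `scripts/safety.py`.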

Action policy

This table applies to `scripts/safety.py` responses.

Use the result to gate execution. Never silently ignore a safety check result.

| `approved` | `risk_level` | Agent action |
| --- | --- | --- |
| `true` | `low` | Proceed. No user prompt needed. |
| `true` | `medium` | Proceed. Mention the risk and recommendation to the user. |
| `false` | `medium` | Warn user with `concerns` and `recommendation`. Proceed only with explicit user confirmation. |
| `false` | `high` | Block execution. Show `concerns` and `recommendation`. Ask user for explicit override. |
| `false` | `critical` | Block execution. Show full assessment. Require user to explicitly acknowledge the risk before proceeding. |

Additional signals:

`is_destructive: true` combined with `is_reversible: false`: always surface the recommendation to the user, regardless of approval status.
If the safety check itself fails (network error, API error): warn the user that safety could not be verified. Do not silently proceed with unverified instructions.
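
The action table can be expressed as a lookup. The sketch below is illustrative: the function name and the action labels are hypothetical, and mapping any combination that the table does not list (for example `approved=true` with `risk_level=high`) to `"unverified"` is an assumption chosen to stay conservative.

```python
# (approved, risk_level) -> action label; labels are hypothetical names
# for the rows of the action policy table.
ACTION_TABLE = {
    (True, "low"): "proceed",
    (True, "medium"): "proceed_with_notice",
    (False, "medium"): "confirm_first",
    (False, "high"): "block_until_override",
    (False, "critical"): "block_until_acknowledged",
}


def decide_action(data):
    """Map the `data` object of a safety response to (action, surface_recommendation)."""
    # Only a literal true counts as approved; null or missing is treated as false.
    approved = data.get("approved") is True
    risk = data.get("risk_level")
    if not isinstance(risk, str):
        risk = None
    # Assumption: combinations outside the table are treated as unverified.
    action = ACTION_TABLE.get((approved, risk), "unverified")

    # Destructive and irreversible: always surface the recommendation.
    surface = data.get("is_destructive") is True and data.get("is_reversible") is False
    return action, surface
```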

Scripts

scripts/safety.py

`-i, --input`: required, instruction text to evaluate (whitespace-only rejected)
`-c, --context`: policy-required for state-changing instructions (CLI accepts it as optional); JSON string following the Context Contract above
`-t, --target`: policy-required for state-changing instructions (CLI accepts it as optional); concrete operation target (file path, table name, service name, URL)
`--json`: output unified JSON envelope for machine consumption

Endpoint: `https://safety-cf.modeio.ai/api/cf/safety` (override via `SAFETY_API_URL`)

Retries: automatic retry on HTTP 502/503/504 and connection/timeout errors (up to 2 retries with exponential backoff)
Request timeout: 60 seconds per attempt
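
The retry behavior described above can be illustrated with a generic loop. This is not the script's actual implementation: `call_with_retries`, the injected `send` callable, and the 1-second base delay with doubling are all assumptions. The document states only that there are up to 2 retries with exponential backoff on HTTP 502/503/504 and connection/timeout errors.

```python
import time

RETRYABLE_STATUS = {502, 503, 504}


def call_with_retries(send, max_retries=2, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on retryable failures (hypothetical sketch).

    send() must return (status_code, body) or raise ConnectionError/TimeoutError.
    """
    attempt = 0
    while True:
        error = None
        try:
            status, body = send()
            if status not in RETRYABLE_STATUS:
                return status, body
        except (ConnectionError, TimeoutError) as exc:
            error = exc

        if attempt >= max_retries:
            # Retries exhausted: re-raise the last error or return the last response.
            if error is not None:
                raise error
            return status, body

        # Assumed schedule: base_delay, then double it on each further retry.
        sleep(base_delay * (2 ** attempt))
        attempt += 1
```

With the defaults this makes at most three attempts in total, waiting 1 and then 2 seconds between them.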

```bash
python scripts/safety.py -i "Delete /tmp/cache/build-123.log" \
  -c '{"environment":"local-dev","operation_intent":"cleanup","scope":"single-resource","data_sensitivity":"internal","rollback":"easy","change_control":"none"}' \
  -t "/tmp/cache/build-123.log" --json

python scripts/safety.py -i "DROP TABLE users" \
  -c '{"environment":"production","operation_intent":"destructive","scope":"broad","data_sensitivity":"regulated","rollback":"none","change_control":"ticket:DB-9021"}' \
  -t "postgres://prod/maindb.users" --json

python scripts/safety.py -i "chmod 777 /etc/passwd" \
  -c '{"environment":"production","operation_intent":"permission-change","scope":"single-resource","data_sensitivity":"regulated","rollback":"partial","change_control":"ticket:SEC-118"}' \
  -t "/etc/passwd" --json

python scripts/safety.py -i "List all running containers and display their resource usage" --json
```

scripts/skill_safety_assessment.py

`evaluate`: authoritative v2 layered evaluator with deterministic evidence IDs, integrity fingerprinting, and risk scoring
- Native first-layer gate: GitHub metadata/README/issue-search precheck runs by default and hard-rejects on high-risk attack-demo/malware signals before local file scan.
`scan`: compatibility alias to `evaluate` for existing automation
`prompt`: renders prompt payload with script highlights and structured scan JSON
`validate`: validates model output against scan evidence IDs (`evidence_refs`), required highlights, and score/decision consistency checks
`adjudicate`: context-aware LLM adjudication bridge (prompt generation + merge decisions back into deterministic score/decision)

Context profile (optional, no user identity required):

```json
{
  "environment": "local-dev|ci|staging|production|unknown",
  "execution_mode": "read-only|build-test|install|deploy|mutating|unknown",
  "risk_tolerance": "strict|balanced|permissive",
  "data_sensitivity": "public|internal|sensitive|regulated|unknown"
}
```

```bash
# 1) Deterministic layered evaluation (v2)
python scripts/skill_safety_assessment.py evaluate --target-repo /path/to/repo --json > /tmp/skill_scan.json
python scripts/skill_safety_assessment.py evaluate --target-repo /path/to/repo --context-profile '{"environment":"ci","execution_mode":"build-test","risk_tolerance":"balanced","data_sensitivity":"internal"}' --json > /tmp/skill_scan.json
python scripts/skill_safety_assessment.py evaluate --target-repo /path/to/repo --github-osint-timeout 8 --json > /tmp/skill_scan.json
python scripts/skill_safety_assessment.py evaluate --target-repo /path/to/repo --context-profile-file ./context_profile.json --output /tmp/skill_scan.json --json

# (compat) legacy alias still supported
python scripts/skill_safety_assessment.py scan --target-repo /path/to/repo --json > /tmp/skill_scan.json

# 2) Build prompt payload with highlights + full findings (recommended for strict evidence_refs linking)
python scripts/skill_safety_assessment.py prompt --target-repo /path/to/repo --scan-file /tmp/skill_scan.json --include-full-findings

# 3) Validate model output for evidence linkage + integrity
python scripts/skill_safety_assessment.py validate --scan-file /tmp/skill_scan.json --assessment-file /tmp/assessment.md --json

# --rescan-on-validate requires --target-repo
python scripts/skill_safety_assessment.py validate --scan-file /tmp/skill_scan.json --assessment-file /tmp/assessment.md --target-repo /path/to/repo --rescan-on-validate --json

# 4) Optional adjudication bridge (LLM interprets context, engine keeps deterministic control)
python scripts/skill_safety_assessment.py adjudicate --scan-file /tmp/skill_scan.json
python scripts/skill_safety_assessment.py adjudicate --scan-file /tmp/skill_scan.json --assessment-file /tmp/adjudication.json --json
```
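
For automation, the evaluate, prompt, and validate steps above could be assembled as argument lists and run in order. The helper name `assessment_pipeline` is hypothetical; the subcommands and flags are the ones shown in the commands above, and the step where a model writes the assessment file happens outside this sketch.

```python
import sys

SCRIPT = "scripts/skill_safety_assessment.py"


def assessment_pipeline(target_repo, scan_file, assessment_file):
    """Return argv lists for evaluate -> prompt -> validate (hypothetical helper)."""
    py = sys.executable
    return [
        # 1) Deterministic evaluation, written to scan_file.
        [py, SCRIPT, "evaluate", "--target-repo", target_repo,
         "--output", scan_file, "--json"],
        # 2) Prompt payload with full findings for strict evidence_refs linking.
        [py, SCRIPT, "prompt", "--target-repo", target_repo,
         "--scan-file", scan_file, "--include-full-findings"],
        # 3) Validate the model-written assessment against the scan.
        [py, SCRIPT, "validate", "--scan-file", scan_file,
         "--assessment-file", assessment_file, "--json"],
    ]
```

Each list can be passed to `subprocess.run`. None of these commands executes code from the target repository, which keeps the flow within the static-analysis-only rule.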

Output contract

Success response (`--json`)

```json
{
  "success": true,
  "tool": "modeio-guardrail",
  "mode": "api",
  "data": {
    "approved": false,
    "risk_level": "critical",
    "risk_types": ["data loss"],
    "concerns": ["Irreversible destructive operation targeting all user data"],
    "recommendation": "Create a backup before deletion. Use staged rollback plan.",
    "is_destructive": true,
    "is_reversible": false
  }
}
```

Response fields in `data`:

| Field | Type | Values | Meaning |
| --- | --- | --- | --- |
| `approved` | `boolean` | `true` / `false` | Whether execution is recommended |
| `risk_level` | `string` | `low` / `medium` / `high` / `critical` | Severity of identified risks |
| `risk_types` | `string[]` | open-ended | Risk categories (e.g., `"data loss"`, `"injection attacks"`, `"unauthorized access"`, `"denial-of-service"`) |
| `concerns` | `string[]` | open-ended | Specific risk points in natural language |
| `recommendation` | `string` | open-ended | Suggested safer alternative or mitigation |
| `is_destructive` | `boolean` | `true` / `false` | Whether the action involves destruction (deletion, overwrite, system modification) |
| `is_reversible` | `boolean` | `true` / `false` | Whether the action can be rolled back |

Any field may be `null` if the backend could not determine it. Treat a `null` value for `approved` as `false`.
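
A caller reading the `--json` output might normalize the success envelope as follows. This is a sketch under assumptions: `read_assessment` and the extra `unverified_fields` key are hypothetical additions for illustration; the field names are the ones listed for `data`.

```python
import json

DATA_FIELDS = (
    "approved", "risk_level", "risk_types", "concerns",
    "recommendation", "is_destructive", "is_reversible",
)


def read_assessment(stdout_text):
    """Parse a success envelope and apply the null-handling rule (hypothetical helper)."""
    envelope = json.loads(stdout_text)
    if envelope.get("success") is not True:
        raise ValueError("not a success envelope")
    data = envelope.get("data") or {}

    result = {name: data.get(name) for name in DATA_FIELDS}
    # A null or missing `approved` is treated as false.
    result["approved"] = result["approved"] is True
    # Record which fields the backend could not determine.
    result["unverified_fields"] = [n for n in DATA_FIELDS if data.get(n) is None]
    return result
```

A non-empty `unverified_fields` list is a signal to warn the user that the response was incomplete instead of silently proceeding.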

Failure envelope (`--json`)

```json
{
  "success": false,
  "tool": "modeio-guardrail",
  "mode": "api",
  "error": {
    "type": "network_error",
    "message": "safety request failed: ConnectionError"
  }
}
```

Error types: `validation_error` (empty input), `dependency_error` (missing local package such as `requests`), `network_error` (HTTP/connection failure), `api_error` (backend returned error payload).

Exit code is non-zero on any failure.

Failure policy

Safety verification failures must never be silently ignored.

Network/API error: Tell the user the safety check could not be completed. Present the original instruction and ask whether to proceed without verification.
Validation error (empty input): Fix the input and retry before executing anything.
Unexpected response (null or missing fields): Treat as unverified. Warn the user.
Never assume an instruction is safe because the check failed to run.
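
The failure policy can be sketched as a mapping from the failure envelope's `error.type` to a next step. The function name and the action labels are hypothetical. Treating any other type (including `dependency_error`, which the policy above does not mention) as unverified is an assumption.

```python
# error.type -> next step; labels are hypothetical names for the policy rules.
FAILURE_ACTIONS = {
    "validation_error": "fix_input_and_retry",
    "network_error": "ask_user_before_proceeding",
    "api_error": "ask_user_before_proceeding",
}


def handle_failure(envelope, instruction):
    """Return (action, message) for a failure envelope; never returns 'proceed'."""
    error = envelope.get("error") or {}
    error_type = error.get("type")
    # Assumption: unlisted or missing types are treated as unverified.
    action = FAILURE_ACTIONS.get(error_type, "treat_as_unverified")
    message = (
        f"Safety check could not be completed ({error_type or 'unknown'}): "
        f"{error.get('message', 'no details')}. Instruction: {instruction!r}"
    )
    return action, message
```

The message always includes the original instruction, so the user can decide whether to proceed without verification.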

Skill Safety Assessment policy (static prompt contract)

Use `prompts/static_repo_scan.md` as the strict contract.
Run `scripts/skill_safety_assessment.py evaluate` first (or the `scan` compatibility alias) and pass its highlights into the prompt input.
When model output must include strict `evidence_refs`, render prompt input with `--include-full-findings` so scan evidence IDs and snippets are available in `SCRIPT_SCAN_JSON`.
Every finding must include `path:line` evidence, an exact snippet quote, and `evidence_refs` linked to scan evidence IDs.
Always include all required highlight evidence IDs from scan output in final findings.
Keep decision/score consistent with referenced evidence severity and coverage constraints.
Use `adjudicate` when context interpretation is required (docs/examples/tests vs runtime/install paths).
Return one of: `reject`, `caution`, or `approve`.
If coverage is partial or evidence is insufficient, return `caution` with an explicit coverage note.
Include a prioritized remediation plan so users can fix and re-scan quickly.
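
Several of these rules are mechanical and can be pre-checked before running `validate`. The sketch below is illustrative only and is not how the `validate` subcommand works: `check_findings`, the `coverage_complete` flag, and the evidence ID format in the usage note are hypothetical. Only the `evidence_refs` field and the three decision values come from this document.

```python
VALID_DECISIONS = ("reject", "caution", "approve")


def check_findings(required_ids, findings, decision, coverage_complete=True):
    """Return a list of policy problems in a draft assessment (hypothetical helper)."""
    problems = []
    if decision not in VALID_DECISIONS:
        problems.append(f"invalid decision: {decision!r}")
    if not coverage_complete and decision != "caution":
        problems.append("partial coverage requires a caution decision")

    referenced = set()
    for index, finding in enumerate(findings):
        refs = finding.get("evidence_refs") or []
        if not refs:
            # Every finding must link to at least one scan evidence ID.
            problems.append(f"finding {index} has no evidence_refs")
        referenced.update(refs)

    # All required highlight evidence IDs must appear in the final findings.
    missing = sorted(set(required_ids) - referenced)
    if missing:
        problems.append("missing required evidence IDs: " + ", ".join(missing))
    return problems
```

For example, `check_findings(["E-001"], [{"evidence_refs": ["E-001"]}], "caution")` returns an empty list, where `E-001` stands in for whatever ID format the scan output actually uses.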

When not to use

For PII redaction or anonymization: use `modeio-redact` instead.
For tasks with no executable instruction or repository target to evaluate (pure discussion, documentation, questions).
For operations that are clearly read-only (listing files, reading configs, `git status`).

Resources

`scripts/safety.py`: CLI entry point for instruction safety checks
`scripts/skill_safety_assessment.py`: CLI entry point for skill repo assessment (evaluate/scan/prompt/validate/adjudicate)
`prompts/static_repo_scan.md`: Skill Safety Assessment prompt contract
`ARCHITECTURE.md`: package boundaries and compatibility notes
`SAFETY_API_URL` env var: optional endpoint override (default: `https://safety-cf.modeio.ai/api/cf/safety`)
