privacy-research-engineer-safeguards

Privacy Research Engineer, Safeguards

When to Use

Frame privacy research questions for safeguard and moderation stacks
Design PII detection/redaction benchmarks — precision/recall, re-identification risk
Evaluate de-identification techniques (mask, tokenize, synthetic replace) on realistic prompts
Study memorization and extraction — can models or logs leak user content?
Curate privacy-sensitive datasets — synthetic data, consent boundaries, labeling rules
Run ablations on detector architecture, threshold, or post-processing
Define logging minimization — what safety systems may store vs must discard
Write research memos with privacy–utility trade-offs and production recommendations
Specify promotion criteria for privacy mitigations before prod rollout
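
As a rough illustration of the three de-identification styles named above (mask, tokenize, synthetic replace), the sketch below applies each one to email addresses only. The regex, the `deidentify` function, the salt, and the placeholder formats are all assumptions made for this example, not a production detector.

```python
import hashlib
import re

# Deliberately simple email pattern; an assumption for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def deidentify(text: str, mode: str, salt: str = "demo-salt") -> str:
    """Replace every email in `text` using one of three hypothetical modes."""
    counter = 0

    def repl(match: re.Match) -> str:
        nonlocal counter
        if mode == "mask":
            # Every value collapses to the same placeholder.
            return "[EMAIL]"
        if mode == "tokenize":
            # Same input maps to the same token, so records stay linkable.
            digest = hashlib.sha256((salt + match.group(0)).encode()).hexdigest()
            return f"<EMAIL_{digest[:8]}>"
        if mode == "synthetic":
            # Fresh fake value per occurrence; linkage is lost on purpose.
            counter += 1
            return f"user{counter}@example.com"
        raise ValueError(f"unknown mode: {mode}")

    return EMAIL_RE.sub(repl, text)
```

The modes trade utility for risk differently: tokenization keeps repeated values linkable, which helps downstream analysis but leaves some re-identification risk, while masking and synthetic replacement discard that link.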

When NOT to Use

Audit evidence pipelines for GDPR/SOC 2 attestations → `compliance-engineer`
Legal DPIA, acceptable-use policy, regulatory mapping → `ai-risk-governance`
Harm categories, jailbreak benchmarks, toxic classifiers → `ml-research-engineer-safeguards`
Deploy gateways, canaries, safety-path SLOs → `ml-infrastructure-engineer-safeguards`
Red-team attack campaigns → `ai-redteam`
Enterprise data governance architecture → `data-architect`
Human-data platform product ethics (contributor labor) → `product-management-human-data-platform`
General literature review unrelated to privacy in ML → `ai-researcher`

Related skills

| Need | Skill |
| --- | --- |
| Safety classifier research | `ml-research-engineer-safeguards` |
| Safeguard production infra | `ml-infrastructure-engineer-safeguards` |
| AI governance and DPIA framing | `ai-risk-governance` |
| Compliance controls and evidence | `compliance-engineer` |
| Data classification and lineage | `data-architect` |
| Adversarial extraction testing | `ai-redteam` |
| General research methods | `ai-researcher` |
| Human-data platform privacy | `product-management-human-data-platform` |
| Release and incident ops | `ai-lead-ops` |

Core Workflows

1. Privacy research framing

Threat model, metrics, baselines.

See references/privacy_research_framing.md.

2. PII detection and redaction research

Detectors, redaction quality, evals.

See references/pii_detection_redaction_research.md.
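
For detector evals, one common starting point is exact-match scoring over labeled character spans. The sketch below is a minimal version under that assumption; the `Span` layout and the `span_prf` name are hypothetical, and the referenced document may define matching differently (for example partial overlap).

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start, end, label), end exclusive


def span_prf(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over labeled character spans."""
    tp = len(gold & pred)  # a hit requires identical offsets and label
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gold = {(8, 21, "EMAIL"), (30, 42, "PHONE")}
pred = {(8, 21, "EMAIL"), (50, 55, "NAME")}
print(span_prf(gold, pred))  # (0.5, 0.5, 0.5)
```

Exact matching is strict: a prediction that covers the PII plus one extra character counts as both a miss and a false alarm. For redaction, where slight over-redaction is usually cheap, character-level or overlap-based scoring may be a better fit.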

3. Memorization and extraction

Leakage studies, attack surfaces.

See references/memorization_and_extraction.md.
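
One simple way to probe leakage is to plant unique canary strings in data and then check whether they resurface in model outputs or logs. The sketch below assumes verbatim substring matching; `canary_leak_rate` and the canary format are hypothetical names for illustration.

```python
from typing import Dict, Iterable, List


def canary_leak_rate(canaries: Iterable[str], outputs: Iterable[str]) -> Dict[str, object]:
    """Report which planted canary strings appear verbatim in any output."""
    canary_list: List[str] = list(canaries)
    output_list: List[str] = list(outputs)
    leaked = [c for c in canary_list if any(c in out for out in output_list)]
    rate = len(leaked) / len(canary_list) if canary_list else 0.0
    return {"leaked": leaked, "rate": rate}
```

Verbatim matching gives only a lower bound on leakage, since it misses partial or paraphrased reproduction. A fuller study would also vary prompts and sampling settings.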

4. Privacy benchmarks and datasets

Corpora, labeling, versioning.

See references/privacy_benchmarks_datasets.md.
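
To make the corpus, labeling, and versioning ideas concrete, here is a small sketch that generates synthetic labeled examples from a fixed seed and derives a version string from the content hash. The templates, the `build_benchmark` function, and the record layout are assumptions for illustration only.

```python
import hashlib
import json
import random
from typing import Dict, List

# Hypothetical templates; a real benchmark would cover more PII types and locales.
TEMPLATES = [
    "Please email {email} about the invoice.",
    "My address is {email}, reach out anytime.",
]


def build_benchmark(n: int, seed: int) -> Dict[str, object]:
    """Generate n synthetic labeled examples plus a content hash for versioning."""
    rng = random.Random(seed)  # local RNG keeps generation reproducible
    examples: List[Dict[str, object]] = []
    for i in range(n):
        email = f"user{rng.randrange(10_000)}@example.com"
        text = rng.choice(TEMPLATES).format(email=email)
        start = text.index(email)
        examples.append(
            {"id": i, "text": text, "spans": [[start, start + len(email), "EMAIL"]]}
        )
    payload = json.dumps(examples, sort_keys=True).encode()
    return {"version": hashlib.sha256(payload).hexdigest()[:12], "examples": examples}
```

Hashing the serialized examples means any change to the text or labels yields a new version string, so results tables can cite exactly which corpus they were computed on.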

5. Logging and retention minimization

Safety observability without over-collection.

See references/logging_retention_minimization.md.
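
A minimal sketch of logging minimization, assuming an allow-list policy where each permitted field carries its own retention period and everything else is dropped before storage. The field names, TTL values, and `minimize_record` function are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict

# Hypothetical policy: field name -> retention period. Anything absent is dropped.
ALLOWED_FIELDS = {
    "request_id": timedelta(days=30),
    "detector_version": timedelta(days=30),
    "pii_types_found": timedelta(days=7),
}


def minimize_record(record: Dict[str, Any], now: datetime) -> Dict[str, Any]:
    """Keep only allow-listed fields and attach an expiry timestamp to each."""
    kept: Dict[str, Any] = {}
    for field, ttl in ALLOWED_FIELDS.items():
        if field in record:
            kept[field] = {
                "value": record[field],
                "expires_at": (now + ttl).isoformat(),
            }
    return kept


raw = {"request_id": "r-1", "raw_prompt": "my SSN is ...", "pii_types_found": ["SSN"]}
print(minimize_record(raw, datetime(2024, 1, 1, tzinfo=timezone.utc)))
```

An allow-list fails closed: a newly added field such as a raw prompt is discarded by default until someone deliberately adds it to the policy.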

6. Handoff to production

Promotion bar, monitoring hooks.

See references/privacy_to_production_handoff.md.
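
A promotion bar can be expressed as explicit thresholds checked against eval metrics. The sketch below assumes three metrics and made-up bounds; `PROMOTION_BAR`, the metric names, and the numbers are placeholders, not recommended values.

```python
from typing import Dict, List, Tuple

# Hypothetical promotion bar; real thresholds come from the research memo.
PROMOTION_BAR = {
    "pii_recall": ("min", 0.95),
    "pii_precision": ("min", 0.90),
    "utility_drop": ("max", 0.02),
}


def promotion_decision(metrics: Dict[str, float]) -> Tuple[bool, List[str]]:
    """Return (go, reasons); a missing metric counts as a failure."""
    failures: List[str] = []
    for name, (kind, bound) in PROMOTION_BAR.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif kind == "min" and value < bound:
            failures.append(f"{name}: {value} < {bound}")
        elif kind == "max" and value > bound:
            failures.append(f"{name}: {value} > {bound}")
    return (not failures, failures)
```

Treating a missing metric as a failure keeps the gate from passing a mitigation simply because an eval was never run.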

Outputs

Threat model — assets, adversaries, failure modes for privacy in safeguards
Benchmark spec — PII types, locales, adversarial variants
Results table — detection/redaction metrics by slice (language, format)
Leakage study report — methodology, findings, confidence
Logging policy draft — fields allowed, TTL, access controls (engineering input to legal)
Promotion recommendation — go/no-go with privacy–utility summary
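
The results table by slice could be produced along these lines. This is a sketch that assumes one row per gold PII span, tagged with a slice name and whether the detector caught it; `recall_by_slice` and the slice naming are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def recall_by_slice(rows: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """rows are (slice_name, detected) pairs, one per gold PII span."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for slice_name, detected in rows:
        totals[slice_name] += 1
        hits[slice_name] += int(detected)
    return {name: hits[name] / totals[name] for name in totals}


rows = [("en/email", True), ("en/email", True), ("de/phone", True), ("de/phone", False)]
print(recall_by_slice(rows))  # {'en/email': 1.0, 'de/phone': 0.5}
```

Reporting per slice rather than one aggregate number makes weak locales or formats visible instead of letting a strong majority slice hide them.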

Principles

Minimize data: collect and retain only what evaluation and operations actually need
Separate privacy from safety metrics: a low PII leak rate is not interchangeable with a low toxicity false-negative rate
Locale and format matter: a detector validated on emails in one language is not a global PII detector
Synthetic ≠ risk-free: synthetic PII can still encode patterns; document the limits
Legal review for human data: research plans involving real user content need governance sign-off
