privacy-research-engineer-safeguards

Privacy Research Engineer, Safeguards

When to Use

  • Frame privacy research questions for safeguard and moderation stacks
  • Design PII detection/redaction benchmarks — precision/recall, re-identification risk
  • Evaluate de-identification techniques (mask, tokenize, synthetic replace) on realistic prompts
  • Study memorization and extraction — can models or logs leak user content?
  • Curate privacy-sensitive datasets — synthetic data, consent boundaries, labeling rules
  • Run ablations on detector architecture, threshold, or post-processing
  • Define logging minimization — what safety systems may store vs must discard
  • Write research memos with privacy–utility trade-offs and production recommendations
  • Specify promotion criteria for privacy mitigations before prod rollout
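The three de-identification techniques named in the list above (mask, tokenize, synthetic replace) can be contrasted with a minimal Python sketch. Everything in it is an illustrative assumption rather than part of this skill: the deliberately simple `EMAIL_RE` pattern, the function names, the `demo-salt` value, and the choice of email addresses as the only PII type.

```python
import hashlib
import re

# Deliberately simple, illustrative pattern; not a production-grade email detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(text: str) -> str:
    """Replace every detected email with a fixed placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def tokenize(text: str, salt: str = "demo-salt") -> str:
    """Replace every email with a salted-hash token.

    Equal inputs map to equal tokens, so co-reference survives redaction.
    The salt here is a hypothetical demo value.
    """
    def _token(match: re.Match) -> str:
        digest = hashlib.sha256((salt + match.group(0)).encode("utf-8")).hexdigest()
        return f"<EMAIL_{digest[:8]}>"

    return EMAIL_RE.sub(_token, text)


def synthetic_replace(text: str) -> str:
    """Replace every email with a numbered synthetic address on example.com."""
    seen: dict[str, str] = {}

    def _fake(match: re.Match) -> str:
        original = match.group(0)
        if original not in seen:
            seen[original] = f"user{len(seen) + 1}@example.com"
        return seen[original]

    return EMAIL_RE.sub(_fake, text)
```

Masking discards the most information, while tokenizing and synthetic replacement keep repeated values linkable. That linkability helps utility and is also exactly what a re-identification study would want to probe.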

When NOT to Use

  • Audit evidence pipelines for GDPR/SOC 2 attestations → compliance-engineer
  • Legal DPIA, acceptable-use policy, regulatory mapping → ai-risk-governance
  • Harm categories, jailbreak benchmarks, toxic classifiers → ml-research-engineer-safeguards
  • Deploy gateways, canaries, safety-path SLOs → ml-infrastructure-engineer-safeguards
  • Red-team attack campaigns → ai-redteam
  • Enterprise data governance architecture → data-architect
  • Human-data platform product ethics (contributor labor) → product-management-human-data-platform
  • General literature review unrelated to privacy in ML → ai-researcher

Related skills

Need → Skill
  • Safety classifier research → ml-research-engineer-safeguards
  • Safeguard production infra → ml-infrastructure-engineer-safeguards
  • AI governance and DPIA framing → ai-risk-governance
  • Compliance controls and evidence → compliance-engineer
  • Data classification and lineage → data-architect
  • Adversarial extraction testing → ai-redteam
  • General research methods → ai-researcher
  • Human-data platform privacy → product-management-human-data-platform
  • Release and incident ops → ai-lead-ops

Core Workflows

1. Privacy research framing

Threat model, metrics, baselines. See references/privacy_research_framing.md.

2. PII detection and redaction research

Detectors, redaction quality, evals. See references/pii_detection_redaction_research.md.
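One way to make detector evals concrete is exact-match span scoring. The sketch below is an assumption about how such a metric could look, not the method in the referenced file: the `Span` tuple layout, the `span_prf` name, and the strict exact-match rule (same offsets and same type) are all hypothetical choices. Partial-overlap credit would be a reasonable alternative.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical span layout: (start_offset, end_offset_exclusive, pii_type)
Span = Tuple[int, int, str]


def span_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over PII spans.

    A predicted span counts as a true positive only if its offsets
    and its type both match a gold span exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For redaction research, recall is usually the metric to watch most closely, since a missed span is a leak while a false positive only costs utility.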

3. Memorization and extraction

Leakage studies, attack surfaces. See references/memorization_and_extraction.md.
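A leakage study needs some operational definition of "leaked". As a rough illustration only, the sketch below flags an output when it shares a long verbatim token run with a sensitive reference string. The whitespace tokenization, the function names, and the default `min_run=5` threshold are assumptions made for this example. Real studies typically use stronger signals than verbatim overlap.

```python
def longest_common_token_run(reference: str, output: str) -> int:
    """Length of the longest run of consecutive whitespace tokens shared verbatim."""
    ref_tokens, out_tokens = reference.split(), output.split()
    best = 0
    # prev[j] = length of the common run ending at the previous reference
    # token and output token j (classic longest-common-substring DP).
    prev = [0] * (len(out_tokens) + 1)
    for ref_tok in ref_tokens:
        curr = [0] * (len(out_tokens) + 1)
        for j, out_tok in enumerate(out_tokens, start=1):
            if ref_tok == out_tok:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def looks_leaked(reference: str, output: str, min_run: int = 5) -> bool:
    """Heuristic flag: True if at least `min_run` tokens are reproduced verbatim."""
    return longest_common_token_run(reference, output) >= min_run
```

A check like this only catches verbatim reproduction. Paraphrased or partially transformed leakage would need a separate measure.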

4. Privacy benchmarks and datasets

Corpora, labeling, versioning. See references/privacy_benchmarks_datasets.md.

5. Logging and retention minimization

Safety observability without over-collection. See references/logging_retention_minimization.md.
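Logging minimization is often easiest to reason about as an allow-list plus a retention deadline. The following is a minimal sketch under stated assumptions: the field names in `ALLOWED_FIELDS`, the 30-day `RETENTION` value, and both function names are hypothetical and would come from the actual logging policy in practice.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical allow-list: only these fields may be stored by the safety system.
ALLOWED_FIELDS = {"request_id", "detector_version", "pii_types_found", "decision"}
# Hypothetical retention window (TTL).
RETENTION = timedelta(days=30)


def minimize(event: dict, now: datetime) -> dict:
    """Keep only allow-listed fields and stamp the record with an expiry time."""
    record = {key: value for key, value in event.items() if key in ALLOWED_FIELDS}
    record["expires_at"] = (now + RETENTION).isoformat()
    return record


def is_expired(record: dict, now: datetime) -> bool:
    """True once the record has reached its retention deadline."""
    return now >= datetime.fromisoformat(record["expires_at"])
```

An allow-list fails closed: a new field added upstream (for example, raw prompt text) is dropped by default until someone explicitly approves storing it.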

6. Handoff to production

Promotion bar, monitoring hooks. See references/privacy_to_production_handoff.md.
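A promotion bar can be written down as explicit thresholds so that a go/no-go call is reproducible. This sketch assumes a hypothetical `PromotionBar` with three criteria (recall floor, precision floor, utility-drop ceiling). The class, the metric keys, and any threshold values passed in are illustrative, not criteria defined by this skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionBar:
    """Hypothetical promotion criteria for a privacy mitigation."""
    min_recall: float        # floor on PII detection recall
    min_precision: float     # floor on PII detection precision
    max_utility_drop: float  # ceiling on downstream task-quality regression


def promotion_decision(metrics: dict, bar: PromotionBar) -> tuple[bool, list[str]]:
    """Return (go, failures); go is True only if every criterion is met."""
    failures = []
    if metrics["recall"] < bar.min_recall:
        failures.append(f"recall {metrics['recall']:.3f} < {bar.min_recall:.3f}")
    if metrics["precision"] < bar.min_precision:
        failures.append(f"precision {metrics['precision']:.3f} < {bar.min_precision:.3f}")
    if metrics["utility_drop"] > bar.max_utility_drop:
        failures.append(f"utility_drop {metrics['utility_drop']:.3f} > {bar.max_utility_drop:.3f}")
    return (not failures, failures)
```

Returning the list of failed criteria alongside the boolean keeps the privacy and utility sides of the trade-off visible in the recommendation.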

Outputs

  • Threat model — assets, adversaries, failure modes for privacy in safeguards
  • Benchmark spec — PII types, locales, adversarial variants
  • Results table — detection/redaction metrics by slice (language, format)
  • Leakage study report — methodology, findings, confidence
  • Logging policy draft — fields allowed, TTL, access controls (engineering input to legal)
  • Promotion recommendation — go/no-go with privacy–utility summary
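The results table by slice listed above could be produced with a small aggregation like the one below. This is only a sketch: the example-record keys (`language`, `format`, `gold`, `detected`) and the `recall_by_slice` name are assumptions for illustration, and recall is used as the single reported metric to keep the example short.

```python
from collections import defaultdict


def recall_by_slice(examples):
    """Aggregate detection recall per (language, format) slice.

    Each example is a dict with hypothetical keys:
      language, format : slice labels
      gold             : number of gold PII spans in the example
      detected         : number of those gold spans the detector found
    """
    totals = defaultdict(lambda: [0, 0])  # slice -> [detected, gold]
    for example in examples:
        key = (example["language"], example["format"])
        totals[key][0] += example["detected"]
        totals[key][1] += example["gold"]
    return {
        key: (detected / gold if gold else 0.0)
        for key, (detected, gold) in sorted(totals.items())
    }
```

Pooling counts within a slice before dividing (micro-averaging) keeps long documents from being under-weighted. Averaging per-example recall would be a different, equally defensible choice.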

Principles

  • Minimize data: collect and retain only what evaluation and operations truly need
  • Separate privacy from safety metrics: a low PII leak rate is not interchangeable with a low toxicity false-negative rate
  • Locale and format matter: a detector validated on emails in one language is not a global PII detector
  • Synthetic ≠ risk-free: synthetic PII can still encode patterns; document the limits
  • Legal review for human data: research plans involving real user content need governance sign-off