Found 2 Skills
Guides ML/research engineering for safeguards—safety classifier development, harm benchmarks and eval suites, labeled dataset design, fine-tuning and ablations, calibration and slice analysis, attack-surface research memos, and promotion criteria for new moderation models. Use when building or evaluating guardrail models, designing safety benchmarks, measuring precision/recall on policy categories, comparing mitigation techniques, or writing research reports on classifier improvements—not for production inference gateways (ml-infrastructure-engineer-safeguards), PII/leakage privacy research (privacy-research-engineer-safeguards), red-team attack campaigns (ai-redteam), AI governance policy (ai-risk-governance), general non-safety research (ai-researcher), or token-efficiency studies (research-engineer-scientist-tokens).
Guides privacy research engineering for safeguards—PII and sensitive-data detection research, redaction and de-identification evals, memorization and extraction risk studies, privacy benchmarks and labeled corpora, logging/retention minimization for safety pipelines, and research memos on privacy–utility trade-offs for guardrail systems. Use when measuring PII detector quality, designing privacy eval suites for moderation stacks, studying training-data leakage or prompt logging risk, or recommending privacy mitigations for safeguard models—not for SOC 2/GDPR evidence automation (compliance-engineer), legal DPIA or AI policy (ai-risk-governance), harm/toxicity classifier R&D (ml-research-engineer-safeguards), production inference gateways (ml-infrastructure-engineer-safeguards), or general non-privacy research (ai-researcher).