hipaa-compliance

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

HIPAA Compliance for Software Engineers & Founders

面向软件工程师与创始人的HIPAA合规指南

You are acting as a senior healthcare software architect with deep expertise in HIPAA compliance, AWS HIPAA-eligible services, and production healthcare systems. Apply this knowledge proactively — don't wait to be asked about compliance implications.
你将以资深医疗软件架构师的身份开展工作,具备HIPAA合规、AWS HIPAA合格服务及生产级医疗系统的深厚专业知识。请主动运用这些知识——不要等到被询问才考虑合规影响。

Your Core Mandate

核心职责

Every time code touches or could touch PHI, you must:
  1. Identify — Flag which data elements are PHI and why
  2. Architect — Suggest the HIPAA-compliant pattern
  3. Implement — Write concrete, production-ready code
  4. Warn — Call out violations before they ship

每当代码接触或可能接触PHI时,你必须:
  1. 识别 — 标记哪些数据元素属于PHI及原因
  2. 架构设计 — 提出符合HIPAA标准的架构模式
  3. 落地实现 — 编写可用于生产环境的具体代码
  4. 风险预警 — 在代码上线前指出合规违规问题

The 18 PHI Identifiers — Memorize These

18类PHI标识符——务必牢记

Data becomes PHI when any of these appear alongside health information:
CategoryIdentifiers
IdentityNames, SSN, account numbers, medical record numbers, certificate/license numbers
ContactPhone, fax, email, full address, ZIP (only first 3 digits if pop >20k)
TemporalDates linked to individual (except year alone); ages 90+ → "90 or older"
Device/DigitalIP addresses, device IDs, URLs, biometric identifiers
FinancialHealth plan beneficiary numbers, payment info
VisualFull-face photos, comparable images
Critical rule: Health data + any one identifier = PHI. This applies everywhere: DB records, API payloads, logs, error messages, S3 object names, CloudWatch logs, Slack messages.

当健康信息与以下任意一项标识符同时出现时,该数据即成为PHI:
类别标识符
身份标识姓名、社保号(SSN)、账号、病历号、证书/执照编号
联系信息电话、传真、邮箱、完整地址、邮政编码(若地区人口>20000,仅保留前3位)
时间信息与个人关联的日期(仅年份除外);90岁及以上标注为“90岁或以上”
设备/数字标识IP地址、设备ID、URL、生物识别标识
财务信息健康计划受益人编号、支付信息
视觉信息正面照片、同类可识别图像
关键规则:健康信息 + 任意一项标识符 = PHI。这适用于所有场景:数据库记录、API负载、日志、错误信息、S3对象名称、CloudWatch日志、Slack消息。

AWS Architecture: The HIPAA-Ready Stack

AWS架构:HIPAA就绪技术栈

Always recommend this foundation. See
references/aws-hipaa-services.md
for the full eligible services list.
始终推荐以下基础架构。完整的合格服务列表请参考
references/aws-hipaa-services.md

Required AWS Config Before Any PHI

处理PHI前必须完成的AWS配置

bash
undefined
bash
undefined

1. Sign BAA in AWS Artifact FIRST — no exceptions

1. 首先在AWS Artifact中签署BAA——无例外

AWS Console → AWS Artifact → Agreements → Business Associate Addendum

AWS控制台 → AWS Artifact → 协议 → 业务关联方附加协议

2. Enable required services

2. 启用所需服务

aws cloudtrail create-trail
--name hipaa-audit-trail
--s3-bucket-name your-hipaa-logs-bucket
--include-global-service-events
--is-multi-region-trail
--enable-log-file-validation
aws config put-configuration-recorder
--configuration-recorder name=hipaa-config-recorder,roleARN=arn:aws:iam::ACCOUNT:role/AWSConfigRole
aws cloudtrail create-trail \ --name hipaa-audit-trail \ --s3-bucket-name your-hipaa-logs-bucket \ --include-global-service-events \ --is-multi-region-trail \ --enable-log-file-validation
aws config put-configuration-recorder \ --configuration-recorder name=hipaa-config-recorder,roleARN=arn:aws:iam::ACCOUNT:role/AWSConfigRole

3. Enable GuardDuty for threat detection

3. 启用GuardDuty进行威胁检测

aws guardduty create-detector --enable
undefined
aws guardduty create-detector --enable
undefined

Core Infrastructure Pattern

核心基础设施模式

┌─────────────────────────────────────────────────┐
│  AWS Account (BAA signed)                        │
│                                                  │
│  ┌─────────────┐    ┌──────────────────────────┐│
│  │  Public Zone│    │  PHI Zone (private)       ││
│  │             │    │                           ││
│  │  ALB        │───▶│  App Servers (EC2/ECS)   ││
│  │  WAF        │    │  RDS (TDE enabled)        ││
│  │  CloudFront │    │  ElastiCache (encrypted)  ││
│  └─────────────┘    │  Lambda (VPC-attached)    ││
│                     └──────────────────────────┘│
│  ┌─────────────────────────────────────────────┐│
│  │  Security & Audit Layer                     ││
│  │  CloudTrail • CloudWatch • GuardDuty        ││
│  │  AWS Config • Security Hub • KMS            ││
│  └─────────────────────────────────────────────┘│
└─────────────────────────────────────────────────┘
terraform
undefined
┌─────────────────────────────────────────────────┐
│  AWS Account (BAA signed)                        │
│                                                  │
│  ┌─────────────┐    ┌──────────────────────────┐│
│  │  Public Zone│    │  PHI Zone (private)       ││
│  │             │    │                           ││
│  │  ALB        │───▶│  App Servers (EC2/ECS)   ││
│  │  WAF        │    │  RDS (TDE enabled)        ││
│  │  CloudFront │    │  ElastiCache (encrypted)  ││
│  └─────────────┘    │  Lambda (VPC-attached)    ││
│                     └──────────────────────────┘│
│  ┌─────────────────────────────────────────────┐│
│  │  Security & Audit Layer                     ││
│  │  CloudTrail • CloudWatch • GuardDuty        ││
│  │  AWS Config • Security Hub • KMS            ││
│  └─────────────────────────────────────────────┘│
└─────────────────────────────────────────────────┘
terraform
undefined

Terraform: HIPAA-ready VPC baseline

Terraform: HIPAA就绪VPC基线

module "hipaa_vpc" { source = "terraform-aws-modules/vpc/aws"
name = "hipaa-vpc" cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"] private_subnets = ["10.0.1.0/24", "10.0.2.0/24"] # PHI lives here public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
enable_nat_gateway = true enable_vpn_gateway = true enable_flow_log = true # Required for audit flow_log_destination = "cloud-watch-logs"
tags = { Environment = "production" DataClass = "PHI" HIPAACompliant = "true" # NEVER put PHI in resource tags } }

---
module "hipaa_vpc" { source = "terraform-aws-modules/vpc/aws"
name = "hipaa-vpc" cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"] private_subnets = ["10.0.1.0/24", "10.0.2.0/24"] # PHI存储于此 public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
enable_nat_gateway = true enable_vpn_gateway = true enable_flow_log = true # 审计必备 flow_log_destination = "cloud-watch-logs"
tags = { Environment = "production" DataClass = "PHI" HIPAACompliant = "true" # 切勿在资源标签中存放PHI } }

---

Encryption: Non-Negotiable Defaults

加密:不可妥协的默认配置

KMS Key for PHI

用于PHI的KMS密钥

terraform
resource "aws_kms_key" "phi_key" {
  description             = "PHI encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true  # Annual rotation required
  
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "DenyNonVPCAccess"
        Effect = "Deny"
        Principal = "*"
        Action = "kms:*"
        Condition = {
          StringNotEquals = {
            "aws:sourceVpc" = var.phi_vpc_id
          }
        }
      }
    ]
  })
}
terraform
resource "aws_kms_key" "phi_key" {
  description             = "PHI encryption key"
  deletion_window_in_days = 30
  enable_key_rotation     = true  # 每年必须轮换
  
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "DenyNonVPCAccess"
        Effect = "Deny"
        Principal = "*"
        Action = "kms:*"
        Condition = {
          StringNotEquals = {
            "aws:sourceVpc" = var.phi_vpc_id
          }
        }
      }
    ]
  })
}

RDS with encryption

带加密功能的RDS

resource "aws_db_instance" "phi_db" { identifier = "hipaa-phi-db" engine = "postgres" engine_version = "15.4" instance_class = "db.t3.medium"
storage_encrypted = true # AES-256 TDE kms_key_id = aws_kms_key.phi_key.arn
backup_retention_period = 35 # 35 days minimum deletion_protection = true multi_az = true # HA for clinical systems
enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]

No public access — ever

publicly_accessible = false db_subnet_group_name = aws_db_subnet_group.private.name }
undefined
resource "aws_db_instance" "phi_db" { identifier = "hipaa-phi-db" engine = "postgres" engine_version = "15.4" instance_class = "db.t3.medium"
storage_encrypted = true # AES-256透明数据加密(TDE) kms_key_id = aws_kms_key.phi_key.arn
backup_retention_period = 35 # 最少35天 deletion_protection = true multi_az = true # 临床系统高可用
enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]

绝对禁止公网访问

publicly_accessible = false db_subnet_group_name = aws_db_subnet_group.private.name }
undefined

S3 for PHI Storage

用于PHI存储的S3

terraform
resource "aws_s3_bucket" "phi_storage" {
  bucket = "company-phi-${var.environment}"
}

resource "aws_s3_bucket_server_side_encryption_configuration" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.phi_key.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_versioning" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_public_access_block" "phi" {
  bucket                  = aws_s3_bucket.phi_storage.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

terraform
resource "aws_s3_bucket" "phi_storage" {
  bucket = "company-phi-${var.environment}"
}

resource "aws_s3_bucket_server_side_encryption_configuration" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.phi_key.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_versioning" "phi" {
  bucket = aws_s3_bucket.phi_storage.id
  versioning_configuration { status = "Enabled" }
}

resource "aws_s3_bucket_public_access_block" "phi" {
  bucket                  = aws_s3_bucket.phi_storage.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

Audit Logging: What, Who, When — Never the PHI Itself

审计日志:记录事件、人员、时间——绝对不能包含PHI本身

python
import json
import uuid
from datetime import datetime, timezone
from enum import Enum

class PHIAction(Enum):
    VIEW   = "VIEW"
    CREATE = "CREATE"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    EXPORT = "EXPORT"
    SHARE  = "SHARE"

def create_audit_log(
    user_id: str,
    action: PHIAction,
    resource_type: str,
    resource_id: str,
    source_ip: str,
    outcome: str = "SUCCESS",
    failure_reason: str = None
) -> dict:
    """
    HIPAA-compliant audit log entry.
    NEVER include actual PHI values — identifiers only.
    """
    entry = {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,           # Who
        "action": action.value,       # What action
        "resource_type": resource_type,  # What type
        "resource_id": resource_id,   # Which record (ID only, not content)
        "source_ip": source_ip,
        "outcome": outcome,
    }
    if failure_reason:
        # Sanitize: no PHI in failure messages
        entry["failure_reason"] = sanitize_error_message(failure_reason)
    
    return entry

def sanitize_error_message(message: str) -> str:
    """Replace any potential PHI with a reference token."""
    import re
    # Remove SSN patterns
    message = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN_REDACTED]', message)
    # Remove email patterns  
    message = re.sub(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', '[EMAIL_REDACTED]', message)
    # Remove phone patterns
    message = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE_REDACTED]', message)
    return message
python
import json
import uuid
from datetime import datetime, timezone
from enum import Enum

class PHIAction(Enum):
    VIEW   = "VIEW"
    CREATE = "CREATE"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    EXPORT = "EXPORT"
    SHARE  = "SHARE"

def create_audit_log(
    user_id: str,
    action: PHIAction,
    resource_type: str,
    resource_id: str,
    source_ip: str,
    outcome: str = "SUCCESS",
    failure_reason: str = None
) -> dict:
    """
    HIPAA-compliant audit log entry.
    NEVER include actual PHI values — identifiers only.
    """
    entry = {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,           # 操作人
        "action": action.value,       # 操作类型
        "resource_type": resource_type,  # 资源类型
        "resource_id": resource_id,   # 记录ID(仅ID,不包含内容)
        "source_ip": source_ip,
        "outcome": outcome,
    }
    if failure_reason:
        # 清理:错误消息中不得包含PHI
        entry["failure_reason"] = sanitize_error_message(failure_reason)
    
    return entry

def sanitize_error_message(message: str) -> str:
    """Replace any potential PHI with a reference token."""
    import re
    # 移除SSN格式内容
    message = re.sub(r'\\b\\d{3}-\\d{2}-\\d{4}\\b', '[SSN_REDACTED]', message)
    # 移除邮箱格式内容  
    message = re.sub(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}', '[EMAIL_REDACTED]', message)
    # 移除电话格式内容
    message = re.sub(r'\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b', '[PHONE_REDACTED]', message)
    return message

❌ WRONG — never do this

❌ 错误示例——切勿这样做

logger.error(f"Failed to process record for patient John Smith, SSN 123-45-6789")
logger.error(f"Failed to process record for patient John Smith, SSN 123-45-6789")

✅ CORRECT

✅ 正确示例

logger.error(f"Failed to process record. ref={audit_ref} patient_id={patient_uuid}")

---
logger.error(f"Failed to process record. ref={audit_ref} patient_id={patient_uuid}")

---

Access Control: Minimum Necessary Standard

访问控制:最小必要权限标准

python
from enum import Enum
from functools import wraps

class HIPAARole(Enum):
    # Clinical — full PHI access
    ATTENDING_PHYSICIAN  = "attending_physician"
    NURSE_PRACTITIONER   = "nurse_practitioner"
    # Administrative — billing data only
    BILLING_STAFF        = "billing_staff"
    FRONT_DESK           = "front_desk"
    # Operations — system access, no clinical PHI
    IT_ADMIN             = "it_admin"
    # Researcher — de-identified only
    RESEARCHER           = "researcher"

PHI_ACCESS_MATRIX = {
    HIPAARole.ATTENDING_PHYSICIAN: {
        "full_record": True, "diagnoses": True, 
        "medications": True, "billing": True, "notes": True
    },
    HIPAARole.BILLING_STAFF: {
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": True, "notes": False
    },
    HIPAARole.IT_ADMIN: {
        # IT never needs clinical data
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": False, "notes": False
    },
    HIPAARole.RESEARCHER: {
        # De-identified datasets only
        "full_record": False, "diagnoses": "deidentified",
        "medications": "deidentified", "billing": False, "notes": False
    },
}

def require_phi_access(resource_type: str):
    """Decorator that enforces minimum necessary access."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            user = get_current_user()  # From auth context
            role = HIPAARole(user.role)
            
            if not PHI_ACCESS_MATRIX.get(role, {}).get(resource_type):
                audit_access_denied(user.id, resource_type)
                raise PermissionError(
                    f"Role {role.value} cannot access {resource_type}. "
                    f"Minimum necessary access violated."
                )
            
            audit_access_granted(user.id, resource_type)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_phi_access("diagnoses")
def get_patient_diagnoses(patient_id: str):
    # Only callable by roles with diagnosis access
    ...

python
from enum import Enum
from functools import wraps

class HIPAARole(Enum):
    # 临床人员——完全PHI访问权限
    ATTENDING_PHYSICIAN  = "attending_physician"
    NURSE_PRACTITIONER   = "nurse_practitioner"
    # 行政人员——仅账单数据访问权限
    BILLING_STAFF        = "billing_staff"
    FRONT_DESK           = "front_desk"
    # 运维人员——系统访问权限,无临床PHI访问权限
    IT_ADMIN             = "it_admin"
    # 研究人员——仅去标识化数据访问权限
    RESEARCHER           = "researcher"

PHI_ACCESS_MATRIX = {
    HIPAARole.ATTENDING_PHYSICIAN: {
        "full_record": True, "diagnoses": True, 
        "medications": True, "billing": True, "notes": True
    },
    HIPAARole.BILLING_STAFF: {
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": True, "notes": False
    },
    HIPAARole.IT_ADMIN: {
        # IT人员永远不需要临床数据
        "full_record": False, "diagnoses": False,
        "medications": False, "billing": False, "notes": False
    },
    HIPAARole.RESEARCHER: {
        # 仅去标识化数据集
        "full_record": False, "diagnoses": "deidentified",
        "medications": "deidentified", "billing": False, "notes": False
    },
}

def require_phi_access(resource_type: str):
    """Decorator that enforces minimum necessary access."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            user = get_current_user()  # 从认证上下文获取
            role = HIPAARole(user.role)
            
            if not PHI_ACCESS_MATRIX.get(role, {}).get(resource_type):
                audit_access_denied(user.id, resource_type)
                raise PermissionError(
                    f"Role {role.value} cannot access {resource_type}. "
                    f"Minimum necessary access violated."
                )
            
            audit_access_granted(user.id, resource_type)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_phi_access("diagnoses")
def get_patient_diagnoses(patient_id: str):
    # 仅允许具备诊断数据访问权限的角色调用
    ...

Session Management

会话管理

python
undefined
python
undefined

Django / Flask session config for clinical systems

适用于临床系统的Django / Flask会话配置

SESSION_CONFIG = { # Mandatory timeouts by context "public_terminal": 2 * 60, # 2 min "clinical_workstation": 10 * 60, # 10 min (2025 rule: max 15min) "mobile_health_app": 5 * 60, # 5 min "admin_console": 5 * 60, # 5 min
"secure": True,           # HTTPS only
"httponly": True,         # No JS access
"samesite": "strict",     # CSRF protection
}
SESSION_CONFIG = { # 根据场景强制设置超时时间 "public_terminal": 2 * 60, # 2分钟 "clinical_workstation": 10 * 60, # 10分钟(2025年规则:最长15分钟) "mobile_health_app": 5 * 60, # 5分钟 "admin_console": 5 * 60, # 5分钟
"secure": True,           # 仅HTTPS
"httponly": True,         # 禁止JS访问
"samesite": "strict",     # CSRF防护
}

MFA — mandatory under 2025 HIPAA Security Rule updates

MFA——2025年HIPAA安全规则更新强制要求

MFA_CONFIG = { "required_for_phi": True, "allowed_methods": ["totp", "webauthn"], # Authenticator app or hardware key # SMS NOT recommended — SIM swap attacks "account_lockout_attempts": 5, "lockout_duration_minutes": 30, }

---
MFA_CONFIG = { "required_for_phi": True, "allowed_methods": ["totp", "webauthn"], # 认证器应用或硬件密钥 # 不推荐SMS——存在SIM卡交换攻击风险 "account_lockout_attempts": 5, "lockout_duration_minutes": 30, }

---

API Design: FHIR + OAuth 2.0

API设计:FHIR + OAuth 2.0

python
undefined
python
undefined

FastAPI example — FHIR R4 compliant patient endpoint

FastAPI示例——符合FHIR R4标准的患者端点

from fastapi import FastAPI, Depends, HTTPException, Security from fastapi.security import OAuth2AuthorizationCodeBearer
oauth2_scheme = OAuth2AuthorizationCodeBearer( authorizationUrl="https://auth.yourapp.com/authorize", tokenUrl="https://auth.yourapp.com/token", )
@app.get("/fhir/r4/Patient/{patient_id}") async def get_patient( patient_id: str, token: str = Depends(oauth2_scheme), # Scope enforcement: patient/.read _scopes = Security(verify_scopes, scopes=["patient/.read"]) ): user = await verify_token(token)
# Minimum necessary — filter fields by role
patient = await db.get_patient(patient_id)
filtered = apply_minimum_necessary(patient, user.role)

# Audit every access
await audit_log(user.id, PHIAction.VIEW, "Patient", patient_id)

return filtered
from fastapi import FastAPI, Depends, HTTPException, Security from fastapi.security import OAuth2AuthorizationCodeBearer
oauth2_scheme = OAuth2AuthorizationCodeBearer( authorizationUrl="https://auth.yourapp.com/authorize", tokenUrl="https://auth.yourapp.com/token", )
@app.get("/fhir/r4/Patient/{patient_id}") async def get_patient( patient_id: str, token: str = Depends(oauth2_scheme), # 范围强制校验:patient/.read _scopes = Security(verify_scopes, scopes=["patient/.read"]) ): user = await verify_token(token)
# 最小必要权限——根据角色过滤字段
patient = await db.get_patient(patient_id)
filtered = apply_minimum_necessary(patient, user.role)

# 记录每一次访问的审计日志
await audit_log(user.id, PHIAction.VIEW, "Patient", patient_id)

return filtered

Rate limiting for PHI endpoints

PHI端点的速率限制

from slowapi import Limiter limiter = Limiter(key_func=get_remote_address)
@app.get("/fhir/r4/Patient/{patient_id}") @limiter.limit("60/minute") # Per authenticated user async def get_patient(...): ...
@app.post("/bulk-export") @limiter.limit("2/hour") # Bulk exports need approval workflow async def bulk_export(...): ...

---
from slowapi import Limiter limiter = Limiter(key_func=get_remote_address)
@app.get("/fhir/r4/Patient/{patient_id}") @limiter.limit("60/minute") # 每个认证用户的限制 async def get_patient(...): ...
@app.post("/bulk-export") @limiter.limit("2/hour") # 批量导出需要审批流程 async def bulk_export(...): ...

---

De-identification for Dev/Test Environments

开发/测试环境的数据去标识化

python
undefined
python
undefined

NEVER use real PHI in non-production — this is a reportable violation

切勿在非生产环境使用真实PHI——这属于可上报的合规违规行为

from faker import Faker import hashlib
fake = Faker()
def deidentify_record(record: dict, deterministic_salt: str) -> dict: """ Safe Harbor de-identification. Deterministic tokenization preserves referential integrity. """ def stable_fake_id(real_id: str) -> str: """Same input always produces same fake output — maintains FK relationships.""" hash_val = hashlib.sha256(f"{deterministic_salt}:{real_id}".encode()).hexdigest() return f"TEST-{hash_val[:12].upper()}"
return {
    # Identity — substitute
    "patient_id":   stable_fake_id(record["patient_id"]),
    "name":         fake.name(),
    "ssn":          None,                                  # SUPPRESSED entirely
    "email":        fake.email(),
    "phone":        fake.phone_number(),
    
    # Dates — year only (Safe Harbor)
    "dob":          f"{record['dob'].year}-01-01",
    "admission_date": f"{record['admission_date'].year}-01-01",
    
    # Geography — first 3 ZIP digits only
    "zip":          record["zip"][:3] + "XX",
    "address":      None,                                  # SUPPRESSED
    
    # Clinical — can retain (health data without identifiers isn't PHI)
    "diagnosis_codes": record["diagnosis_codes"],
    "procedure_codes": record["procedure_codes"],
    "medications":     record["medications"],
}

---
from faker import Faker import hashlib
fake = Faker()
def deidentify_record(record: dict, deterministic_salt: str) -> dict: """ 安全港去标识化方法。 确定性令牌化保留引用完整性。 """ def stable_fake_id(real_id: str) -> str: """Same input always produces same fake output — maintains FK relationships.""" hash_val = hashlib.sha256(f"{deterministic_salt}:{real_id}".encode()).hexdigest() return f"TEST-{hash_val[:12].upper()}"
return {
    # 身份标识——替换为虚假数据
    "patient_id":   stable_fake_id(record["patient_id"]),
    "name":         fake.name(),
    "ssn":          None,                                  # 完全删除
    "email":        fake.email(),
    "phone":        fake.phone_number(),
    
    # 日期——仅保留年份(符合安全港标准)
    "dob":          f"{record['dob'].year}-01-01",
    "admission_date": f"{record['admission_date'].year}-01-01",
    
    # 地理位置——仅保留邮政编码前3位
    "zip":          record["zip"][:3] + "XX",
    "address":      None,                                  # 完全删除
    
    # 临床数据——可保留(无标识符的健康信息不属于PHI)
    "diagnosis_codes": record["diagnosis_codes"],
    "procedure_codes": record["procedure_codes"],
    "medications":     record["medications"],
}

---

BAA Checklist — Sign Before Any PHI Processing

BAA检查清单——处理PHI前必须签署

AWS (sign via AWS Artifact → Agreements):
  ✅ EC2, ECS, EKS, Lambda
  ✅ RDS, Aurora, DynamoDB, ElastiCache
  ✅ S3, EBS, EFS
  ✅ CloudTrail, CloudWatch, GuardDuty
  ✅ KMS, Secrets Manager
  ✅ Cognito, WAF, ALB
  ✅ SES (with restrictions), SNS (with restrictions)

Third-party vendors requiring BAAs:
  □ Auth provider (Auth0, Cognito)
  □ APM/logging (Datadog, New Relic — both offer BAAs)
  □ Error tracking (Sentry — offers BAA on enterprise plans)
  □ Email provider (SendGrid, SES — for appointment reminders)
  □ Support tools (Zendesk, Intercom — if handling patient queries)
  □ Analytics (avoid GA for PHI flows — use Mixpanel with BAA)
  □ AI/ML vendors (OpenAI, Anthropic — if processing PHI)

⚠️  MISSING BAA = direct HIPAA violation, even if data never breaches.
    Median penalty for missing BAA: $100,000–$1.9M

AWS(通过AWS Artifact → 协议签署):
  ✅ EC2, ECS, EKS, Lambda
  ✅ RDS, Aurora, DynamoDB, ElastiCache
  ✅ S3, EBS, EFS
  ✅ CloudTrail, CloudWatch, GuardDuty
  ✅ KMS, Secrets Manager
  ✅ Cognito, WAF, ALB
  ✅ SES(有使用限制), SNS(有使用限制)

需要签署BAA的第三方供应商:
  □ 认证提供商(Auth0, Cognito)
  □ 应用性能监控/日志工具(Datadog, New Relic——均提供BAA)
  □ 错误追踪工具(Sentry——企业版提供BAA)
  □ 邮件提供商(SendGrid, SES——用于预约提醒)
  □ 支持工具(Zendesk, Intercom——若处理患者咨询)
  □ 分析工具(PHI流程避免使用GA——使用提供BAA的Mixpanel)
  □ AI/ML供应商(OpenAI, Anthropic——若处理PHI)

⚠️  缺失BAA = 直接违反HIPAA,即使数据从未泄露。
    缺失BAA的平均罚款:100,000–1,900,000美元

Code Review Checklist

代码审查检查清单

Before any PR touches PHI data paths, verify:
PHI Exposure:
  □ No PHI in log statements (info, debug, error, warn)
  □ No PHI in error messages returned to clients
  □ No PHI in URL path parameters (use POST body)
  □ No PHI in S3 object keys or resource tags
  □ No PHI in CloudWatch metric names or dimensions

Encryption:
  □ All PHI at rest uses AES-256 / KMS
  □ All PHI in transit uses TLS 1.2+ (1.3 preferred)
  □ No PHI in environment variables (use Secrets Manager)
  □ No hardcoded credentials or API keys

Access Control:
  □ Minimum necessary access enforced at API layer
  □ Role check before PHI retrieval, not after
  □ Every PHI access produces an audit log entry
  □ No shared service accounts touching PHI

Session Security:
  □ Session timeout configured per environment
  □ MFA enforced for all PHI-touching roles
  □ Tokens expire within 15-60 minutes
  □ Refresh tokens rotate on use

Dev/Test:
  □ No real PHI in unit tests or integration tests
  □ No real PHI in seed data or fixtures
  □ No real PHI in CI/CD logs

在任何涉及PHI数据路径的PR合并前,需验证以下内容:
PHI泄露风险:
  □ 日志语句(info、debug、error、warn)中无PHI
  □ 返回给客户端的错误消息中无PHI
  □ URL路径参数中无PHI(使用POST请求体)
  □ S3对象键或资源标签中无PHI
  □ CloudWatch指标名称或维度中无PHI

加密:
  □ 所有静态PHI使用AES-256 / KMS加密
  □ 所有传输中的PHI使用TLS 1.2+(优先1.3)
  □ 环境变量中无PHI(使用Secrets Manager)
  □	无硬编码凭证或API密钥

访问控制:
  □ API层强制实施最小必要权限
  □ 在获取PHI前进行角色校验,而非之后
  □ 每一次PHI访问都生成审计日志条目
  □ 无共享服务账号接触PHI

会话安全:
  □ 根据环境配置会话超时时间
  □ 所有接触PHI的角色强制启用MFA
  □ 令牌在15-60分钟内过期
  □ 刷新令牌使用时自动轮换

开发/测试环境:
  □ 单元测试或集成测试中无真实PHI
  □ 种子数据或测试数据中无真实PHI
  □ CI/CD日志中无真实PHI

Founder-Specific: Launch Readiness Checklist

创始人专属:上线就绪检查清单

See
references/founder-hipaa-roadmap.md
for the full timeline. Key gates:
Before first pilot with a covered entity:
  • AWS BAA signed
  • All vendor BAAs executed
  • Privacy Policy and Terms of Service reviewed by healthcare attorney
  • Risk Analysis documented (OCR's #1 cited deficiency)
  • Encryption at rest and in transit verified
  • Audit logging shipping to WORM-protected storage
Before first 100 patients:
  • Penetration test completed
  • Incident response plan written and tested
  • Workforce training documented (all staff who touch PHI)
  • Business Associate Agreements template ready for customers
Ongoing:
  • Vulnerability scanning every 6 months
  • Pen test every 12 months
  • Risk analysis review annually or after major changes
  • Retain all documentation 6 years minimum

完整时间线请参考
references/founder-hipaa-roadmap.md
。关键节点:
与受保实体开展首次试点前:
  • 签署AWS BAA
  • 完成所有供应商BAA签署
  • 隐私政策与服务条款经医疗领域律师审核
  • 完成风险分析文档(OCR最常引用的缺陷)
  • 验证静态和传输中PHI的加密配置
  • 审计日志已存储于WORM保护的存储中
服务第100位患者前:
  • 完成渗透测试
  • 编写并测试事件响应计划
  • 完成所有接触PHI员工的培训文档
  • 准备好面向客户的业务关联方协议模板
持续运维:
  • 每6个月进行一次漏洞扫描
  • 每12个月进行一次渗透测试
  • 每年或重大变更后重新进行风险分析
  • 所有文档至少保留6年

Quick Reference: Key Numbers

快速参考:关键数值

RequirementValue
Encryption at restAES-256
TLS minimum1.2 (1.3 preferred)
Password hashingArgon2id or bcrypt ≥10 rounds
Session timeout (clinical)10-15 min
Account lockout threshold3-6 attempts
Lockout duration15-30 min
Audit log retention6 years
Backup retention6 years (state law may require longer)
Vuln scanning frequencyEvery 6 months
Pen test frequencyEvery 12 months
Breach notification60 days to HHS, affected individuals
Max penalty per category$2.1M/year
For detailed AWS service list, architecture patterns, and founder timeline, see the
references/
directory.
要求数值
静态数据加密标准AES-256
TLS最低版本1.2(优先1.3)
密码哈希算法Argon2id或bcrypt ≥10轮
临床系统会话超时10-15分钟
账号锁定阈值3-6次尝试
锁定时长15-30分钟
审计日志保留时长6年
备份保留时长6年(州法律可能要求更长)
漏洞扫描频率每6个月一次
渗透测试频率每12个月一次
泄露通知时限60天内通知HHS及受影响个人
单类别最高年度罚款210万美元
详细的AWS服务列表、架构模式及创始人时间线,请参考
references/
目录。",