unity-catalog-governance
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseUnity Catalog Governance Skill
Unity Catalog治理技能
Overview
概述
Unity Catalog provides centralized governance for data and AI assets across Databricks workspaces. This skill covers governance patterns, security models, access control, and compliance frameworks.
Unity Catalog为Databricks工作区中的数据和AI资产提供集中式治理。本技能涵盖治理模式、安全模型、访问控制以及合规框架。
When to Use This Skill
何时使用本技能
Use this skill when you need to:
- Design Unity Catalog security architecture
- Implement RBAC/ABAC access controls
- Configure row-level security and column masking
- Set up data classification and tagging
- Implement compliance requirements (GDPR, HIPAA, SOC2)
- Audit permissions and access patterns
当你需要以下操作时,可使用本技能:
- 设计Unity Catalog安全架构
- 实现RBAC/ABAC访问控制
- 配置行级安全与列掩码
- 建立数据分类与标签体系
- 满足合规要求(GDPR、HIPAA、SOC2)
- 审计权限与访问模式
Core Concepts
核心概念
Three-Level Namespace
三级命名空间
catalog.schema.table
production.customers.profilescatalog.schema.table
production.customers.profilesSecurable Objects
可管控对象
- Metastore: Top-level container
- Catalog: Database container
- Schema: Table container
- Table/View: Data objects
- Volume: File storage
- Function: Callable routines
- External Location: External storage
- Storage Credential: Cloud credentials
- Metastore:顶层容器
- Catalog:数据库容器
- Schema:表容器
- Table/View:数据对象
- Volume:文件存储
- Function:可调用例程
- External Location:外部存储
- Storage Credential:云凭证
Privilege Model
权限模型
sql
-- Catalog privileges
USE CATALOG, CREATE SCHEMA, USE SCHEMA, CREATE TABLE,
CREATE FUNCTION, CREATE VOLUME, ALL PRIVILEGES, OWNERSHIP
-- Schema privileges
USE SCHEMA, CREATE TABLE, CREATE FUNCTION, CREATE VOLUME,
SELECT, MODIFY, READ FILES, WRITE FILES
-- Table privileges
SELECT, MODIFY, READ METADATA
-- Function privileges
EXECUTE
-- Volume privileges
READ FILES, WRITE FILESsql
-- Catalog privileges
USE CATALOG, CREATE SCHEMA, USE SCHEMA, CREATE TABLE,
CREATE FUNCTION, CREATE VOLUME, ALL PRIVILEGES, OWNERSHIP
-- Schema privileges
USE SCHEMA, CREATE TABLE, CREATE FUNCTION, CREATE VOLUME,
SELECT, MODIFY, READ FILES, WRITE FILES
-- Table privileges
SELECT, MODIFY, READ METADATA
-- Function privileges
EXECUTE
-- Volume privileges
READ FILES, WRITE FILESGovernance Patterns
治理模式
Pattern 1: Environment Isolation
模式1:环境隔离
sql
-- Separate catalogs per environment
CREATE CATALOG IF NOT EXISTS dev;
CREATE CATALOG IF NOT EXISTS staging;
CREATE CATALOG IF NOT EXISTS production;
-- Grant appropriate access
GRANT USE CATALOG, CREATE SCHEMA ON CATALOG dev TO `developers`;
GRANT USE CATALOG, USE SCHEMA ON CATALOG staging TO `testers`;
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG production TO `analysts`;sql
-- Separate catalogs per environment
CREATE CATALOG IF NOT EXISTS dev;
CREATE CATALOG IF NOT EXISTS staging;
CREATE CATALOG IF NOT EXISTS production;
-- Grant appropriate access
GRANT USE CATALOG, CREATE SCHEMA ON CATALOG dev TO `developers`;
GRANT USE CATALOG, USE SCHEMA ON CATALOG staging TO `testers`;
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG production TO `analysts`;Pattern 2: Data Domain Organization
模式2:数据域组织
sql
-- Organize by business domain
CREATE CATALOG business_domains;
CREATE SCHEMA business_domains.customer_domain;
CREATE SCHEMA business_domains.financial_domain;
CREATE SCHEMA business_domains.product_domain;
-- Assign domain ownership
ALTER SCHEMA business_domains.customer_domain
SET OWNER TO `customer_data_team`;
-- Domain-specific access
GRANT SELECT ON SCHEMA business_domains.customer_domain TO `customer_analytics`;
GRANT MODIFY ON SCHEMA business_domains.customer_domain TO `customer_engineering`;sql
-- Organize by business domain
CREATE CATALOG business_domains;
CREATE SCHEMA business_domains.customer_domain;
CREATE SCHEMA business_domains.financial_domain;
CREATE SCHEMA business_domains.product_domain;
-- Assign domain ownership
ALTER SCHEMA business_domains.customer_domain
SET OWNER TO `customer_data_team`;
-- Domain-specific access
GRANT SELECT ON SCHEMA business_domains.customer_domain TO `customer_analytics`;
GRANT MODIFY ON SCHEMA business_domains.customer_domain TO `customer_engineering`;Pattern 3: Medallion with Security
模式3:带安全管控的Medallion架构
sql
-- Bronze: Raw data - limited access
CREATE SCHEMA production.bronze;
GRANT USE SCHEMA, SELECT, MODIFY ON SCHEMA production.bronze TO `data_engineers`;
-- Silver: Refined data - broader access
CREATE SCHEMA production.silver;
GRANT USE SCHEMA, SELECT ON SCHEMA production.silver TO `data_analysts`;
GRANT MODIFY ON SCHEMA production.silver TO `data_engineers`;
-- Gold: Business data - wide access
CREATE SCHEMA production.gold;
GRANT USE SCHEMA, SELECT ON SCHEMA production.gold TO `business_users`;
GRANT MODIFY ON SCHEMA production.gold TO `analytics_engineers`;sql
-- Bronze: Raw data - limited access
CREATE SCHEMA production.bronze;
GRANT USE SCHEMA, SELECT, MODIFY ON SCHEMA production.bronze TO `data_engineers`;
-- Silver: Refined data - broader access
CREATE SCHEMA production.silver;
GRANT USE SCHEMA, SELECT ON SCHEMA production.silver TO `data_analysts`;
GRANT MODIFY ON SCHEMA production.silver TO `data_engineers`;
-- Gold: Business data - wide access
CREATE SCHEMA production.gold;
GRANT USE SCHEMA, SELECT ON SCHEMA production.gold TO `business_users`;
GRANT MODIFY ON SCHEMA production.gold TO `analytics_engineers`;Pattern 4: Row-Level Security
模式4:行级安全
sql
-- Regional data access control
CREATE FUNCTION governance.regional_filter(user_region STRING, data_region STRING)
RETURNS BOOLEAN
RETURN CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('global_access') THEN TRUE
WHEN user_region = data_region THEN TRUE
ELSE FALSE
END;
-- Apply row filter
ALTER TABLE production.customers.orders
SET ROW FILTER governance.regional_filter(current_user_region(), region);sql
-- Regional data access control
CREATE FUNCTION governance.regional_filter(user_region STRING, data_region STRING)
RETURNS BOOLEAN
RETURN CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('global_access') THEN TRUE
WHEN user_region = data_region THEN TRUE
ELSE FALSE
END;
-- Apply row filter
ALTER TABLE production.customers.orders
SET ROW FILTER governance.regional_filter(current_user_region(), region);Pattern 5: Column Masking
模式5:列掩码
sql
-- Create masking function
CREATE FUNCTION governance.mask_email(email STRING)
RETURNS STRING
RETURN CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('pii_admin') THEN email
WHEN IS_ACCOUNT_GROUP_MEMBER('pii_viewer') THEN
CONCAT(SUBSTRING(email, 1, 3), '***@', SPLIT(email, '@')[1])
ELSE 'REDACTED'
END;
-- Apply to column
ALTER TABLE production.customers.profiles
ALTER COLUMN email SET MASK governance.mask_email;sql
-- Create masking function
CREATE FUNCTION governance.mask_email(email STRING)
RETURNS STRING
RETURN CASE
WHEN IS_ACCOUNT_GROUP_MEMBER('pii_admin') THEN email
WHEN IS_ACCOUNT_GROUP_MEMBER('pii_viewer') THEN
CONCAT(SUBSTRING(email, 1, 3), '***@', SPLIT(email, '@')[1])
ELSE 'REDACTED'
END;
-- Apply to column
ALTER TABLE production.customers.profiles
ALTER COLUMN email SET MASK governance.mask_email;Pattern 6: Data Classification
模式6:数据分类
sql
-- Create classification tags
CREATE TAG IF NOT EXISTS governance.sensitivity
VALUES ('PUBLIC', 'INTERNAL', 'CONFIDENTIAL', 'RESTRICTED');
CREATE TAG IF NOT EXISTS governance.data_domain
VALUES ('CUSTOMER', 'FINANCIAL', 'EMPLOYEE', 'PRODUCT');
-- Apply classification
ALTER TABLE production.customers.profiles
SET TAGS (
'governance.sensitivity' = 'RESTRICTED',
'governance.data_domain' = 'CUSTOMER'
);
ALTER TABLE production.customers.profiles
ALTER COLUMN ssn SET TAGS ('governance.sensitivity' = 'RESTRICTED');sql
-- Create classification tags
CREATE TAG IF NOT EXISTS governance.sensitivity
VALUES ('PUBLIC', 'INTERNAL', 'CONFIDENTIAL', 'RESTRICTED');
CREATE TAG IF NOT EXISTS governance.data_domain
VALUES ('CUSTOMER', 'FINANCIAL', 'EMPLOYEE', 'PRODUCT');
-- Apply classification
ALTER TABLE production.customers.profiles
SET TAGS (
'governance.sensitivity' = 'RESTRICTED',
'governance.data_domain' = 'CUSTOMER'
);
ALTER TABLE production.customers.profiles
ALTER COLUMN ssn SET TAGS ('governance.sensitivity' = 'RESTRICTED');Security Best Practices
安全最佳实践
1. Least Privilege Access
1. 最小权限访问
sql
-- ❌ BAD: Overly permissive
GRANT ALL PRIVILEGES ON CATALOG production TO `all_users`;
-- ✅ GOOD: Minimal required permissions
GRANT USE CATALOG ON CATALOG production TO `analysts`;
GRANT USE SCHEMA ON SCHEMA production.gold TO `analysts`;
GRANT SELECT ON SCHEMA production.gold TO `analysts`;sql
-- ❌ BAD: Overly permissive
GRANT ALL PRIVILEGES ON CATALOG production TO `all_users`;
-- ✅ GOOD: Minimal required permissions
GRANT USE CATALOG ON CATALOG production TO `analysts`;
GRANT USE SCHEMA ON SCHEMA production.gold TO `analysts`;
GRANT SELECT ON SCHEMA production.gold TO `analysts`;2. Separation of Duties
2. 职责分离
python
undefinedpython
undefinedDefine distinct roles
Define distinct roles
ROLES = {
"data_consumer": ["USE CATALOG", "USE SCHEMA", "SELECT"],
"data_producer": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"],
"data_steward": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY", "ALTER"],
"platform_admin": ["ALL PRIVILEGES"] # Minimal assignments
}
undefinedROLES = {
"data_consumer": ["USE CATALOG", "USE SCHEMA", "SELECT"],
"data_producer": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"],
"data_steward": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY", "ALTER"],
"platform_admin": ["ALL PRIVILEGES"] # Minimal assignments
}
undefined3. Service Principal Ownership
3. 服务主体所有权
sql
-- Use service principals for ownership, not individuals
ALTER SCHEMA production.customers
SET OWNER TO `data-engineering-sp`;
-- Transfer from individual to service principal
ALTER TABLE production.customers.profiles
SET OWNER TO `customer-domain-sp`;sql
-- Use service principals for ownership, not individuals
ALTER SCHEMA production.customers
SET OWNER TO `data-engineering-sp`;
-- Transfer from individual to service principal
ALTER TABLE production.customers.profiles
SET OWNER TO `customer-domain-sp`;4. Audit and Monitor
4. 审计与监控
sql
-- Query audit logs
SELECT
user_identity.email,
request_params.full_name_arg as object_accessed,
action_name,
event_time
FROM system.access.audit
WHERE event_date >= CURRENT_DATE - INTERVAL 7 DAYS
AND action_name IN ('GET_TABLE', 'READ_TABLE')
ORDER BY event_time DESC;sql
-- Query audit logs
SELECT
user_identity.email,
request_params.full_name_arg as object_accessed,
action_name,
event_time
FROM system.access.audit
WHERE event_date >= CURRENT_DATE - INTERVAL 7 DAYS
AND action_name IN ('GET_TABLE', 'READ_TABLE')
ORDER BY event_time DESC;5. Regular Access Reviews
5. 定期访问评审
python
def quarterly_access_review(catalog: str):
"""Review and recertify access quarterly."""
# Get all grants
grants = spark.sql(f"SHOW GRANTS ON CATALOG {catalog}")
# Identify excessive access
excessive = grants.filter("privilege = 'ALL PRIVILEGES'")
# Find unused permissions
unused = find_unused_grants(grants)
# Generate review report
report = generate_review_report(excessive, unused)
send_to_data_stewards(report)python
def quarterly_access_review(catalog: str):
"""Review and recertify access quarterly."""
# Get all grants
grants = spark.sql(f"SHOW GRANTS ON CATALOG {catalog}")
# Identify excessive access
excessive = grants.filter("privilege = 'ALL PRIVILEGES'")
# Find unused permissions
unused = find_unused_grants(grants)
# Generate review report
report = generate_review_report(excessive, unused)
send_to_data_stewards(report)Compliance Frameworks
合规框架
GDPR Compliance Pattern
GDPR合规模式
python
class GDPRCompliance:
"""GDPR compliance implementation."""
def setup_data_inventory(self):
"""Article 30: Records of processing."""
# Tag all PII
self.classify_pii_data()
# Document processing purposes
self.document_purposes()
def implement_right_to_erasure(self):
"""Article 17: Right to erasure."""
# Create deletion workflow
self.create_deletion_api()
# Setup cascade delete
self.configure_lineage()
def enable_data_portability(self):
"""Article 20: Right to data portability."""
# Export user data
self.create_export_api()python
class GDPRCompliance:
"""GDPR compliance implementation."""
def setup_data_inventory(self):
"""Article 30: Records of processing."""
# Tag all PII
self.classify_pii_data()
# Document processing purposes
self.document_purposes()
def implement_right_to_erasure(self):
"""Article 17: Right to erasure."""
# Create deletion workflow
self.create_deletion_api()
# Setup cascade delete
self.configure_lineage()
def enable_data_portability(self):
"""Article 20: Right to data portability."""
# Export user data
self.create_export_api()HIPAA Compliance Pattern
HIPAA合规模式
python
class HIPAACompliance:
"""HIPAA compliance implementation."""
def implement_access_controls(self):
"""Technical Safeguards: Access Control."""
# Unique user identification
self.enforce_sso()
# Automatic logoff
self.configure_session_timeout()
# Audit controls
self.enable_comprehensive_logging()
def enable_encryption(self):
"""Technical Safeguards: Encryption."""
# Encryption at rest
self.enable_catalog_encryption()
# Encryption in transit
self.enforce_tls()python
class HIPAACompliance:
"""HIPAA compliance implementation."""
def implement_access_controls(self):
"""Technical Safeguards: Access Control."""
# Unique user identification
self.enforce_sso()
# Automatic logoff
self.configure_session_timeout()
# Audit controls
self.enable_comprehensive_logging()
def enable_encryption(self):
"""Technical Safeguards: Encryption."""
# Encryption at rest
self.enable_catalog_encryption()
# Encryption in transit
self.enforce_tls()Templates
模板
See directory for:
/templates/- governance-framework: Complete governance setup
- rbac-configuration: Role-based access control
- compliance-checklist: Regulatory compliance validation
- audit-procedures: Regular audit workflows
请查看目录获取:
/templates/- governance-framework:完整治理搭建方案
- rbac-configuration:角色型访问控制配置
- compliance-checklist:监管合规验证清单
- audit-procedures:定期审计流程
Examples
示例
See directory for:
/examples/- regional-access-control: Multi-region data isolation
- pii-protection: PII masking and encryption
- compliance-reporting: Automated compliance reports
请查看目录获取:
/examples/- regional-access-control:多区域数据隔离
- pii-protection:PII数据掩码与加密
- compliance-reporting:自动化合规报告
Related Skills
相关技能
- : Data classification and tagging
data-classification - : Automated compliance checks
compliance-automation - : RBAC/ABAC implementation
access-management
- :数据分类与标签
data-classification - :自动化合规检查
compliance-automation - :RBAC/ABAC实现
access-management