data-manager

Data Manager

Overview

Manage data programs, governance operations, and data reliability. This skill covers data roadmaps, stakeholder coordination, metadata stewardship, lifecycle management, monitoring, incident response, capacity planning, and SLA frameworks.

Features

  • Data roadmap planning with stakeholder alignment and delivery cadence
  • Governance operations: stewardship, access reviews, lifecycle enforcement
  • Data ops monitoring with incident response and escalation paths
  • Team KPI/SLA scorecards and operational metrics
  • Cross-functional coordination across engineers, analysts, scientists, and legal

Usage

  1. Identify the user's data management need (roadmap, governance, ops, or coordination)
  2. Follow the corresponding workflow below
  3. Produce structured outputs: roadmaps, governance policies, incident reports, or KPI dashboards

Examples

  • User: "Create a data team roadmap" Agent: Runs Program Management workflow, produces quarterly roadmap with initiatives, dependencies, and stakeholder sign-offs
  • User: "Set up data governance" Agent: Runs Governance Operations workflow, defines stewardship roles, access review cadence, and lifecycle policies
  • User: "Handle a data incident" Agent: Runs Data Ops workflow, triages severity, executes runbook, produces post-incident report with action items

When to Use

  • Own the data roadmap, stakeholder reviews, and data product delivery cadence
  • Run governance operations (stewardship, access reviews, lifecycle enforcement)
  • Establish data ops monitoring, incident response, and team KPI/SLA scorecards
  • Coordinate engineers, analysts, scientists, and legal on cross-functional data work

When NOT to Use

  • Deep platform architecture ADRs or ontology design → use data-architect or ontology-engineer
  • Hands-on warehouse SQL optimization or SCD modeling → use data-warehouse-engineer
  • ML experimentation, model evaluation, or MLOps deployment → use data-scientist
  • Cloud VPC, Kubernetes, or IaC provisioning → use infrastructure-engineer
  • Company-wide multi-team technical programs (non-data) → use technical-program-manager

Core Workflows

1. Data Program & Product Management

Responsibilities:
  • Own the data roadmap aligned to business outcomes
  • Translate stakeholder needs into data product requirements
  • Coordinate cross-functional data work (engineers, analysts, scientists, legal)
Operational cadence:
| Meeting | Frequency | Attendees | Purpose |
|---|---|---|---|
| Data Leadership Sync | Weekly | Data leads, PMs | Blockers, priorities, resource allocation |
| Stakeholder Reviews | Bi-weekly | Business sponsors | Roadmap alignment, value demonstration |
| Sprint Planning | Bi-weekly | Engineering team | Commitments, estimation, dependencies |
| Retrospectives | Monthly | Full data team | Process improvements, team health |
Data product delivery checklist:
  1. Define the business question and success criteria
  2. Identify data sources and validate availability/quality
  3. Design the data model (see the data-architect skill)
  4. Build with observability (logging, lineage, tests)
  5. Validate with stakeholders before GA
  6. Document and train consumers
  7. Monitor usage and iterate

2. Governance Operations Execution

Core activities:
| Activity | Frequency | Owner | Output |
|---|---|---|---|
| Metadata stewardship | Continuous | Data stewards | Enriched catalog, documented lineage |
| Access reviews | Quarterly | Security + owners | Approved access matrix |
| Data lifecycle enforcement | Monthly | Operations | Archived/deleted per retention policy |
| Quality SLA review | Monthly | Governance lead | Quality scorecard, remediation plan |
| Policy compliance audit | Quarterly | Audit/compliance | Gap report, remediation tickets |
Escalation paths:
  • Data incident → On-call engineer → Team lead → Director
  • Quality breach → Data steward → Governance committee → CDO
  • Access violation → Security team → Legal (if PII exposure)
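The escalation paths above can be encoded as a small lookup so that tooling or runbooks resolve the chain consistently. This is a minimal sketch: the dictionary keys (`data_incident`, `quality_breach`, `access_violation`) and the `escalation_chain` function are hypothetical names chosen for illustration, not part of the skill itself.

```python
from typing import Dict, List

# Chains copied from the escalation paths above; the keys are illustrative names.
ESCALATION_PATHS: Dict[str, List[str]] = {
    "data_incident": ["On-call engineer", "Team lead", "Director"],
    "quality_breach": ["Data steward", "Governance committee", "CDO"],
    "access_violation": ["Security team", "Legal"],
}


def escalation_chain(issue_type: str, pii_exposed: bool = False) -> List[str]:
    """Return the ordered list of contacts for an issue type."""
    chain = list(ESCALATION_PATHS[issue_type])
    # Legal joins an access-violation chain only when PII was exposed.
    if issue_type == "access_violation" and not pii_exposed:
        chain.remove("Legal")
    return chain
```

For example, `escalation_chain("access_violation", pii_exposed=True)` includes Legal, while the default call stops at the security team.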

3. Data Operations & Reliability

Monitoring stack:
| Layer | Metrics | Alert Threshold |
|---|---|---|
| Infrastructure | CPU, memory, disk, network | >80% for 5 min |
| Database | Connections, lock waits, replication lag | Replication lag >30s |
| Pipelines | Success rate, duration, row counts | <95% success rate |
| Data quality | Null rate, freshness, duplicates | SLA breach |
| Cost | Daily spend vs budget | >110% of daily budget |
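As a rough illustration of how the alert thresholds in the monitoring stack might be evaluated, the sketch below checks one snapshot of metrics against each layer's rule. The metric key names (`infra_util_pct_5min`, `replication_lag_s`, and so on) are assumptions made for this example, and the infrastructure rule is interpreted as "every sample in the last 5 minutes is above 80%".

```python
from typing import Any, Dict, List


def evaluate_alerts(metrics: Dict[str, Any]) -> List[str]:
    """Return the monitoring layers whose alert threshold is breached."""
    alerts: List[str] = []

    # Infrastructure: utilization above 80% sustained for 5 minutes.
    # Assumes the list holds all samples from the last 5 minutes.
    samples = metrics.get("infra_util_pct_5min", [])
    if samples and min(samples) > 80:
        alerts.append("infrastructure")

    # Database: replication lag above 30 seconds.
    if metrics.get("replication_lag_s", 0) > 30:
        alerts.append("database")

    # Pipelines: success rate below 95% (expressed as a fraction).
    if metrics.get("pipeline_success_rate", 1.0) < 0.95:
        alerts.append("pipelines")

    # Data quality: any SLA breach reported by upstream checks.
    if metrics.get("quality_sla_breached", False):
        alerts.append("data_quality")

    # Cost: daily spend above 110% of the daily budget.
    budget = metrics.get("daily_budget")
    if budget and metrics.get("daily_spend", 0) > 1.10 * budget:
        alerts.append("cost")

    return alerts
```

In practice these rules would usually live in the alerting tool's own configuration; the function only makes the thresholds concrete.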
Incident response phases:
  1. Detect: Alert fires or user reports issue
  2. Triage: Assess severity (P1-P4), assign owner
  3. Mitigate: Stop bleeding (rollback, redirect traffic)
  4. Resolve: Root cause fix deployed
  5. Review: Post-mortem within 48 hours for P1-P2
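The review rule in phase 5 can be made explicit with a small helper that computes the post-mortem deadline. This is a sketch under one stated assumption: the 48-hour window is counted from the time the incident was resolved, which the text above does not specify. The function name `postmortem_due` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

SEVERITIES = ("P1", "P2", "P3", "P4")


def postmortem_due(severity: str, resolved_at: datetime) -> Optional[datetime]:
    """Return the post-mortem deadline, or None when no post-mortem is required."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    # Only P1 and P2 incidents require a post-mortem within 48 hours.
    # Assumption: the clock starts at resolution time.
    if severity in ("P1", "P2"):
        return resolved_at + timedelta(hours=48)
    return None
```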

4. Metrics & SLA Framework

Data team KPIs:
| Category | Metric | Target | Measurement |
|---|---|---|---|
| Reliability | Pipeline success rate | >99% | Airflow/Dagster logs |
| Quality | Data quality score | >95% | dbt tests + Great Expectations |
| Freshness | Data latency (source → warehouse) | <4 hours | Pipeline metadata |
| Cost | Cost per TB processed | Trend down | Cloud billing |
| Productivity | Time from request to production | <2 weeks | Jira/Asana cycle time |
| Adoption | Active data consumers | Grow 10% QoQ | BI tool usage logs |
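One way to turn the KPI targets into a scorecard is to pair each metric with a pass/fail check. The sketch below assumes hypothetical metric names and units (percentages, hours, days), reads "Trend down" as a negative period-over-period change in cost per TB, and reads "Grow 10% QoQ" as growth of at least 10%.

```python
from typing import Callable, Dict

# Targets taken from the KPI table; the metric names and units are illustrative.
KPI_CHECKS: Dict[str, Callable[[float], bool]] = {
    "pipeline_success_rate_pct": lambda v: v > 99,
    "data_quality_score_pct": lambda v: v > 95,
    "data_latency_hours": lambda v: v < 4,
    "cost_per_tb_change_pct": lambda v: v < 0,       # trending down
    "request_to_production_days": lambda v: v < 14,  # under 2 weeks
    "consumer_growth_qoq_pct": lambda v: v >= 10,
}


def scorecard(measured: Dict[str, float]) -> Dict[str, bool]:
    """Map each measured KPI to True (target met) or False (target missed)."""
    return {
        name: check(measured[name])
        for name, check in KPI_CHECKS.items()
        if name in measured
    }
```

KPIs that were not measured in the period are left out of the result, so a missing value is not mistaken for a miss.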
SLA tiers:
| Tier | Description | RTO | RPO | Example |
|---|---|---|---|---|
| Tier 1 | Business-critical dashboards | 1 hour | 0 | Revenue reporting |
| Tier 2 | Operational analytics | 4 hours | 4 hours | Marketing attribution |
| Tier 3 | Research/exploratory | 24 hours | 24 hours | Ad-hoc analysis |
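The SLA tiers can be represented as data so that an outage can be checked against its tier's recovery objectives. This is a minimal sketch; the `SlaTier` class and `meets_sla` function are hypothetical names, and an outage is assumed to be described by its downtime and the window of data lost.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Dict


@dataclass(frozen=True)
class SlaTier:
    description: str
    rto: timedelta  # recovery time objective: maximum tolerated downtime
    rpo: timedelta  # recovery point objective: maximum tolerated data loss window


# Values taken from the SLA tiers table.
SLA_TIERS: Dict[int, SlaTier] = {
    1: SlaTier("Business-critical dashboards", timedelta(hours=1), timedelta(0)),
    2: SlaTier("Operational analytics", timedelta(hours=4), timedelta(hours=4)),
    3: SlaTier("Research/exploratory", timedelta(hours=24), timedelta(hours=24)),
}


def meets_sla(tier: int, downtime: timedelta, data_loss: timedelta) -> bool:
    """Return True when an outage stayed within the tier's RTO and RPO."""
    target = SLA_TIERS[tier]
    return downtime <= target.rto and data_loss <= target.rpo
```

A Tier 1 outage with any data loss fails the check, since its RPO is 0.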