datadog-automation

Datadog Automation via Rube MCP

Automate Datadog monitoring and observability operations through Composio's Datadog toolkit via Rube MCP.

Prerequisites

  • Rube MCP must be connected (RUBE_SEARCH_TOOLS available)
  • Active Datadog connection via RUBE_MANAGE_CONNECTIONS with toolkit datadog
  • Always call RUBE_SEARCH_TOOLS first to get current tool schemas

Setup

Get Rube MCP: Add https://rube.app/mcp as an MCP server in your client configuration. No API keys are needed; adding the endpoint is enough.
  1. Verify Rube MCP is available by confirming RUBE_SEARCH_TOOLS responds
  2. Call RUBE_MANAGE_CONNECTIONS with toolkit datadog
  3. If the connection is not ACTIVE, follow the returned auth link to complete Datadog authentication
  4. Confirm the connection status shows ACTIVE before running any workflows

Core Workflows

1. Query and Explore Metrics

When to use: User wants to query metric data or list available metrics
Tool sequence:
  1. DATADOG_LIST_METRICS - List available metric names [Optional]
  2. DATADOG_QUERY_METRICS - Query metric time series data [Required]
Key parameters:
  • query: Datadog metric query string (e.g., avg:system.cpu.user{host:web01})
  • from: Start timestamp (Unix epoch seconds)
  • to: End timestamp (Unix epoch seconds)
  • q: Search string for listing metrics
Pitfalls:
  • Query syntax follows Datadog's metric query format: aggregation:metric_name{tag_filters}
  • from and to are Unix epoch timestamps in seconds, not milliseconds
  • Valid aggregations: avg, sum, min, max, count
  • Tag filters use curly braces: {host:web01,env:prod}
  • Time range should not exceed Datadog's retention limits for the metric type
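The query format and timestamp rules above can be sketched as a pair of small helpers. This is only an illustration: build_metric_query and last_minutes_window are hypothetical names, not part of the toolkit, and the argument keys should be confirmed against the schema returned by RUBE_SEARCH_TOOLS.

```python
import time

# Aggregations listed in the pitfalls above.
VALID_AGGREGATIONS = {"avg", "sum", "min", "max", "count"}


def build_metric_query(aggregation, metric, tags=None):
    """Return a query string in aggregation:metric_name{tag_filters} form."""
    if aggregation not in VALID_AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {aggregation}")
    # Tags are comma-joined inside curly braces; "*" means no tag filter.
    scope = ",".join(f"{key}:{value}" for key, value in (tags or {}).items()) or "*"
    return f"{aggregation}:{metric}{{{scope}}}"


def last_minutes_window(minutes, now=None):
    """Return (from, to) as Unix epoch seconds, not milliseconds."""
    end = int(now if now is not None else time.time())
    return end - minutes * 60, end


query = build_metric_query("avg", "system.cpu.user", {"host": "web01"})
start, end = last_minutes_window(15, now=1_700_000_000)
# Assumed argument names, taken from the key parameters listed above.
arguments = {"query": query, "from": start, "to": end}
```

Passing now explicitly keeps the window reproducible; omit it to use the current time.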

2. Search and Analyze Logs

When to use: User wants to search log entries or list log indexes
Tool sequence:
  1. DATADOG_LIST_LOG_INDEXES - List available log indexes [Optional]
  2. DATADOG_SEARCH_LOGS - Search logs with query and filters [Required]
Key parameters:
  • query: Log search query using Datadog log query syntax
  • from: Start time (ISO 8601 or Unix timestamp)
  • to: End time (ISO 8601 or Unix timestamp)
  • sort: Sort order ('asc' or 'desc')
  • limit: Number of log entries to return
Pitfalls:
  • Log queries use Datadog's log search syntax: service:web status:error
  • Search is limited to retained logs within the configured retention period
  • Large result sets require pagination; check for cursor/page tokens
  • Log indexes control routing and retention; filter by index if known
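As a rough sketch of the parameters above, the helpers below assemble a log query from field:value pairs and an ISO 8601 time range. The function names are hypothetical, and the argument keys simply mirror the list above; the actual DATADOG_SEARCH_LOGS schema may differ.

```python
from datetime import datetime, timedelta, timezone


def build_log_query(**fields):
    """Join field:value pairs with spaces, e.g. service:web status:error."""
    return " ".join(f"{key}:{value}" for key, value in fields.items())


def log_search_arguments(query, end, hours=1, limit=50, sort="desc"):
    """Build an argument dict with ISO 8601 bounds ending at `end`."""
    if sort not in ("asc", "desc"):
        raise ValueError("sort must be 'asc' or 'desc'")
    start = end - timedelta(hours=hours)
    return {
        "query": query,
        "from": start.isoformat(),
        "to": end.isoformat(),
        "sort": sort,
        "limit": limit,
    }


end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
arguments = log_search_arguments(build_log_query(service="web", status="error"), end)
```

Using timezone-aware datetimes avoids ambiguity about which zone the ISO 8601 strings refer to.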

3. Manage Monitors

When to use: User wants to create, update, mute, or inspect monitors
Tool sequence:
  1. DATADOG_LIST_MONITORS - List all monitors with filters [Required]
  2. DATADOG_GET_MONITOR - Get specific monitor details [Optional]
  3. DATADOG_CREATE_MONITOR - Create a new monitor [Optional]
  4. DATADOG_UPDATE_MONITOR - Update monitor configuration [Optional]
  5. DATADOG_MUTE_MONITOR - Silence a monitor temporarily [Optional]
  6. DATADOG_UNMUTE_MONITOR - Re-enable a muted monitor [Optional]
Key parameters:
  • monitor_id: Numeric monitor ID
  • name: Monitor display name
  • type: Monitor type ('metric alert', 'service check', 'log alert', 'query alert', etc.)
  • query: Monitor query defining the alert condition
  • message: Notification message with @mentions
  • tags: Array of tag strings
  • thresholds: Alert threshold values (critical, warning, ok)
Pitfalls:
  • Monitor type must match the query type; mismatches cause creation failures
  • message supports @mentions for notifications (e.g., @slack-channel, @pagerduty)
  • Thresholds vary by monitor type; metric monitors need critical at minimum
  • Muting a monitor suppresses notifications but the monitor still evaluates
  • Monitor IDs are numeric integers
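One way to catch the threshold pitfalls before calling DATADOG_CREATE_MONITOR is a small validation step. The sketch below is an assumption-laden illustration: build_monitor_arguments is a hypothetical helper, and the flat placement of thresholds mirrors the parameter list above, while the real tool schema may nest it differently.

```python
ALLOWED_THRESHOLD_KEYS = {"critical", "warning", "ok"}


def build_monitor_arguments(name, monitor_type, query, message, thresholds, tags=None):
    """Assemble a hypothetical argument dict for creating a monitor."""
    unknown = set(thresholds) - ALLOWED_THRESHOLD_KEYS
    if unknown:
        raise ValueError(f"unknown threshold keys: {sorted(unknown)}")
    # Metric monitors need a critical threshold at minimum.
    if monitor_type == "metric alert" and "critical" not in thresholds:
        raise ValueError("metric monitors need a 'critical' threshold")
    return {
        "name": name,
        "type": monitor_type,
        "query": query,
        "message": message,
        "tags": list(tags or []),
        "thresholds": dict(thresholds),
    }


arguments = build_monitor_arguments(
    name="High CPU on prod",
    monitor_type="metric alert",
    query="avg(last_5m):avg:system.cpu.user{env:prod} > 90",
    message="CPU is high @pagerduty",
    thresholds={"critical": 90, "warning": 80},
    tags=["env:prod"],
)
```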

4. Manage Dashboards

When to use: User wants to list, view, update, or delete dashboards
Tool sequence:
  1. DATADOG_LIST_DASHBOARDS - List all dashboards [Required]
  2. DATADOG_GET_DASHBOARD - Get full dashboard definition [Optional]
  3. DATADOG_UPDATE_DASHBOARD - Update dashboard layout or widgets [Optional]
  4. DATADOG_DELETE_DASHBOARD - Remove a dashboard (irreversible) [Optional]
Key parameters:
  • dashboard_id: Dashboard identifier string
  • title: Dashboard title
  • layout_type: 'ordered' (grid) or 'free' (freeform positioning)
  • widgets: Array of widget definition objects
  • description: Dashboard description
Pitfalls:
  • Dashboard IDs are alphanumeric strings (e.g., 'abc-def-ghi'), not numeric
  • layout_type cannot be changed after creation; must recreate the dashboard
  • Widget definitions are complex nested objects; get existing dashboard first to understand structure
  • DELETE is permanent; there is no undo
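The "get first, then update" advice can be sketched as follows. The existing dict is an illustrative stand-in for a DATADOG_GET_DASHBOARD result, and prepare_dashboard_update is a hypothetical helper; the real response shape should be checked before relying on any of these keys.

```python
import copy


def prepare_dashboard_update(existing, **changes):
    """Return a full update payload built from a fetched dashboard definition."""
    new_layout = changes.get("layout_type", existing["layout_type"])
    if new_layout != existing["layout_type"]:
        raise ValueError("layout_type cannot be changed; recreate the dashboard")
    updated = copy.deepcopy(existing)  # leave the fetched definition untouched
    updated.update(changes)
    return updated


# Illustrative stand-in for a fetched dashboard definition.
existing = {
    "dashboard_id": "abc-def-ghi",
    "title": "Web overview",
    "layout_type": "ordered",
    "widgets": [{"definition": {"type": "note", "content": "Owned by the web team"}}],
}
note = {"definition": {"type": "note", "content": "Runbook: see team wiki"}}
payload = prepare_dashboard_update(
    existing,
    title="Web overview (prod)",
    widgets=existing["widgets"] + [note],
)
```

Starting from the fetched definition means existing widgets are carried over rather than silently dropped by a partial payload.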

5. Create Events and Manage Downtimes

When to use: User wants to post events or schedule maintenance downtimes
Tool sequence:
  1. DATADOG_LIST_EVENTS - List existing events [Optional]
  2. DATADOG_CREATE_EVENT - Post a new event [Required]
  3. DATADOG_CREATE_DOWNTIME - Schedule a maintenance downtime [Optional]
Key parameters for events:
  • title: Event title
  • text: Event body text (supports markdown)
  • alert_type: Event severity ('error', 'warning', 'info', 'success')
  • tags: Array of tag strings
Key parameters for downtimes:
  • scope: Tag scope for the downtime (e.g., host:web01)
  • start: Start time (Unix epoch)
  • end: End time (Unix epoch; omit for indefinite)
  • message: Downtime description
  • monitor_id: Specific monitor to silence (optional; omit for a scope-based downtime)
Pitfalls:
  • Event text supports Datadog's markdown format including @mentions
  • Downtime scope uses tag syntax: host:web01, env:staging
  • Omitting end creates an indefinite downtime; always set an end time for maintenance
  • A downtime with monitor_id applies to that single monitor; a scope-based downtime applies to all matching monitors
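To avoid accidentally creating an indefinite downtime, the end time can be derived from a required duration. This is a minimal sketch; build_downtime_arguments is a hypothetical helper and the argument keys are assumed from the parameter list above.

```python
import time


def build_downtime_arguments(scope, duration_minutes, message, start=None, monitor_id=None):
    """Build downtime arguments that always carry an explicit end time."""
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    start = int(start if start is not None else time.time())
    arguments = {
        "scope": scope,
        "start": start,                        # Unix epoch seconds
        "end": start + duration_minutes * 60,  # never left open-ended
        "message": message,
    }
    if monitor_id is not None:
        # Restrict the downtime to one monitor instead of every match in scope.
        arguments["monitor_id"] = int(monitor_id)
    return arguments


arguments = build_downtime_arguments(
    scope="host:web01",
    duration_minutes=30,
    message="Kernel patching",
    start=1_700_000_000,
)
```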

6. Manage Hosts and Traces

When to use: User wants to list infrastructure hosts or inspect distributed traces
Tool sequence:
  1. DATADOG_LIST_HOSTS - List all reporting hosts [Required]
  2. DATADOG_GET_TRACE_BY_ID - Get a specific distributed trace [Optional]
Key parameters:
  • filter: Host search filter string
  • sort_field: Sort hosts by field (e.g., 'name', 'apps', 'cpu')
  • sort_dir: Sort direction ('asc' or 'desc')
  • trace_id: Distributed trace ID for trace lookup
Pitfalls:
  • Host list includes all hosts reporting to Datadog within the retention window
  • Trace IDs are long numeric strings; ensure exact match
  • Hosts that stop reporting are retained for a configured period before removal

Common Patterns

Monitor Query Syntax

Metric alerts:
avg(last_5m):avg:system.cpu.user{env:prod} > 90
Log alerts:
logs("service:web status:error").index("main").rollup("count").last("5m") > 10
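The two query shapes above can be generated from their parts. The sketch below is illustrative only: the function names are hypothetical, and it covers just the metric-alert and log-alert forms shown here, not every monitor type.

```python
def metric_alert_query(metric_query, window, comparator, threshold, aggregation="avg"):
    """Format a metric alert query such as avg(last_5m):<metric query> > 90."""
    return f"{aggregation}(last_{window}):{metric_query} {comparator} {threshold}"


def log_alert_query(search, index, window, comparator, threshold):
    """Format a log alert query that counts matching logs over a window."""
    return (
        f'logs("{search}").index("{index}")'
        f'.rollup("count").last("{window}") {comparator} {threshold}'
    )


cpu_query = metric_alert_query("avg:system.cpu.user{env:prod}", "5m", ">", 90)
error_query = log_alert_query("service:web status:error", "main", "5m", ">", 10)
```

Both calls reproduce the example queries above, which makes the helpers easy to check against known-good strings.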

Tag Filtering

  • Tags use key:value format: host:web01, env:prod, service:api
  • Multiple tags: {host:web01,env:prod} (AND logic)
  • Wildcard: host:web*

Pagination

  • Use page and page_size or offset-based pagination depending on endpoint
  • Check response for total count to determine if more pages exist
  • Continue until all results are retrieved
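A page-based loop following the steps above might look like this. The response keys items and total are assumptions made for the sketch, and fake_fetch_page stands in for a real tool call; check each tool's actual response shape before reusing the loop.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect items page by page until the reported total is reached."""
    items, page = [], 0
    while True:
        response = fetch_page(page=page, page_size=page_size)
        batch = response["items"]
        items.extend(batch)
        # Stop on an empty page as well, in case the total is stale.
        if not batch or len(items) >= response["total"]:
            return items
        page += 1


def fake_fetch_page(page, page_size):
    """Stand-in for a paginated endpoint serving 250 records."""
    data = list(range(250))
    start = page * page_size
    return {"items": data[start:start + page_size], "total": len(data)}


results = fetch_all(fake_fetch_page)
```

The empty-page check guards against looping forever if records are deleted while paging.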

Known Pitfalls

Timestamps:
  • Most endpoints use Unix epoch seconds (not milliseconds)
  • Some endpoints accept ISO 8601; check tool schema
  • Time ranges should be reasonable (not years of data)
Query Syntax:
  • Metric queries: aggregation:metric{tags}
  • Log queries: field:value pairs
  • Monitor queries vary by type; check Datadog documentation
Rate Limits:
  • Datadog API has per-endpoint rate limits
  • Implement backoff on 429 responses
  • Batch operations where possible
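Backoff on 429 responses can be sketched as a retry wrapper with exponential delays. RateLimited is a hypothetical exception standing in for however your client surfaces a 429; the delays and attempt count are arbitrary starting points, not Datadog recommendations.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error representing an HTTP 429 response."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on RateLimited, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Roughly 1s, 2s, 4s, ... plus a little jitter to spread retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay / 10))


attempts = {"count": 0}


def flaky_call():
    """Fail twice with RateLimited, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited()
    return "ok"


delays = []
result = call_with_backoff(flaky_call, sleep=delays.append)
```

Injecting sleep keeps the example fast to run; in real use, leave the default time.sleep in place.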

Quick Reference

| Task | Tool Slug | Key Params |
|------|-----------|------------|
| Query metrics | DATADOG_QUERY_METRICS | query, from, to |
| List metrics | DATADOG_LIST_METRICS | q |
| Search logs | DATADOG_SEARCH_LOGS | query, from, to, limit |
| List log indexes | DATADOG_LIST_LOG_INDEXES | (none) |
| List monitors | DATADOG_LIST_MONITORS | tags |
| Get monitor | DATADOG_GET_MONITOR | monitor_id |
| Create monitor | DATADOG_CREATE_MONITOR | name, type, query, message |
| Update monitor | DATADOG_UPDATE_MONITOR | monitor_id |
| Mute monitor | DATADOG_MUTE_MONITOR | monitor_id |
| Unmute monitor | DATADOG_UNMUTE_MONITOR | monitor_id |
| List dashboards | DATADOG_LIST_DASHBOARDS | (none) |
| Get dashboard | DATADOG_GET_DASHBOARD | dashboard_id |
| Update dashboard | DATADOG_UPDATE_DASHBOARD | dashboard_id, title, widgets |
| Delete dashboard | DATADOG_DELETE_DASHBOARD | dashboard_id |
| List events | DATADOG_LIST_EVENTS | start, end |
| Create event | DATADOG_CREATE_EVENT | title, text, alert_type |
| Create downtime | DATADOG_CREATE_DOWNTIME | scope, start, end |
| List hosts | DATADOG_LIST_HOSTS | filter, sort_field |
| Get trace | DATADOG_GET_TRACE_BY_ID | trace_id |