kibana-alerting-rules
Create and manage Kibana alerting rules via REST API or Terraform. Use when creating, updating, or managing rule lifecycle (enable, disable, mute, snooze) or rules-as-code workflows.
Source: elastic/agent-skills
Install: npx skill4agent add elastic/agent-skills kibana-alerting-rules
Kibana Alerting Rules
Core Concepts
A rule has three parts: conditions (what to detect), schedule (how often to check), and actions (what
happens when conditions are met). When conditions are met, the rule creates alerts, which trigger actions via
connectors.
Authentication
All alerting API calls require either API key auth or Basic auth. Every mutating request must include the
`kbn-xsrf` header.
http
kbn-xsrf: true
Required Privileges
- `all` privileges for the appropriate Kibana feature (e.g., Stack Rules, Observability, Security)
- `read` privileges for Actions and Connectors (to attach actions to rules)
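The authentication requirements above can be sketched as a small header builder. This is a minimal illustration, not part of the skill itself: the function name `alerting_headers` and its parameters are hypothetical, and it only assembles headers without sending a request.

```python
import base64


def alerting_headers(api_key=None, basic=None, mutating=False):
    """Build headers for a Kibana alerting API call (hypothetical helper).

    api_key:  an encoded API key, sent as "ApiKey <key>".
    basic:    a (username, password) tuple for Basic auth.
    mutating: True for POST, PUT, and DELETE, which need kbn-xsrf.
    """
    if (api_key is None) == (basic is None):
        raise ValueError("provide exactly one of api_key or basic")

    if api_key is not None:
        auth = f"ApiKey {api_key}"
    else:
        user, password = basic
        token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
        auth = f"Basic {token.decode('ascii')}"

    headers = {"Authorization": auth}
    if mutating:
        # Any truthy value is accepted; "true" matches the examples here.
        headers["kbn-xsrf"] = "true"
    return headers
```

Read-only calls such as `GET /api/alerting/rule_types` can use `mutating=False`, so the `kbn-xsrf` header is only attached where it is required.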
API Reference
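Base path: `<kibana_url>/api/alerting` (or `/s/<space_id>/api/alerting` for non-default spaces).
| Operation | Method | Endpoint |
|---|---|---|
| Create rule | POST | `/api/alerting/rule/{id}` |
| Update rule | PUT | `/api/alerting/rule/{id}` |
| Get rule | GET | `/api/alerting/rule/{id}` |
| Delete rule | DELETE | `/api/alerting/rule/{id}` |
| Find rules | GET | `/api/alerting/rules/_find` |
| List rule types | GET | `/api/alerting/rule_types` |
| Enable rule | POST | `/api/alerting/rule/{id}/_enable` |
| Disable rule | POST | `/api/alerting/rule/{id}/_disable` |
| Mute all alerts | POST | `/api/alerting/rule/{id}/_mute_all` |
| Unmute all alerts | POST | `/api/alerting/rule/{id}/_unmute_all` |
| Mute alert | POST | `/api/alerting/rule/{rule_id}/alert/{alert_id}/_mute` |
| Unmute alert | POST | `/api/alerting/rule/{rule_id}/alert/{alert_id}/_unmute` |
| Update API key | POST | `/api/alerting/rule/{id}/_update_api_key` |
| Create snooze | POST | `/api/alerting/rule/{id}/snooze_schedule` |
| Delete snooze | DELETE | `/api/alerting/rule/{ruleId}/snooze_schedule/{scheduleId}` |
| Health check | GET | `/api/alerting/_health` |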
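As a rough sketch of how the base path and the space prefix combine, the helper below builds a full URL for any alerting endpoint. The name `alerting_url` is hypothetical, and treating `default` as "no prefix" is an assumption of this sketch.

```python
from urllib.parse import quote


def alerting_url(kibana_url, path, space_id=None):
    """Return the full URL for an alerting endpoint (hypothetical helper).

    path is relative to /api/alerting, e.g. "rule/abc123/_enable".
    A space_id of None or "default" uses the unprefixed base path.
    """
    base = kibana_url.rstrip("/")
    if space_id and space_id != "default":
        # Non-default spaces are addressed through /s/<space_id>/.
        base = f"{base}/s/{quote(space_id, safe='')}"
    return f"{base}/api/alerting/{path.lstrip('/')}"
```

For example, `alerting_url("https://my-kibana:5601", "rules/_find", space_id="ops")` yields a URL under `/s/ops/api/alerting/`.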
Creating a Rule
Required Fields
| Field | Type | Description |
|---|---|---|
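| `name` | string | Display name (does not need to be unique) |
| `rule_type_id` | string | The rule type (e.g., `.es-query`, `.index-threshold`) |
| `consumer` | string | Owning app (e.g., `stackAlerts`) |
| `params` | object | Rule-type-specific parameters |
| `schedule` | object | Check interval, e.g., `{"interval": "5m"}` |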
Optional Fields
| Field | Type | Description |
|---|---|---|
| `actions` | array | Actions to run when conditions are met (each references a connector) |
| `tags` | array | Tags for organizing rules |
| `enabled` | boolean | Whether the rule runs immediately (default: true) |
| string | |
| `alert_delay` | object | Alert only after N consecutive matches, e.g., `{"active": 3}` |
| `flapping` | object/null | Override flapping detection settings |
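A quick client-side check of the required fields can catch malformed payloads before they reach Kibana. The sketch below is illustrative only: the helper name `rule_payload_problems` is hypothetical, the field names are the ones used in the create example in this section, and Kibana performs its own, stricter validation.

```python
# Required top-level fields and their expected JSON types.
REQUIRED_FIELDS = {
    "name": str,
    "rule_type_id": str,
    "consumer": str,
    "params": dict,
    "schedule": dict,
}


def rule_payload_problems(rule):
    """Return a list of problems found in a create-rule payload."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in rule:
            problems.append(f"missing: {field}")
        elif not isinstance(rule[field], expected):
            problems.append(f"wrong type: {field}")
    return problems
```

An empty list means the payload has every required field with the expected basic type; it says nothing about whether `params` matches the rule type's schema.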
Example: Create an Elasticsearch Query Rule
bash
curl -X POST "https://my-kibana:5601/api/alerting/rule/my-rule-id" \
-H "kbn-xsrf: true" \
-H "Content-Type: application/json" \
-H "Authorization: ApiKey <your-api-key>" \
-d '{
"name": "High error rate",
"rule_type_id": ".es-query",
"consumer": "stackAlerts",
"schedule": { "interval": "5m" },
"params": {
"index": ["logs-*"],
"timeField": "@timestamp",
"esQuery": "{\"query\":{\"match\":{\"log.level\":\"error\"}}}",
"threshold": [100],
"thresholdComparator": ">",
"timeWindowSize": 5,
"timeWindowUnit": "m",
"size": 100
},
"actions": [
{
"id": "my-slack-connector-id",
"group": "query matched",
"params": {
"message": "Alert: {{rule.name}} - {{context.hits}} hits detected"
},
"frequency": {
"summary": false,
"notify_when": "onActionGroupChange"
}
}
],
"tags": ["production", "errors"]
}'
The same structure applies to other rule types: set the appropriate `rule_type_id` (e.g., `.index-threshold`, `.es-query`) and provide the matching `params` object. Use `GET /api/alerting/rule_types` to discover params schemas.
Updating a Rule
Update a rule with `PUT /api/alerting/rule/{id}`. The `rule_type_id` and `consumer` fields cannot be changed after creation.
Finding Rules
bash
curl -X GET "https://my-kibana:5601/api/alerting/rules/_find?per_page=20&page=1&search=cpu&sort_field=name&sort_order=asc" \
-H "Authorization: ApiKey <your-api-key>"
Query parameters: `per_page`, `page`, `search`, `default_search_operator`, `search_fields`, `sort_field`, `sort_order`, `has_reference`, `fields`, `filter`, `filter_consumers`.
Use the `filter` parameter with KQL syntax for advanced queries:
text
filter=alert.attributes.tags:"production"
Lifecycle Operations
bash
# Enable
curl -X POST ".../api/alerting/rule/{id}/_enable" -H "kbn-xsrf: true"
# Disable
curl -X POST ".../api/alerting/rule/{id}/_disable" -H "kbn-xsrf: true"
# Mute all alerts
curl -X POST ".../api/alerting/rule/{id}/_mute_all" -H "kbn-xsrf: true"
# Mute specific alert
curl -X POST ".../api/alerting/rule/{rule_id}/alert/{alert_id}/_mute" -H "kbn-xsrf: true"
# Delete
curl -X DELETE ".../api/alerting/rule/{id}" -H "kbn-xsrf: true"
Terraform Provider
Use the `elasticstack` provider resource `elasticstack_kibana_alerting_rule`.
hcl
terraform {
required_providers {
elasticstack = {
source = "elastic/elasticstack"
}
}
}
provider "elasticstack" {
kibana {
endpoints = ["https://my-kibana:5601"]
api_key = var.kibana_api_key
}
}
resource "elasticstack_kibana_alerting_rule" "cpu_alert" {
name = "CPU usage critical"
consumer = "stackAlerts"
rule_type_id = ".index-threshold"
interval = "1m"
enabled = true
params = jsonencode({
index = ["metrics-*"]
timeField = "@timestamp"
aggType = "avg"
aggField = "system.cpu.total.pct"
groupBy = "top"
termField = "host.name"
termSize = 10
threshold = [0.9]
thresholdComparator = ">"
timeWindowSize = 5
timeWindowUnit = "m"
})
tags = ["infrastructure", "production"]
}
Key Terraform notes:
- `params` must be passed as a JSON-encoded string via `jsonencode()`
- Use the `elasticstack_kibana_action_connector` data source or resource to reference connector IDs in actions
- Import existing rules: `terraform import elasticstack_kibana_alerting_rule.my_rule <space_id>/<rule_id>` (use `default` for the default space)
Triggering Kibana Workflows from Rules
Preview feature — available from Elastic Stack 9.3 and Elastic Cloud Serverless. APIs may change.
Attach a workflow as a rule action using the workflow ID as the connector ID. Set `params: {}`; alert context flows
automatically through the `event` object inside the workflow.
bash
curl -X PUT "https://my-kibana:5601/api/alerting/rule/my-rule-id" \
-H "kbn-xsrf: true" \
-H "Content-Type: application/json" \
-H "Authorization: ApiKey <your-api-key>" \
-d '{
"name": "High error rate",
"schedule": { "interval": "5m" },
"params": { ... },
"actions": [
{
"id": "<workflow-id>",
"group": "query matched",
"params": {},
"frequency": { "summary": false, "notify_when": "onActionGroupChange" }
}
]
}'
In the UI: Stack Management > Rules > Actions > Workflows. Only `enabled: true` workflows appear in the picker.
For workflow YAML structure, `{{ event }}` context fields, step types, and patterns, refer to the `kibana-connectors` skill if available.
Connectors and Actions in Rules
Each action references a connector by ID, an action `group`, action `params` (using Mustache templates), and a
per-action `frequency` object. Key fields:
- `group`: which trigger state fires this action (e.g., `"query matched"`, `"Recovered"`). Discover valid groups via `GET /api/alerting/rule_types`.
- `frequency.summary`: `true` for a digest of all alerts; `false` for per-alert.
- `frequency.notify_when`: `onActionGroupChange` | `onActiveAlert` | `onThrottleInterval`.
- `frequency.throttle`: minimum repeat interval (e.g., `"10m"`); only applies with `onThrottleInterval`.
For full reference on action structure, Mustache variables (`{{rule.name}}`, `{{context.*}}`, `{{alerts.new.count}}`),
Mustache lambdas (`EvalMath`, `FormatDate`, `ParseHjson`), recovery actions, and multi-channel patterns, refer to the
`kibana-connectors` skill if available.
Best Practices
- Set action frequency per action, not per rule. The `notify_when` field at the rule level is deprecated in favor of per-action `frequency` objects. If you set it at the rule level and later edit the rule in the Kibana UI, it is automatically converted to action-level values.
- Use alert summaries to reduce notification noise. Instead of sending one notification per alert, configure actions to send periodic summaries at a custom interval. Use `"summary": true` and set a `throttle` interval. This is especially valuable for rules that monitor many hosts or documents.
- Choose the right action frequency for each channel. Use `onActionGroupChange` for paging/ticketing systems (fire once, resolve once). Use `onActiveAlert` for audit logging to an Index connector. Use `onThrottleInterval` with a throttle like `"30m"` for dashboards or lower-priority notifications.
- Always add a recovery action. Rules without a recovery action leave incidents open in PagerDuty, Jira, and ServiceNow indefinitely. Use the connector's native close/resolve event action (e.g., `eventAction: "resolve"` for PagerDuty) in the `Recovered` action group.
- Set a reasonable check interval. The minimum recommended interval is `1m`. Very short intervals across many rules clog Task Manager throughput and increase schedule drift. The server setting `xpack.alerting.rules.minimumScheduleInterval.value` enforces this.
- Use `alert_delay` to suppress transient spikes. Setting `{"active": 3}` means the alert only fires after 3 consecutive runs match the condition, filtering out brief anomalies.
- Enable flapping detection. Alerts that rapidly switch between active and recovered are marked as "flapping" and notifications are suppressed. This is on by default but can be tuned per-rule with the `flapping` object.
- Use `server.publicBaseUrl` for deep links. Set `server.publicBaseUrl` in `kibana.yml` so that `{{rule.url}}` and `{{kibanaBaseUrl}}` variables resolve to valid URLs in notifications.
- Tag rules consistently. Use tags like `production`, `staging`, `team-platform` for filtering and organization in the Find API and UI.
- Use Kibana Spaces to isolate rules by team or environment. Prefix API paths with `/s/<space_id>/` for non-default spaces. Connectors are also space-scoped, so create matching connectors in each space.
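The per-action frequency and recovery-action practices above can be sketched as a small builder for action objects. Everything here is illustrative: `build_action` is a hypothetical helper, and the connector ID and the `"trigger"` value in the usage note are assumptions rather than values taken from this document.

```python
NOTIFY_WHEN = {"onActionGroupChange", "onActiveAlert", "onThrottleInterval"}


def build_action(connector_id, group, params,
                 notify_when="onActionGroupChange",
                 summary=False, throttle=None):
    """Build one rule action with its own frequency object."""
    if notify_when not in NOTIFY_WHEN:
        raise ValueError(f"unknown notify_when: {notify_when}")
    if notify_when == "onThrottleInterval" and throttle is None:
        raise ValueError("onThrottleInterval needs a throttle, e.g. '30m'")

    frequency = {"summary": summary, "notify_when": notify_when}
    if notify_when == "onThrottleInterval":
        # throttle only applies together with onThrottleInterval.
        frequency["throttle"] = throttle

    return {
        "id": connector_id,
        "group": group,
        "params": params,
        "frequency": frequency,
    }
```

A paging setup would then pair two actions on the same connector, for instance `build_action("pd-1", "threshold met", {"eventAction": "trigger"})` and `build_action("pd-1", "Recovered", {"eventAction": "resolve"})`, so that every opened incident also has a path to being closed.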
Common Pitfalls
- Missing `kbn-xsrf` header. All POST, PUT, DELETE requests require `kbn-xsrf: true` or any truthy value. Omitting it returns a 400 error.
- Wrong `consumer` value. Using an invalid consumer (e.g., `observability` instead of `infrastructure`) causes a 400 error. Check the rule type's supported consumers via `GET /api/alerting/rule_types`.
- Immutable fields on update. You cannot change `rule_type_id` or `consumer` with PUT. You must delete and recreate the rule.
- Rule-level `notify_when` and `throttle` are deprecated. Setting these at the rule level still works but conflicts with action-level frequency settings. Always use `frequency` inside each action object.
- Rule ID conflicts. POST to `/api/alerting/rule/{id}` with an existing ID returns 409. Either omit the ID to auto-generate, or check existence first.
- API key ownership. Rules run using the API key of the user who created or last updated them. If that user's permissions change or the user is deleted, the rule may fail silently. Use `_update_api_key` to re-associate.
- Too many actions per rule. Rules generating thousands of alerts with multiple actions can clog Task Manager. The server setting `xpack.alerting.rules.run.actions.max` (default varies) limits actions per run. Design rules to use alert summaries or limit term sizes.
- Long-running rules. Rules that run expensive queries are cancelled after `xpack.alerting.rules.run.timeout` (default `5m`). When cancelled, all alerts and actions from that run are discarded. Optimize queries or increase the timeout for specific rule types.
- Concurrent update conflicts. PUT returns 409 if the rule was modified by another user since you last read it. Always GET the latest version before updating.
- Import/export loses secrets. Rules exported via Saved Objects are disabled on import. Connectors lose their secrets and must be re-configured.
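The immutable-field and concurrent-update pitfalls suggest a read-modify-write pattern: GET the rule, derive the PUT body from it, then apply changes. The sketch below covers only the body-building step. The helper name and the `UPDATABLE_FIELDS` allow-list are assumptions of this example, not a statement of what the API accepts; check them against the API reference for your Kibana version. Actions returned by GET may also carry read-only sub-fields that this sketch does not strip.

```python
import copy

# Assumed allow-list of fields that a PUT body may carry.
UPDATABLE_FIELDS = ("name", "tags", "schedule", "params",
                    "actions", "alert_delay", "flapping")
IMMUTABLE_FIELDS = ("rule_type_id", "consumer")


def to_update_body(current_rule, changes):
    """Derive a PUT body from a freshly fetched rule plus changes."""
    for key in changes:
        if key in IMMUTABLE_FIELDS:
            raise ValueError(
                f"{key} is immutable; delete and recreate the rule")
        if key not in UPDATABLE_FIELDS:
            raise ValueError(f"{key} is not an updatable field")

    # Start from the latest server-side state so unrelated edits
    # made by other users are not overwritten with stale values.
    body = {
        key: copy.deepcopy(current_rule[key])
        for key in UPDATABLE_FIELDS
        if key in current_rule
    }
    body.update(changes)
    return body
```

If the PUT still returns 409, fetching the rule again and rebuilding the body from the newer state is the natural retry.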
Examples
Create a threshold alert: "Alert me when CPU exceeds 90% on any host for 5 minutes." Use `rule_type_id: ".index-threshold"`, `aggField: "system.cpu.total.pct"`, `threshold: [0.9]`, and `timeWindowSize: 5`. Attach a PagerDuty action on `"threshold met"` and a matching `Recovered` action to auto-close incidents.
Find rules by tag: "Show all production alerting rules." Use `GET /api/alerting/rules/_find` with `filter=alert.attributes.tags:"production"` and `sort_field=name` to page through results.
Pause a rule temporarily: "Disable rule abc123 until next Monday." Use `POST /api/alerting/rule/abc123/_disable`. Re-enable with `_enable` when ready; the rule retains all configuration while disabled.
Guidelines
- Include `kbn-xsrf: true` on every POST, PUT, and DELETE; omitting it returns 400.
- Set `frequency` inside each action object; rule-level `notify_when` and `throttle` are deprecated.
- `rule_type_id` and `consumer` are immutable after creation; delete and recreate the rule to change them.
- Prefix paths with `/s/<space_id>/api/alerting/` for non-default Kibana Spaces.
- Always pair an active action with a `Recovered` action to auto-close PagerDuty, Jira, and ServiceNow incidents.
- Run `GET /api/alerting/rule_types` first to discover valid `consumer` values and action group names.
- Use `alert_delay` to suppress transient spikes; use the `flapping` object to reduce noise from unstable conditions.
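The find-by-tag example earlier in this document involves a KQL filter that needs URL encoding. A minimal sketch of building that query string follows. The helper name `find_rules_query` is hypothetical, and it assumes the tag contains no double quotes.

```python
from urllib.parse import urlencode


def find_rules_query(tag=None, search=None, page=1, per_page=20,
                     sort_field="name", sort_order="asc"):
    """Build the path and query string for the Find rules endpoint."""
    params = {
        "page": page,
        "per_page": per_page,
        "sort_field": sort_field,
        "sort_order": sort_order,
    }
    if search:
        params["search"] = search
    if tag:
        # KQL filter on the rule's tags, as in the filter example.
        params["filter"] = f'alert.attributes.tags:"{tag}"'
    # urlencode percent-encodes the colon and quotes in the filter.
    return "/api/alerting/rules/_find?" + urlencode(params)
```

Incrementing `page` while keeping `sort_field` fixed gives a stable order when paging through results.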