cx-cost-optimization

Cost Optimization Skill

Use this skill when investigating or reducing Coralogix data costs. It covers the full cost management lifecycle: measuring current spend, reviewing TCO policies, adjusting retention periods, setting ingestion quotas, and configuring archive storage for cold data.

CLI Commands

| Command | Subcommands | Purpose |
|---|---|---|
| `cx usage` | `summary`, `daily`, `logs-count`, `spans-count`, `export-status` | Measure current data consumption |
| `cx tco` | `list`, `get`, `create`, `update`, `delete`, `reorder`, `test`, `settings`, `settings-update` | Manage TCO (Total Cost of Ownership) policies |
| `cx retentions` | `list`, `update`, `activate`, `status` | Manage data retention periods |
| `cx quotas` | `get`, `create`, `update`, `delete` | Set ingestion guardrails |
| `cx archive logs` | `get`, `set` | Configure logs archive target |
| `cx archive metrics` | `get`, `create`, `update`, `enable`, `disable`, `validate` | Configure metrics archive storage |
| `cx metrics query` | `<promql>` (positional), `--time` | Query billing and usage metrics via PromQL (instant) |
| `cx metrics query-range` | `<promql>` (positional), `--start` / `--end` | Query billing and usage metrics via PromQL (range) |

Key flags:
  • All commands support `-o json` for structured output and `-p <profile>` for profile selection
  • `cx usage daily` accepts `--type processed-gbs|units|evaluation-tokens` and `--start` / `--end` time filters
  • `cx usage summary` accepts `--start` / `--end` time filters
  • `cx tco create/update`, `cx retentions update`, `cx quotas create/update`, `cx archive logs set`, and `cx archive metrics create/update/validate` use `--from-file <path>` (or `-` for stdin)

Cost Investigation Workflow

Follow these steps to diagnose and reduce costs:

Step 1: Measure Current Usage

```bash
cx usage summary -o json
cx usage summary --start now-30d -o json
cx usage daily --type processed-gbs --start now-7d -o json
cx usage logs-count -o json
cx usage spans-count -o json
```

Identify which data types consume the most volume. Use `jq` to sort:

```bash
cx usage summary -o json | jq '[.[] | {name, daily_avg: .avg_daily_gb}] | sort_by(.daily_avg) | reverse'
```
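
A ranking alone does not show how concentrated the volume is. The following Python sketch adds each entry's share of the total. It assumes the summary output is a list of objects with `name` and `avg_daily_gb` fields, as implied by the jq filter above; the function name and the sample values are made up for illustration.

```python
import json

def volume_shares(summary):
    """Return (name, avg_daily_gb, percent_of_total) tuples, largest first."""
    total = sum(item["avg_daily_gb"] for item in summary)
    ranked = sorted(summary, key=lambda item: item["avg_daily_gb"], reverse=True)
    return [
        (
            item["name"],
            item["avg_daily_gb"],
            round(100 * item["avg_daily_gb"] / total, 1) if total else 0.0,
        )
        for item in ranked
    ]

# Illustrative data only; in practice, load the output of `cx usage summary -o json`.
sample = json.loads("""
[
  {"name": "logs", "avg_daily_gb": 120.0},
  {"name": "metrics", "avg_daily_gb": 30.0},
  {"name": "spans", "avg_daily_gb": 50.0}
]
""")

for name, gb, pct in volume_shares(sample):
    print(f"{name:10s} {gb:8.1f} GB/day {pct:5.1f}%")
```

An entry that holds most of the total is usually the first candidate for the TCO review in the next step.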

Step 2: Review TCO Policies

```bash
cx tco list -o json
cx tco settings -o json
```

TCO policies control which logs go to Frequent Search (expensive, fast) vs. Archive (cheap, slower). Check if high-volume, low-value logs are on Frequent Search:

```bash
cx tco list -o json | jq '.[] | select(.priority == "LOW") | {name, application, subsystem, archive_retention}'
```

Step 3: Check Retention Settings

```bash
cx retentions list -o json
cx retentions status -o json
```

Long retention periods increase storage costs. Identify indices with unnecessarily long retention.

Step 4: Review Quota Rules

```bash
cx quotas get -o json
```

Quota rules cap ingestion volume. If there are no quotas and you see burst ingestion, recommend adding guardrails.
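
One way to make "burst ingestion" concrete is to flag days whose volume is far above the typical day. The Python sketch below does this with a median baseline. It assumes daily usage records with `date` and `processed_gbs` fields (the names used in this document's jq examples); the function name, the threshold of 2x the median, and the sample values are arbitrary illustrative choices, not something the cx CLI defines.

```python
from statistics import median

def burst_days(daily, factor=2.0):
    """Return dates whose volume exceeds `factor` times the median daily volume."""
    if not daily:
        return []
    baseline = median(d["processed_gbs"] for d in daily)
    return [d["date"] for d in daily if d["processed_gbs"] > factor * baseline]

# Illustrative data only; in practice, load the output of
# `cx usage daily --type processed-gbs --start now-7d -o json`.
sample = [
    {"date": "2024-05-01", "processed_gbs": 100.0},
    {"date": "2024-05-02", "processed_gbs": 110.0},
    {"date": "2024-05-03", "processed_gbs": 95.0},
    {"date": "2024-05-04", "processed_gbs": 320.0},
    {"date": "2024-05-05", "processed_gbs": 105.0},
]

print(burst_days(sample))
```

The median is used instead of the mean so that the spike itself does not inflate the baseline it is compared against.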

Step 5: Check Archive Configuration

```bash
cx archive logs get -o json
cx archive metrics get -o json
```

Verify that archive storage is configured for cold data. If no archive is set up, that's a cost-saving opportunity.

Step 6: Recommend Optimizations

Based on findings, recommend changes in priority order (highest impact first).

Common Optimization Patterns

| Symptom | Diagnosis Command | Optimization |
|---|---|---|
| High-volume low-value logs | `cx usage summary -o json` | Move to archive tier via `cx tco create --from-file policy.json` |
| Long retention on cold data | `cx retentions list -o json` | Reduce retention with `cx retentions update --from-file` |
| Burst ingestion spikes | `cx usage daily -o json` | Add quota rules with `cx quotas create --from-file` |
| No cold storage configured | `cx archive logs get -o json` | Enable archive with `cx archive logs set --from-file --yes` (after user approval) |
| Expensive metrics not queried | `cx archive metrics get -o json` | Enable metrics archiving with `cx archive metrics create --from-file --yes` (after user approval) |

jq Examples

Usage Analysis

```bash
# Top consumers by daily volume
cx usage summary -o json | jq '[.[] | {name, daily_avg: .avg_daily_gb}] | sort_by(.daily_avg) | reverse | .[0:10]'

# Daily trend for the past week
cx usage daily --type processed-gbs --start now-7d -o json | jq '[.[] | {date, gb: .processed_gbs}]'

# Total logs and spans counts
cx usage logs-count -o json | jq '.total_count'
cx usage spans-count -o json | jq '.total_count'
```

TCO Policy Analysis

```bash
# Policies routing to archive tier
cx tco list -o json | jq '[.[] | select(.archive_retention != null)]'

# Policies by priority
cx tco list -o json | jq 'group_by(.priority) | map({priority: .[0].priority, count: length})'

# Test if a log pattern matches a policy
cx tco test --from-file test-definition.json -o json
```

Retention Review

```bash
# All retention settings
cx retentions list -o json | jq '.[]'

# Check if retention is active
cx retentions status -o json
```

Quota Analysis

```bash
# Current quota rules
cx quotas get -o json | jq '.rules // empty'
```

Archive Status

```bash
# Logs archive configuration
cx archive logs get -o json | jq '{active: .active, bucket: .bucket}'

# Metrics archive configuration
cx archive metrics get -o json | jq '{enabled: .enabled, bucket: .bucket}'
```

---

Applying Changes

IMPORTANT: NEVER pass `--yes` without explicit user approval.

All write operations across archive, TCO, retentions, and quotas prompt for interactive confirmation and need the `--yes` flag to run non-interactively. Before executing any write operation, describe the exact change to the user and wait for their approval before passing `--yes`.

Read-only mode: Use `--read-only` (or `CX_READ_ONLY=1`) to safely explore cost data without risk of accidental writes. All query commands (usage, tco list/get, retentions list, quotas get, archive get) work normally in read-only mode.

Agent mode: When running inside an AI agent, cx fails fast on write operations instead of hanging on a stdin prompt. Get user confirmation first, then re-run with `--yes`.

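
The approval rule above can be enforced in wrapper code rather than left to convention. The Python sketch below shows one possible gate: it only builds a command line containing `--yes` once approval has been recorded. The function names and the `approved` parameter are hypothetical and not part of the cx CLI; only the `cx tco update --from-file policy.json` command comes from this document.

```python
def describe_change(args):
    """Human-readable summary to show the user before asking for approval."""
    return "About to run: cx " + " ".join(args)

def confirmed_command(args, approved):
    """Return the argv for a cx write command, adding --yes only after approval."""
    if "--yes" in args:
        raise ValueError("pass approval via `approved`, do not hard-code --yes")
    if not approved:
        raise PermissionError("describe the change and get explicit user approval first")
    return ["cx", *args, "--yes"]

args = ["tco", "update", "--from-file", "policy.json"]
print(describe_change(args))

try:
    confirmed_command(args, approved=False)
except PermissionError as exc:
    print("blocked:", exc)

print(confirmed_command(args, approved=True))
```

Keeping `--yes` out of the caller-supplied arguments means the flag can only be added through the approval path.
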
When modifying TCO policies, retention, quotas, or archive:
  1. Template from existing: Get the current configuration as JSON, modify it, then apply:
    ```bash
    cx tco get <policy-id> -o json > policy.json
    # Edit policy.json
    cx tco update --from-file policy.json
    ```
  2. Verify after changes: Re-run the diagnosis commands to confirm the change took effect.
  3. TCO policy ordering matters: Use `cx tco reorder --from-file` to set priority order. Policies are evaluated top-to-bottom; the first match wins.
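
To see why ordering matters, the Python sketch below models top-to-bottom, first-match-wins evaluation. The matching rule (an exact match on `application` and `subsystem`, with `None` acting as a wildcard) and the sample policies are simplifying assumptions made for illustration; the real matching semantics of Coralogix TCO policies may differ.

```python
def first_matching_policy(policies, application, subsystem):
    """Return the first policy in list order that matches, or None."""
    for policy in policies:
        app_ok = policy.get("application") in (None, application)
        sub_ok = policy.get("subsystem") in (None, subsystem)
        if app_ok and sub_ok:
            return policy
    return None

# Hypothetical policies, listed in evaluation order (most specific first).
policies = [
    {"name": "checkout-debug", "application": "checkout", "subsystem": "debug", "priority": "LOW"},
    {"name": "checkout-all", "application": "checkout", "subsystem": None, "priority": "HIGH"},
    {"name": "catch-all", "application": None, "subsystem": None, "priority": "MEDIUM"},
]

print(first_matching_policy(policies, "checkout", "debug")["name"])
# With the catch-all moved to the top, it shadows every policy below it.
print(first_matching_policy(list(reversed(policies)), "checkout", "debug")["name"])
```

In this model, a broad policy placed above a narrow one makes the narrow one unreachable, which is the kind of mistake a reorder is meant to fix.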

Metrics-Based Cost Analysis

The `cx usage` API gives summaries, but for billing-accurate analysis, anomaly detection, and breakdown by pillar/feature, query the customer metrics exporter via PromQL.

Key Metrics

| Metric | Meaning | Query suffix |
|---|---|---|
| `cx_data_usage_units` | Daily billable usage in units (canonical billing metric) | No `_total` |
| `cx_data_plan_units_per_day` | Current daily plan quota in units (snapshot) | No `_total` |
| `cx_data_usage_payg_units` | Daily overage/PAYG usage in units | No `_total` |
| `cx_data_usage_total` | Processed data size in bytes | `_total` |
| `cx_data_usage_tokens_total` | AI evaluation tokens | `_total` |
| `cx_data_usage_samples_total` | Processed metric samples | `_total` |

Concept-to-Metric Mapping

  • Billing / plan usage / consumption -> `cx_data_usage_units` + `cx_data_plan_units_per_day`
  • Processed bytes / data volume -> `cx_data_usage_total`
  • AI evaluation tokens -> `cx_data_usage_tokens_total`
  • Metric samples -> `cx_data_usage_samples_total`
  • Overage / PAYG -> `cx_data_usage_payg_units`

Common PromQL Queries

```bash
# Today's billable units consumed so far
cx metrics query 'sum(cx_data_usage_units)' --time now -o json

# Units breakdown by pillar
cx metrics query 'sum by (pillar) (cx_data_usage_units)' --time now -o json

# Daily plan quota
cx metrics query 'cx_data_plan_units_per_day' --time now -o json

# Plan consumption percentage
cx metrics query '100 * sum(cx_data_usage_units) / cx_data_plan_units_per_day' --time now -o json

# Units by feature group
cx metrics query 'sum by (feature_group_id) (cx_data_usage_units)' --time now -o json

# PAYG overage (if any)
cx metrics query 'cx_data_usage_payg_units' --time now -o json
```

UTC-Day Bucketing Rules

All usage metrics accumulate from UTC midnight and reset at `00:00 UTC`:
  • An instant query during the day returns "today so far"
  • For completed-day totals, use the last sample before midnight
  • Never subtract values across a UTC midnight boundary
  • For weekly/monthly analysis, derive completed daily totals first, then roll up
  • Exclude the current partial UTC day when computing trends or averages
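
The rules above can be sketched in Python as follows. This is a minimal sketch that assumes `(unix_timestamp, value)` pairs have already been extracted from a range query. The function names, the input shape, and the sample values are illustrative assumptions, not part of the cx CLI.

```python
from datetime import datetime, timezone

def completed_daily_totals(samples, now):
    """Map each completed UTC day to the last cumulative value seen that day.

    `samples` is an iterable of (unix_timestamp, value) pairs and `now` is a
    timezone-aware datetime. The current partial UTC day is excluded.
    """
    today = now.astimezone(timezone.utc).date()
    latest = {}
    for ts_value, value in sorted(samples):
        day = datetime.fromtimestamp(ts_value, tz=timezone.utc).date()
        if day < today:           # skip the current partial UTC day
            latest[day] = value   # later samples overwrite earlier ones
    return latest

def ts(year, month, day, hour, minute):
    """Unix timestamp for a UTC wall-clock time (helper for the sample data)."""
    return datetime(year, month, day, hour, minute, tzinfo=timezone.utc).timestamp()

# Illustrative cumulative samples: the counter resets after each UTC midnight.
samples = [
    (ts(2024, 5, 1, 12, 0), 40.0),
    (ts(2024, 5, 1, 23, 55), 95.0),
    (ts(2024, 5, 2, 0, 5), 1.0),
    (ts(2024, 5, 2, 23, 55), 110.0),
    (ts(2024, 5, 3, 9, 0), 30.0),   # partial day, excluded
]
now = datetime(2024, 5, 3, 10, 0, tzinfo=timezone.utc)

totals = completed_daily_totals(samples, now)
for day, value in sorted(totals.items()):
    print(day.isoformat(), value)
print("average of completed days:", sum(totals.values()) / len(totals))
```

Because each day's total is taken from a single sample, no value is ever subtracted across a midnight reset, and weekly or monthly rollups become plain sums over the resulting dictionary.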

Anomaly Detection

When investigating usage anomalies:
  1. Compare completed UTC days (exclude current partial day)
  2. Break down by: `measurement_type` -> `pillar` -> `entity_type` -> `priority` -> `feature_group_id` -> `application_name` -> `subsystem_name`
  3. Prefer same-weekday comparisons for seasonal traffic
  4. Use `cx_data_usage_units` for billing anomalies, `cx_data_usage_total` for volume anomalies
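
A same-weekday comparison can be sketched in Python as shown below. It assumes a dictionary of completed daily totals keyed by UTC date; the function name and the sample values are illustrative assumptions.

```python
from datetime import date, timedelta

def same_weekday_change(daily_totals, day, weeks_back=1):
    """Percent change of `day` versus the same weekday `weeks_back` weeks earlier.

    Returns None when either day is missing or the baseline is zero.
    """
    baseline_day = day - timedelta(weeks=weeks_back)
    current = daily_totals.get(day)
    baseline = daily_totals.get(baseline_day)
    if current is None or not baseline:
        return None
    return 100.0 * (current - baseline) / baseline

# Illustrative completed-day totals (in units).
daily_totals = {
    date(2024, 5, 6): 100.0,   # Monday
    date(2024, 5, 12): 60.0,   # Sunday
    date(2024, 5, 13): 150.0,  # Monday
}

print(same_weekday_change(daily_totals, date(2024, 5, 13)))
```

In this sample, comparing Monday with the preceding Sunday would suggest a 150% jump, while the Monday-to-Monday comparison reports 50%, which separates weekly seasonality from a genuine increase.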

Breakdown Labels

Usage metrics support these grouping dimensions: `pillar`, `entity_type`, `priority`, `measurement_type`, `feature_group_id`, `feature_id`, `application_name`, `subsystem_name`.
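
When generating breakdown queries programmatically, it can help to validate labels against this list before sending anything to `cx metrics query`. The Python sketch below is a hypothetical helper (not part of the cx CLI) that builds a `sum by (...)` expression and rejects labels that are not listed above.

```python
BREAKDOWN_LABELS = {
    "pillar", "entity_type", "priority", "measurement_type",
    "feature_group_id", "feature_id", "application_name", "subsystem_name",
}

def breakdown_query(metric, *labels):
    """Build `sum by (<labels>) (<metric>)`, rejecting unlisted labels."""
    unknown = [label for label in labels if label not in BREAKDOWN_LABELS]
    if unknown:
        raise ValueError(f"unsupported breakdown label(s): {', '.join(unknown)}")
    if not labels:
        return f"sum({metric})"
    return f"sum by ({', '.join(labels)}) ({metric})"

print(breakdown_query("cx_data_usage_units", "pillar"))
print(breakdown_query("cx_data_usage_units", "application_name", "subsystem_name"))
```

The first call reproduces the "units breakdown by pillar" query used elsewhere in this document; the resulting string would be passed as the positional PromQL argument.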

Key Principles

  • Measure before changing - always run usage/summary commands before modifying policies
  • Use `-o json` with jq - structured output enables precise analysis
  • Verify changes - re-query after every modification to confirm it took effect
  • Multi-profile awareness - use `-p <profile>` or `--all-profiles` to compare costs across environments
  • Template from existing - get current config as JSON before creating or updating
  • TCO is the biggest lever - moving logs from Frequent Search to Archive tier has the largest cost impact

Related Skills

  • `cx-telemetry-querying` - investigate what data is being ingested (understand usage before cutting)
  • `cx-query-logs` - query logs to identify high-volume sources
  • `cx-metrics-query` - check metric cardinality and usage