alibabacloud-alinux-sysom-inspection

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

SysOM 巡检(sysom-inspection)

SysOM Inspection (sysom-inspection)

在技能根目录执行
./scripts/osops.sh
当前实现命令:
  • inspection
Run
./scripts/osops.sh
in the skill root directory.
Currently implemented commands:
  • inspection

快速开始

Quick Start

bash
cd <alibabacloud-alinux-sysom-inspection>
./scripts/init.sh
./scripts/osops.sh inspection \
  --region-id cn-hangzhou \
  --instance-id i-xxx
bash
cd <alibabacloud-alinux-sysom-inspection>
./scripts/init.sh
./scripts/osops.sh inspection \
  --region-id cn-hangzhou \
  --instance-id i-xxx

执行逻辑

Execution Logic

  • 每次执行巡检前先调用 ROA 接口
    POST /api/v1/openapi/initial_sysom
    source=skill_hub
    ),用于判断用户是否具备权限且 SysOM 已开通。
  • 若未开通或角色未就绪,命令会交互式询问是否继续“开通+安装 SysOM”。
  • 用户同意后先调用
    InitialSysom(check_only=false, source=skill_hub)
    执行开通,再调用
    InstallAgentWithType
    安装。
  • 安装后会再次调用
    InitialSysom(check_only=true, source=skill_hub)
    复检,复检通过才继续巡检与诊断。
  • 不再本地配置阈值/事件规则,异常判断由服务端巡检报告决定。
  • 固定调用 ROA 巡检接口:
    POST /api/v1/inspection/createInstanceInspection
    ,并固定传
    source=skill_hub
  • 若需要巡检全部项目,可传
    items=[]
    (CLI 中为显式传空
    --inspection-items
    )。
  • 若标准巡检 API 返回
    InvalidAction.NotFound
    ,CLI 会标记“当前版本不可用”并停止后续流程,避免无效重试。
  • 报告查询调用 ROA 接口:
    GET /api/v1/inspection/getInspectionReport
  • 当创建接口不可用时,CLI 会补发一次
    GetInspectionReport
    探测调用并记录结果,确保日志中可观测到该动作。
  • 巡检报告中若命中
    sysom:metric:memory_usage_rate
    异常,自动调用
    InvokeDiagnosis
    发起
    memgraph
    诊断。
  • InvokeDiagnosis
    params
    会注入
    __sysom_diagnosis_source=skill_hub
    ,并校验业务
    code=Success
  • 发起诊断后自动轮询
    GetDiagnosisResult
    ,直到
    success
    /
    fail
    / 超时。
  • 可通过
    --disable-memgraph-diagnosis
    关闭自动诊断。
  • Before each inspection, call the ROA interface
    POST /api/v1/openapi/initial_sysom
    (with
    source=skill_hub
    ) to determine whether the user has permission and SysOM has been activated.
  • If it is not activated or the role is not ready, the command will interactively ask whether to proceed with "activation + installation of SysOM".
  • After the user agrees, first call
    InitialSysom(check_only=false, source=skill_hub)
    to perform activation, then call
    InstallAgentWithType
    for installation.
  • After installation, call
    InitialSysom(check_only=true, source=skill_hub)
    again for re-inspection, and only proceed with inspection and diagnosis if the re-inspection passes.
  • No local threshold/event rules are configured; anomaly judgment is determined by the server-side inspection report.
  • Fixed call to the ROA inspection interface:
    POST /api/v1/inspection/createInstanceInspection
    , with fixed parameter
    source=skill_hub
    .
  • To inspect all items, pass
    items=[]
    (explicitly pass empty
    --inspection-items
    in CLI).
  • If the standard inspection API returns
    InvalidAction.NotFound
    , the CLI will mark "current version unavailable" and stop subsequent processes to avoid invalid retries.
  • Report query calls the ROA interface:
    GET /api/v1/inspection/getInspectionReport
    .
  • When the creation interface is unavailable, the CLI will send an additional
    GetInspectionReport
    probe call and record the result to ensure the action is observable in the logs.
  • If
    sysom:metric:memory_usage_rate
    anomaly is detected in the inspection report, automatically call
    InvokeDiagnosis
    to initiate
    memgraph
    diagnosis.
  • The
    params
    of
    InvokeDiagnosis
    will inject
    __sysom_diagnosis_source=skill_hub
    , and verify the business
    code=Success
    .
  • After initiating diagnosis, automatically poll
    GetDiagnosisResult
    until
    success
    /
    fail
    / timeout.
  • Automatic diagnosis can be disabled via
    --disable-memgraph-diagnosis
    .

可扩展性约定

Extensibility Conventions

  • 巡检项可通过
    --inspection-items
    传入覆盖默认列表。
  • 若 InitialSysom 返回未开通,CLI 会在终端进行交互式确认后再执行开通尝试+重检。
  • 内存异常触发诊断的判定逻辑位于
    scripts/sysom_cli/inspection/command.py
  • 如需新增“巡检命中后触发的专项诊断”,可复用
    InvokeDiagnosis
    调用方式扩展。
  • Inspection items can be passed via
    --inspection-items
    to override the default list.
  • If InitialSysom returns not activated, the CLI will perform interactive confirmation in the terminal before attempting activation + re-inspection.
  • The judgment logic for triggering diagnosis due to memory anomalies is located in
    scripts/sysom_cli/inspection/command.py
    .
  • To add "special diagnosis triggered after inspection hits anomalies", you can reuse the
    InvokeDiagnosis
    call method for expansion.