alibabacloud-alinux-sysom-inspection
SysOM Inspection (sysom-inspection)
Run `./scripts/osops.sh` in the skill root directory.
Currently implemented commands:
- `inspection`
Quick Start
```bash
cd <alibabacloud-alinux-sysom-inspection>
./scripts/init.sh
./scripts/osops.sh inspection \
  --region-id cn-hangzhou \
  --instance-id i-xxx
```
Execution Logic
- Before each inspection, the CLI calls the ROA interface `POST /api/v1/openapi/initial_sysom` (with `source=skill_hub`) to check that the user has permission and that SysOM has been activated.
- If SysOM is not activated or the role is not ready, the command interactively asks whether to continue with "activate + install SysOM".
- After the user agrees, the CLI first calls `InitialSysom(check_only=false, source=skill_hub)` to perform activation, then calls `InstallAgentWithType` to install.
- After installation, the CLI calls `InitialSysom(check_only=true, source=skill_hub)` again as a re-check, and continues with inspection and diagnosis only if the re-check passes.
- Thresholds and event rules are no longer configured locally; whether something is abnormal is decided by the server-side inspection report.
- The CLI always calls the ROA inspection interface `POST /api/v1/inspection/createInstanceInspection` and always passes `source=skill_hub`.
- To inspect all items, pass `items=[]` (in the CLI, explicitly pass an empty `--inspection-items`).
- If the standard inspection API returns `InvalidAction.NotFound`, the CLI marks the feature as "unavailable in the current version" and stops the remaining steps to avoid pointless retries.
- Report queries call the ROA interface `GET /api/v1/inspection/getInspectionReport`.
- When the create interface is unavailable, the CLI sends one additional `GetInspectionReport` probe call and records the result, so that the action is observable in the logs.
- If the inspection report flags a `sysom:metric:memory_usage_rate` anomaly, the CLI automatically calls `InvokeDiagnosis` to start a `memgraph` diagnosis.
- The `params` of `InvokeDiagnosis` are injected with `__sysom_diagnosis_source=skill_hub`, and the CLI verifies that the business-level `code=Success`.
- After starting the diagnosis, the CLI automatically polls `GetDiagnosisResult` until `success` / `fail` / timeout.
- Automatic diagnosis can be disabled with `--disable-memgraph-diagnosis`.
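The last few diagnosis steps above (inject the source marker into `params`, check the business-level code, then poll to a terminal state) can be sketched as follows. This is a minimal illustration, not the code in `scripts/sysom_cli/inspection/command.py`: the function names, the assumption that the response is a dict with a `code` key, and the default timeout and interval are all hypothetical. Only the marker `__sysom_diagnosis_source=skill_hub`, the value `Success`, and the `success` / `fail` / timeout outcomes come from the description above.

```python
import time
from typing import Any, Callable, Dict

# Terminal states named in the README; "timeout" is produced locally by the poller.
TERMINAL_STATES = {"success", "fail"}


def build_diagnosis_params(user_params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the params with the source marker injected."""
    params = dict(user_params)  # copy so the caller's dict is not mutated
    params["__sysom_diagnosis_source"] = "skill_hub"
    return params


def ensure_business_success(response: Dict[str, Any]) -> Dict[str, Any]:
    """Raise unless the (assumed) business-level 'code' field equals 'Success'."""
    if response.get("code") != "Success":
        raise RuntimeError(f"diagnosis request rejected: {response!r}")
    return response


def poll_until_done(
    fetch_status: Callable[[], str],
    timeout_s: float = 300.0,   # hypothetical default
    interval_s: float = 5.0,    # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status() until it returns a terminal state or time runs out."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            return "timeout"
        sleep(interval_s)
```

Here `fetch_status` stands in for whatever wraps `GetDiagnosisResult`. Passing `sleep` and `clock` as parameters is a design choice of this sketch that lets the loop be exercised without real waiting.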
Extensibility Conventions
- Inspection items can be passed via `--inspection-items` to override the default list.
- If `InitialSysom` reports that SysOM is not activated, the CLI asks for interactive confirmation in the terminal before attempting activation and the re-check.
- The logic that decides whether a memory anomaly triggers diagnosis lives in `scripts/sysom_cli/inspection/command.py`.
- To add another "specialized diagnosis triggered by an inspection hit", reuse the `InvokeDiagnosis` call pattern.
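One way to picture the last convention is a small lookup table from abnormal metric keys to diagnosis types, which a new specialized diagnosis would extend with one entry. This is only a sketch under assumptions: `DIAGNOSIS_BY_METRIC` and `diagnoses_for` are hypothetical names, and the actual structure in `command.py` may differ. The single mapping shown (`sysom:metric:memory_usage_rate` to `memgraph`) is the one described in this README.

```python
from typing import Dict, Iterable, List

# Metric key -> diagnosis type. Only this entry comes from the README;
# further entries would be your own additions.
DIAGNOSIS_BY_METRIC: Dict[str, str] = {
    "sysom:metric:memory_usage_rate": "memgraph",
}


def diagnoses_for(
    abnormal_metrics: Iterable[str],
    disabled: Iterable[str] = (),
) -> List[str]:
    """Return the diagnosis types to trigger, in order and without duplicates."""
    skip = set(disabled)
    selected: List[str] = []
    for metric in abnormal_metrics:
        diagnosis = DIAGNOSIS_BY_METRIC.get(metric)
        if diagnosis and diagnosis not in skip and diagnosis not in selected:
            selected.append(diagnosis)
    return selected
```

Each selected type would then go through the same `InvokeDiagnosis` call path. The `disabled` argument mirrors what a flag such as `--disable-memgraph-diagnosis` could feed into.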