Help me troubleshoot service issues based on Prometheus metrics
npx skill4agent add xiaomi/mone prometheus-skill

uv run prometheus.py 'sum(rate(container_cpu_user_seconds_total{image!="",application="xxx_gis"}[30s])) by(application) * 100'

Example result: "Latest value queried: 32.3487092130255". This means CPU usage is about 32%. The ceiling is not 100% but 100% per requested core; for example, if 3 cores are requested, the maximum is 300%.
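The example result above is a line of the form "Latest value queried: <number>". A minimal sketch, assuming the script always prints that exact format, for pulling the number out and computing the percentage ceiling for a given core count. Both `parse_latest_value` and `max_cpu_percent` are hypothetical helpers for illustration, not part of prometheus.py.

```python
import re


def parse_latest_value(output: str) -> float:
    """Extract the number from a line like 'Latest value queried: 32.34'.

    Assumes the output format shown in the example result above.
    """
    match = re.search(
        r"Latest value queried:\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)", output
    )
    if match is None:
        raise ValueError(f"no value found in: {output!r}")
    return float(match.group(1))


def max_cpu_percent(requested_cores: float) -> float:
    """Each requested core contributes 100% to the ceiling."""
    return requested_cores * 100.0


if __name__ == "__main__":
    usage = parse_latest_value("Latest value queried: 32.3487092130255")
    print(f"usage: {usage:.1f}%")
    print(f"ceiling for 3 cores: {max_cpu_percent(3):.0f}%")
```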
uv run prometheus.py 'sum((container_spec_cpu_quota{application="xxx_gis", image!=""}) /1000) by (application)'

Example result: "Latest value queried: 6181". This means the service has requested a total of 6181% CPU, that is, about 61.8 cores at 100% per core.
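Since both the usage and the requested amount are expressed as percentages of one core, dividing one by the other gives the fraction of the requested CPU actually in use. A small sketch using the two example values; `cpu_utilization` is a hypothetical helper name, not something the skill provides.

```python
def cpu_utilization(usage_percent: float, requested_percent: float) -> float:
    """Return used CPU as a fraction (0.0 to 1.0) of the requested CPU.

    Both arguments use the same unit: percent of a single core.
    """
    if requested_percent <= 0:
        raise ValueError("requested_percent must be positive")
    return usage_percent / requested_percent


if __name__ == "__main__":
    usage = 32.3487092130255   # example result of the CPU usage query
    requested = 6181.0         # example result of the CPU quota query
    print(f"requested cores: {requested / 100:.2f}")
    print(f"utilization: {cpu_utilization(usage, requested):.2%}")
```

With the example values, usage is well under 1% of what was requested.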
uv run prometheus.py 'sum(container_memory_rss{image!="",application="xxx_gis"}) by (application)'

Example result: "Latest value queried: 21611335680.0" (unit: bytes, about 20.1 GiB)
uv run prometheus.py 'sum(container_spec_memory_limit_bytes{image!="",application="xxx_gis"}) by (application)'

Example result: "Latest value queried: 75161927680.0" (unit: bytes, exactly 70 GiB)
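The RSS and memory-limit results are easier to read as a ratio. A minimal sketch, using the two example values above; `memory_utilization` is a hypothetical helper, and no alert threshold is implied because the source does not state one.

```python
def memory_utilization(rss_bytes: float, limit_bytes: float) -> float:
    """Return resident memory as a fraction of the container memory limit."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return rss_bytes / limit_bytes


if __name__ == "__main__":
    rss = 21611335680.0     # example result of the container_memory_rss query
    limit = 75161927680.0   # example result of the memory limit query
    print(f"memory utilization: {memory_utilization(rss, limit):.1%}")
```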
uv run prometheus.py 'sum(container_cpu_load_average_10s{application="xxx_gis"}) by (application)'

Example result: "Latest value queried: 1242.0". Compare this value with the CPU request amount: in theory, the load is optimal when it stays below 80% of the request amount. For example, if the CPU request amount is 2000, the 80% mark is 1600, so 1242 is not high.
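The 80% rule of thumb can be written as a one-line check. This is a sketch only: `load_is_acceptable` is a hypothetical name, the default 0.8 threshold comes from the text above, and the request amount of 2000 is the illustrative figure from the example, not a queried value.

```python
def load_is_acceptable(
    load: float, cpu_request: float, threshold: float = 0.8
) -> bool:
    """True when the load is below `threshold` times the CPU request amount."""
    if cpu_request <= 0:
        raise ValueError("cpu_request must be positive")
    return load < threshold * cpu_request


if __name__ == "__main__":
    load = 1242.0          # example result of the load average query
    cpu_request = 2000.0   # illustrative request amount from the text
    print(f"load / request = {load / cpu_request:.1%}")
    print("acceptable" if load_is_acceptable(load, cpu_request) else "high")
```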
uv run prometheus.py 'sum(jvm_memory_used_bytes{application="xxx_gis"}) by (application)'

Example result: "Latest value queried: 15085025864.0" (unit: bytes, about 14.0 GiB)
uv run prometheus.py 'sum(jvm_memory_max_bytes{application="xxx_gis"}) by (application)'

Example result: "Latest value queried: 100503912448.0" (unit: bytes, about 93.6 GiB)
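Raw byte counts of this size are hard to compare by eye. A small sketch that converts the two JVM results to GiB and reports used against max; `to_gib` and `jvm_summary` are hypothetical helpers, and the choice of GiB (2^30 bytes) as the display unit is an assumption made here.

```python
GIB = 2 ** 30  # bytes per gibibyte


def to_gib(num_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


def jvm_summary(used_bytes: float, max_bytes: float) -> str:
    """Format JVM memory used vs. max in GiB, with the used share."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    share = used_bytes / max_bytes
    return (
        f"JVM memory: {to_gib(used_bytes):.2f} GiB used of "
        f"{to_gib(max_bytes):.2f} GiB max ({share:.1%})"
    )


if __name__ == "__main__":
    used = 15085025864.0        # example result of jvm_memory_used_bytes
    maximum = 100503912448.0    # example result of jvm_memory_max_bytes
    print(jvm_summary(used, maximum))
```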