Checks GPU usage on remote servers: connects via SSH and displays video memory usage, running processes, and the associated container for each GPU card. Use this skill when the user asks to check GPU, graphics card, or video memory usage.
Install:

```bash
npx skill4agent add majiayu000/claude-arsenal gpu-use
```

| Alias | SSH Command |
|---|---|
| Default | `user@host -p port` |

Query per-GPU memory and utilization:

```bash
ssh {SSH_TARGET} "nvidia-smi --query-gpu=index,name,memory.total,memory.used,memory.free,utilization.gpu --format=csv,noheader,nounits"
```

Query the compute processes running on each GPU:

```bash
ssh {SSH_TARGET} "nvidia-smi --query-compute-apps=pid,gpu_uuid,used_memory,name --format=csv,noheader,nounits"
```

Map GPU indices to UUIDs (the compute-apps query reports UUIDs only):

```bash
ssh {SSH_TARGET} "nvidia-smi --query-gpu=index,gpu_uuid --format=csv,noheader"
```

List running containers:

```bash
ssh {SSH_TARGET} "docker ps --format '{{.ID}} {{.Names}}' 2>/dev/null"
```

Map process PIDs to container names:

```bash
ssh {SSH_TARGET} "for cid in \$(docker ps -q); do name=\$(docker inspect --format '{{.Name}}' \$cid | sed 's/^\///'); pids=\$(docker top \$cid -o pid 2>/dev/null | tail -n +2); for p in \$pids; do echo \"\$p \$name\"; done; done 2>/dev/null"
```

Detect multiple `http_server` instances inside a single container:

```bash
ssh {SSH_TARGET} "for cid in \$(docker ps -q); do name=\$(docker inspect --format '{{.Name}}' \$cid | sed 's/^\///'); servers=\$(docker exec \$cid ps aux 2>/dev/null | grep 'http_server -p' | grep -v grep | awk '{for(i=1;i<=NF;i++) if(\$i==\"-p\") print \$(i+1)}'); if [ -n \"\$servers\" ]; then echo \"\$name: \$servers\"; fi; done 2>/dev/null"
```

## GPU Usage Overview
| GPU | Model | Video Memory Usage | Free | GPU Utilization | Status |
|-----|------|----------|------|------------|------|
| 0 | H200 | 107 / 141 GB | 34 GB | 85% | 🔴 Busy |
| 1 | H200 | 12 / 141 GB | 129 GB | 10% | 🟢 Idle |
| 2 | H200 | 0 / 141 GB | 141 GB | 0% | ⚪ No Task |
## Process Details
| GPU | Video Memory Usage | Container | Process |
|-----|----------|------|------|
| 0 | 107 GB | vllm_qwen35 | VLLM::EngineCore |
| 0 | 2 GB | truetranslate-api-bin | truetranslate_api.bin |
| 1 | 12 GB | atlas_video | python |
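Building the process table above means joining the `--query-compute-apps` output (keyed by GPU UUID) against the index-to-UUID map. A minimal sketch of that join, using hypothetical sample CSV in place of the live `ssh`/`nvidia-smi` output:

```bash
#!/bin/sh
# join_apps: attach a GPU index to each compute-process row.
# $1 = "index, uuid" CSV (from --query-gpu); $2 = "pid, uuid, mem, name" CSV
# (from --query-compute-apps). The sample values below are hypothetical.
join_apps() {
  map=$1
  echo "$2" | while IFS=', ' read -r pid uuid mem name; do
    # Look up the index whose UUID matches this process's GPU.
    idx=$(echo "$map" | awk -F', ' -v u="$uuid" '$2 == u { print $1 }')
    echo "GPU $idx | ${mem} MiB | PID $pid | $name"
  done
}

uuid_map="0, GPU-aaaa
1, GPU-bbbb"
apps="1234, GPU-aaaa, 2048, python
5678, GPU-bbbb, 12288, http_server"

join_apps "$uuid_map" "$apps"
```

The same rows would then be cross-referenced against the PID-to-container map to fill the Container column.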
## Multi-instance Services (Single-container Multi-terminal Deployment)
If multiple `http_server` instances are detected running in a container, list them separately:
| Container | Port | GPU | Status |
|------|------|-----|------|
| atlas_video | :5001 | GPU 2 | Running |
| atlas_video | :5002 | GPU 3 | Running |
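The Port column comes from the `awk` snippet in the detection command, which prints the token following `-p` in each server's command line. On a hypothetical `ps aux` line it behaves like:

```bash
#!/bin/sh
# Extract the port passed via "-p" from a ps command line.
# The sample line below is hypothetical.
line="root 4242 0.5 1.2 123456 7890 ? Sl 10:00 1:23 ./http_server -p 5001 --gpu 2"
port=$(echo "$line" | awk '{for(i=1;i<=NF;i++) if($i=="-p") print $(i+1)}')
echo "$port"
```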
## Idle Resources
GPUs available for new service deployment:
- GPU 4: 141 GB fully idle
- GPU 5: 141 GB fully idle

## Status Criteria
| Video Memory Usage Ratio | GPU Utilization | Status |
|---|---|---|
| 0% | 0% | ⚪ No Task |
| < 30% | < 30% | 🟢 Idle |
| 30-80% | any | 🟡 Moderate |
| > 80% | any | 🔴 Busy |
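The criteria above leave some combinations unstated (e.g. low memory but high utilization). A sketch of one consistent reading, which treats either metric above 80% as Busy (matching GPU 0 in the overview at 76% memory / 85% utilization) and requires both metrics below 30% for Idle:

```bash
#!/bin/sh
# classify: map memory-usage ratio (%) and GPU utilization (%) to a status label.
# The tie-breaking rules for unstated combinations are an assumption.
classify() {
  mem_pct=$1; util=$2
  if [ "$mem_pct" -eq 0 ] && [ "$util" -eq 0 ]; then echo "No Task"
  elif [ "$mem_pct" -gt 80 ] || [ "$util" -gt 80 ]; then echo "Busy"
  elif [ "$mem_pct" -lt 30 ] && [ "$util" -lt 30 ]; then echo "Idle"
  else echo "Moderate"
  fi
}

classify 76 85   # GPU 0 in the overview
classify 8 10    # GPU 1
classify 0 0     # GPU 2
```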
## Checking a Process's GPU Binding
To confirm which GPU a containerized process (e.g. an `http_server -p` instance) is bound to, read its `CUDA_VISIBLE_DEVICES` environment variable:

```bash
ssh {SSH_TARGET} "docker exec {CONTAINER} cat /proc/{PID}/environ 2>/dev/null | tr '\0' '\n' | grep CUDA_VISIBLE_DEVICES"
```
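`CUDA_VISIBLE_DEVICES` holds a comma-separated list of GPU indices; splitting it into the individual GPUs a process can see (sample value is hypothetical) can be sketched as:

```bash
#!/bin/sh
# Split a CUDA_VISIBLE_DEVICES assignment into individual GPU indices.
# The sample value stands in for a line grepped from /proc/<pid>/environ.
env_line="CUDA_VISIBLE_DEVICES=2,3"
for g in $(echo "$env_line" | cut -d= -f2 | tr ',' ' '); do
  echo "GPU $g"
done
```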