Remote command execution and file transfer on SageMaker HyperPod cluster nodes via AWS Systems Manager (SSM). This is the primary interface for accessing HyperPod nodes — direct SSH is not available. Use when any skill, workflow, or user request needs to execute commands on cluster nodes, upload files to nodes, read/download files from nodes, run diagnostics, install packages, or perform any operation requiring shell access to HyperPod instances. Other HyperPod skills depend on this skill for all node-level operations.
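npx skill4agent add awslabs/agent-plugins hyperpod-ssm

HyperPod nodes are addressed with a dedicated SSM target format:

sagemaker-cluster:<CLUSTER_ID>_<GROUP_NAME>-<INSTANCE_ID>

- `CLUSTER_ID`: from `get-cluster-info.sh`
- `GROUP_NAME`: from `list-nodes.sh`
- `INSTANCE_ID`: for example `i-0123456789abcdef0`

The helper scripts live in `scripts/`:

scripts/get-cluster-info.sh CLUSTER_NAME [--region REGION]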
# Output: {"cluster_id":"...","cluster_arn":"...","cluster_name":"...","region":"..."}

scripts/list-nodes.sh CLUSTER_NAME [--region REGION] [--instance-group GROUP] [--instance-id ID]
# Output: JSON array of ClusterNodeSummaries (InstanceId, InstanceGroupName, InstanceStatus, etc.)

`ClusterNodeSummaries` is the node list returned by `list-cluster-nodes`.

# Execute with pre-built target
scripts/ssm-exec.sh --target "sagemaker-cluster:CLUSTERID_GROUP-INSTANCEID" 'command' [--region REGION]
# Execute with parts
scripts/ssm-exec.sh --cluster-id ID --group GROUP --instance-id INSTANCE_ID 'command' [--region REGION]
# Upload
scripts/ssm-exec.sh --target TARGET --upload LOCAL_PATH REMOTE_PATH [--region REGION]
# Read remote file
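scripts/ssm-exec.sh --target TARGET --read REMOTE_PATH [--region REGION]

All commands go through `start-session`: `aws ssm send-command` does not work with `sagemaker-cluster:` targets, which only `start-session` accepts. To run a command without the scripts, call `aws ssm start-session` with the `AWS-StartNonInteractiveCommand` document:

cat > /tmp/cmd.json << 'EOF'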
{"command": ["bash -c 'echo hello && whoami'"]}
EOF
aws ssm start-session \
  --target sagemaker-cluster:{CLUSTER_ID}_{GROUP_NAME}-{INSTANCE_ID} \
  --region REGION \
  --document-name AWS-StartNonInteractiveCommand \
  --parameters file:///tmp/cmd.json

Passing `--parameters` as a JSON file, as shown here, avoids shell-quoting problems with the command string.

| Task | Command |
|---|---|
| Lifecycle logs | |
| Memory | |
| Disk/mounts | |
| GPU status | |
| GPU memory | |
| EFA/network | |
| CloudWatch agent | |
| Top processes | |
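
A minimal sketch of how diagnostics like these might be dispatched through `ssm-exec.sh`. The task keys and the commands (`free -h`, `df -h`, `nvidia-smi`, `ps`) are illustrative assumptions: common Linux and NVIDIA tools, not the skill's own table entries. `diag_command` and `print_diag` are hypothetical helpers, and the sketch only prints the invocation so it can be reviewed before anything runs on a node.

```shell
#!/usr/bin/env bash

# Map a diagnostic task to an illustrative command (assumed, not from the skill).
diag_command() {
  case "$1" in
    memory) echo 'free -h' ;;
    disk)   echo 'df -h' ;;
    gpu)    echo 'nvidia-smi' ;;
    top)    echo 'ps aux --sort=-%cpu | head -n 15' ;;
    *)      echo "unknown task: $1" >&2; return 1 ;;
  esac
}

# Print the ssm-exec.sh call instead of running it.
print_diag() {
  local target="$1" task="$2" cmd
  cmd="$(diag_command "$task")" || return 1
  printf "scripts/ssm-exec.sh --target \"%s\" '%s'\n" "$target" "$cmd"
}

print_diag "sagemaker-cluster:CLUSTERID_GROUP-INSTANCEID" gpu
```

Printing rather than executing keeps the sketch side-effect free; piping the printed line to `bash` (or replacing `printf` with a direct call) would be the step that actually touches the cluster.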
Commands run as `root`. For non-interactive execution, always pass `--document-name AWS-StartNonInteractiveCommand`.
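
The target format and the non-interactive invocation can be combined in a small helper. The following is a sketch only: `build_target` and `print_start_session` are hypothetical names, the cluster ID, group name, and region are made-up placeholders, and the script prints the `aws ssm start-session` command rather than running it.

```shell
#!/usr/bin/env bash

# Compose the HyperPod SSM target from its three parts.
build_target() {
  local cluster_id="$1" group="$2" instance_id="$3"
  printf 'sagemaker-cluster:%s_%s-%s\n' "$cluster_id" "$group" "$instance_id"
}

# Print (do not run) the start-session call for a given parameters file.
print_start_session() {
  local target="$1" region="$2" params_file="$3"
  printf 'aws ssm start-session --target %s --region %s --document-name AWS-StartNonInteractiveCommand --parameters file://%s\n' \
    "$target" "$region" "$params_file"
}

# Placeholder values for illustration only.
target="$(build_target abc123xyz workers i-0123456789abcdef0)"
print_start_session "$target" us-west-2 /tmp/cmd.json
```

Keeping target construction in one function avoids hand-assembling the `_` and `-` separators at every call site, which is where mistakes in this format tend to appear.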