# vast-gpu

Rent, manage, and destroy GPU instances on vast.ai. Use this skill when the user says "rent gpu", "vast.ai", "rent a server", or "cloud gpu", or needs an on-demand GPU without owning hardware.
## Setup

```bash
# Install the skill
npx skill4agent add wanshuiyin/auto-claude-code-research-in-sleep vast-gpu

# Install the vast.ai CLI and set your API key
pip install vastai
vastai set api-key YOUR_API_KEY
```

If your system Python is < 3.10, create a virtual environment with Python ≥ 3.10 (e.g. `uv venv`, `conda create`, `pyenv`) and install `vastai` there.
## State file

The skill tracks every rented instance in `vast-instances.json`:

```json
[
  {
    "instance_id": 33799165,
    "offer_id": 25831376,
    "gpu_name": "RTX_3060",
    "num_gpus": 1,
    "dph": 0.0414,
    "ssh_url": "ssh://root@1.208.108.242:58955",
    "ssh_host": "1.208.108.242",
    "ssh_port": 58955,
    "created_at": "2026-03-29T21:12:00Z",
    "status": "running",
    "experiment": "exp01_baseline",
    "estimated_hours": 4.0,
    "estimated_cost": 0.17
  }
]
```

This file is how `/run-experiment` and `/monitor-experiment` locate the active instance. When `/run-experiment` sees `gpu: vast` in `refine-logs/EXPERIMENT_PLAN.md`, it calls this skill to provision a machine first.

## Provisioning

Estimate requirements from the experiment plan and the code: check the model's `num_parameters`, and check whether the scripts use `DataParallel`, `DistributedDataParallel`, `accelerate`, or `deepspeed` (which implies multi-GPU).

| Factor | How to estimate |
|---|---|
| Min VRAM | Model params × 4 bytes (fp32) or × 2 (fp16/bf16) + optimizer states + activations. Rules of thumb: 7B model ≈ 16 GB (fp16), 13B ≈ 28 GB, 70B ≈ 140 GB (needs multi-GPU). ResNet/ViT ≈ 4-8 GB. Add 20% headroom. |
| Num GPUs | 1 unless: model doesn't fit in single GPU VRAM, or scripts use DDP/FSDP/DeepSpeed, or plan specifies multi-GPU |
| Est. hours | From experiment plan's cost column, or: (dataset_size × epochs) / (throughput × batch_size). Default to user estimate if available. Add 30% buffer for setup + unexpected slowdowns |
| Min disk | 20 GB base + model checkpoint size + dataset size. Default: 50 GB |
| CUDA version | Match PyTorch version. PyTorch 2.x needs CUDA ≥ 11.8. Default: 12.1 |
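As a worked example of the first row, here is a minimal sketch of the VRAM rule of thumb (the byte counts and 20% headroom come from the table above; optimizer states and activations add more for full training):

```python
def estimate_vram_gb(num_params: float, dtype_bytes: int = 2, headroom: float = 0.20) -> float:
    """Rough minimum VRAM: weight memory plus headroom (fp16/bf16 = 2 bytes, fp32 = 4)."""
    weights_gb = num_params * dtype_bytes / 1e9
    return weights_gb * (1 + headroom)

# 7B model in fp16: 7e9 params x 2 bytes = 14 GB, +20% headroom ~= 16.8 GB
print(f"{estimate_vram_gb(7e9):.1f} GB")
```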
Search offers in two tiers:

```bash
# Tier 1: Budget GPUs (good for small models, fine-tuning, ablations)
vastai search offers "gpu_ram>=<MIN_VRAM> num_gpus>=<N> reliability>0.95 inet_down>100" -o 'dph+' --storage <DISK> --limit 10
# Tier 2: If VRAM > 24 GB, also search high-VRAM cards specifically
vastai search offers "gpu_ram>=48 num_gpus>=<N> reliability>0.95" -o 'dph+' --storage <DISK> --limit 5IDCUDANModelPCIEcpu_ghzvCPUsRAMDisk$/hrDLPscoreNV DriverNet_upNet_downRMax_Daysmach_idstatushost_idportscountryIDvastai create instanceTask analysis:
- Model: [model name/size] → estimated VRAM: ~[X] GB
- Training: ~[Y] hours estimated
- Requirements: [N] GPU(s), ≥[X] GB VRAM, ~[Z] GB disk
Recommended options (sorted by estimated total cost):
| # | GPU | VRAM | $/hr | Est. Hours | Est. Total | Reliability | Offer ID |
|---|-------------|-------|--------|------------|------------|-------------|-----------|
| 1 | RTX 3060 | 12 GB | $0.04 | ~6h | ~$0.25 | 99.4% | 25831376 | ← cheapest
| 2 | RTX 4090 | 24 GB | $0.28 | ~4h | ~$1.12 | 99.2% | 6995713 | ← best value
| 3 | A100 SXM | 80 GB | $0.95 | ~2h | ~$1.90 | 99.5% | 7023456 | ← fastest
Option 1 is cheapest overall. Option 3 finishes fastest.
Pick a number (or type a different offer ID):
```

To compare estimated hours across GPUs, scale by relative FP16 throughput:

| GPU | Relative Speed (FP16) |
|---|---|
| RTX 3060 | 0.5× |
| RTX 3090 | 1.0× |
| RTX 4090 | 1.6× |
| A5000 | 0.9× |
| A6000 | 1.1× |
| L40S | 1.5× |
| A100 SXM | 2.0× |
| H100 SXM | 3.3× |
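For example, a sketch that rescales an estimated runtime across the options above (the speed factors are from the table; the baseline GPU is whichever one the original estimate assumed):

```python
# Relative FP16 throughput, from the table above (RTX 3090 = 1.0x).
SPEED = {"RTX 3060": 0.5, "RTX 3090": 1.0, "RTX 4090": 1.6, "A5000": 0.9,
         "A6000": 1.1, "L40S": 1.5, "A100 SXM": 2.0, "H100 SXM": 3.3}

def rescale_hours(hours: float, baseline: str, target: str) -> float:
    """Estimated hours on `target` given an estimate made for `baseline`."""
    return hours * SPEED[baseline] / SPEED[target]

# 6h estimated on an RTX 3060 -> ~1.9h on an RTX 4090
print(f"{rescale_hours(6.0, 'RTX 3060', 'RTX 4090'):.1f}h")
```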
## Rent

Rent the chosen offer:

```bash
vastai create instance <OFFER_ID> \
--image <DOCKER_IMAGE> \
--disk <DISK_GB> \
--ssh \
--direct \
--onstart-cmd "apt-get update && apt-get install -y git screen rsync"pytorch/pytorch:2.1.0-cuda12.1-cudnn8-develCLAUDE.mdimage:Started. {'success': True, 'new_contract': 33799165, 'instance_api_key': '...'}new_contractvastai show instances --raw | python3 -c "
import sys, json
instances = json.load(sys.stdin)
for inst in instances:
if inst['id'] == <INSTANCE_ID>:
print(inst['actual_status'])
"loadingrunningloadingvastai ssh-url <INSTANCE_ID>ssh://root@<HOST>:<PORT>ssh://root@1.208.108.242:589551.208.108.24258955Important: Always useto get connection details — do NOT rely onvastai ssh-url/ssh_hostfromssh_port, as those may point to proxy servers that differ from the direct connection endpoint.vastai show instances
Test the connection:

```bash
ssh -o StrictHostKeyChecking=no -o ConnectTimeout=15 -p <PORT> root@<HOST> "nvidia-smi && echo 'CONNECTION_OK'"
```

Record the instance (including its `ssh_url`) in `vast-instances.json`, then report:

```
Vast.ai instance ready:
- Instance ID: <ID>
- GPU: <GPU_NAME> x <NUM_GPUS>
- Cost: $<DPH>/hr (estimated total: ~$<TOTAL>)
- SSH: ssh -p <PORT> root@<HOST>
- Docker: <IMAGE>
To deploy: /run-experiment (will auto-detect this instance)
To destroy when done: /vast-gpu destroy <ID>/run-experimentssh -p <PORT> root@<HOST> "pip install -q wandb tensorboard scipy scikit-learn pandas"requirements.txtscp -P <PORT> requirements.txt root@<HOST>:/workspace/
ssh -p <PORT> root@<HOST> "pip install -q -r /workspace/requirements.txt"Note:uses uppercasescpfor port, while-Puses lowercasessh.-p
rsync -avz -e "ssh -p <PORT>" \
--include='*.py' --include='*.yaml' --include='*.yml' --include='*.json' \
--include='*.txt' --include='*.sh' --include='*/' \
--exclude='*.pt' --exclude='*.pth' --exclude='*.ckpt' \
--exclude='__pycache__' --exclude='.git' --exclude='data/' \
--exclude='wandb/' --exclude='outputs/' \
  ./ root@<HOST>:/workspace/project/
```

Verify the environment sees the GPU:

```bash
ssh -p <PORT> root@<HOST> "cd /workspace/project && python -c 'import torch; print(f\"PyTorch {torch.__version__}, CUDA: {torch.cuda.is_available()}, GPUs: {torch.cuda.device_count()}\")'"
```

Expected output: `PyTorch 2.1.0, CUDA: True, GPUs: 1`

## Destroy

Before tearing down, check for and download results:

```bash
ssh -p <PORT> root@<HOST> "ls /workspace/project/results/ 2>/dev/null || echo 'NO_RESULTS_DIR'"
rsync -avz -e "ssh -p <PORT>" root@<HOST>:/workspace/project/results/ ./results/
scp -P <PORT> root@<HOST>:/workspace/*.log ./logs/ 2>/dev/null
```

Then destroy the instance to stop billing:

```bash
vastai destroy instance <INSTANCE_ID>
```

The CLI confirms with `destroying instance <INSTANCE_ID>.`. Destruction is irreversible — all data on the instance is permanently deleted.
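The duration and actual cost in the teardown report can be derived from the state file; a minimal sketch, assuming the `vast-instances.json` schema shown earlier:

```python
import json
from datetime import datetime, timezone

def actual_cost(path: str = "vast-instances.json", instance_id: int = 33799165) -> None:
    """Report hours elapsed and dollars spent for one tracked instance."""
    inst = next(i for i in json.load(open(path)) if i["instance_id"] == instance_id)
    created = datetime.fromisoformat(inst["created_at"].replace("Z", "+00:00"))
    hours = (datetime.now(timezone.utc) - created).total_seconds() / 3600
    print(f"Duration: ~{hours:.1f} hours")
    print(f"Actual cost: ~${hours * inst['dph']:.2f} (estimated was ${inst['estimated_cost']:.2f})")

actual_cost()
```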
Mark the instance `destroyed` in `vast-instances.json`, then report:

```
Instance <ID> destroyed.
- Duration: ~X.X hours
- Actual cost: ~$X.XX (estimated was $Y.YY)
- Results downloaded to: ./results/
```

Notes:

- Reconcile `vast-instances.json` with `vastai show instances`; the file can go stale if an instance is destroyed outside the skill.
- Always rent with `--direct`, and get connection details from `vastai ssh-url <ID>` rather than from `show instances`.
- Default image: `pytorch/pytorch:2.1.0-cuda12.1-cudnn8-devel`. Shared files go in `/workspace/`, project code in `/workspace/project/`.
- State lives in `vast-instances.json`; the `vastai` CLI must be installed and configured.

## Configuration

To route `/run-experiment` through vast.ai, add a section like this to `EXPERIMENT_PLAN.md`:

```
## Vast.ai
- gpu: vast # tells run-experiment to use vast.ai
- auto_destroy: true # auto-destroy after experiment completes (default: true)
- max_budget: 5.00 # optional: max total $ to spend (skill warns if estimate exceeds this)
- image: pytorch/pytorch:2.1.0-cuda12.1-cudnn8-devel  # optional: override Docker image
```

## Commands

Typical automated flow:

```
/run-experiment "train model"  ← detects gpu: vast, calls /vast-gpu provision
↳ /vast-gpu provision ← analyzes task, presents options with cost
↳ user picks option ← rent + setup + deploy
↳ /vast-gpu destroy ← auto-destroy when done (if auto_destroy: true)
```

Manual commands:

```
/vast-gpu provision ← manual: analyze task + show options
/vast-gpu rent <offer_id> ← manual: rent a specific offer
/vast-gpu list ← show active instances
/vast-gpu destroy <instance_id> ← tear down, stop billing
/vast-gpu destroy-all ← tear down everything
```
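As a sketch of what `/vast-gpu list` amounts to, reading the state file (assuming the `vast-instances.json` schema above; the output formatting is illustrative):

```python
import json

# List instances the skill still considers alive.
for inst in json.load(open("vast-instances.json")):
    if inst["status"] != "destroyed":
        print(f"{inst['instance_id']}  {inst['gpu_name']} x{inst['num_gpus']}  "
              f"${inst['dph']}/hr  {inst['status']}  {inst['experiment']}")
```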