tao-skill-bank:tao-run-platform
network <name> not found
If a check fails, the agent prompts the user to authorize the install/fix via Bash before proceeding.
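The check-then-authorize flow described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `Check` type, `run_preflight`, and the `authorize` callback are hypothetical names introduced here, and the sketch only collects the approved fix commands rather than executing them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    """One preflight check (hypothetical structure, for illustration only)."""
    name: str
    passed: Callable[[], bool]   # returns True when the check succeeds
    fix_command: str             # shell command that would repair the failure


def run_preflight(checks: List[Check],
                  authorize: Callable[[Check], bool]) -> List[str]:
    """Return the fix commands the user approved, in check order.

    Raises RuntimeError as soon as a failing check's fix is declined,
    so nothing proceeds past an unresolved failure.
    """
    approved: List[str] = []
    for check in checks:
        if check.passed():
            continue
        if not authorize(check):
            raise RuntimeError(
                f"preflight check failed and fix was declined: {check.name}"
            )
        approved.append(check.fix_command)
    return approved
```

In an interactive setting, `authorize` would typically wrap a yes/no prompt; passing it in as a callback keeps the flow testable without user input.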
tao_default; $oauthtoken; nvcr.io
s3://; ACCESS_KEY; SECRET_KEY; aws s3 ls; HF_TOKEN
--gpus; --gpus all; --gpus '"device=0,1,2,3"'; DockerSDK.create_job(gpu_count=N); --gpus; localhost; torchrun --nproc-per-node=N
local-docker
{
  "backend_type": "local-docker",
  "num_gpu": 1
}
BACKEND; HOST_PLATFORM; MONGOSECRET; DOCKER_HOST; DOCKER_NETWORK
tao-job-<job_id>; ["/bin/bash", "-c", "<job command>"]; DOCKER_AUTO_REMOVE=true; /dev/shm; runtime="nvidia"; NVIDIA_VISIBLE_DEVICES; NVIDIA_DRIVER_CAPABILITIES=all; device_requests; num_gpus; 0; num_gpus; -1
file://; lustre:///...
docker logs tao-job-<job_id>
script_runner
from tao_sdk.platforms.docker import DockerSDK
sdk = DockerSDK() # reads DOCKER_HOST, NGC_KEY, S3 creds from env
job = sdk.create_job(
    image='nvcr.io/nvidia/tao/tao-toolkit:6.26.3-pyt',
    command='dino train -e /tmp/spec.yaml',
    gpu_count=1,
    inputs={'/data/train.json': 's3://bucket/coco/train.json'},
    outputs=['/results/'],
)
status = sdk.get_job_status(job.id)
logs = sdk.get_job_logs(job.id, tail=200)
docker run; Job; script_runner; inputs; outputs; docker run
DOCKER_HOST; docker run --gpus ...; NGC_KEY; nvcr.io; docker login nvcr.io -u '$oauthtoken'; docker logs tao-job-<job_id>; DOCKER_NETWORK
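To make the relationship between the `--gpus` forms, the `tao-job-<job_id>` container name, and the `["/bin/bash", "-c", "<job command>"]` invocation concrete, here is a rough sketch that assembles a `docker run` argument list. The helper names `gpus_flag` and `build_docker_run` are hypothetical. Mapping `DOCKER_AUTO_REMOVE=true` to `--rm` and `DOCKER_NETWORK` to `--network` is an assumption, as is passing the bash invocation as the container command rather than as an entrypoint. Only standard Docker CLI flags are used.

```python
import shlex
from typing import List, Optional, Sequence


def gpus_flag(devices: Optional[Sequence[int]]) -> List[str]:
    """Build the --gpus argument: None means all GPUs, empty means no flag."""
    if devices is None:
        return ["--gpus", "all"]
    if not devices:
        return []
    ids = ",".join(str(d) for d in devices)
    # The inner double quotes are part of the value: Docker needs them
    # because the device list itself contains commas.
    return ["--gpus", f'"device={ids}"']


def build_docker_run(job_id: str, image: str, command: str,
                     devices: Optional[Sequence[int]] = None,
                     network: Optional[str] = None,
                     auto_remove: bool = False) -> List[str]:
    """Assemble a docker run argv (assumed mapping, see lead-in)."""
    argv = ["docker", "run", "--name", f"tao-job-{job_id}"]
    if auto_remove:              # assumption: DOCKER_AUTO_REMOVE=true -> --rm
        argv.append("--rm")
    argv += gpus_flag(devices)
    if network:                  # assumption: DOCKER_NETWORK -> --network
        argv += ["--network", network]
    argv += [image, "/bin/bash", "-c", command]
    return argv


if __name__ == "__main__":
    argv = build_docker_run(
        "abc123",
        "nvcr.io/nvidia/tao/tao-toolkit:6.26.3-pyt",
        "dino train -e /tmp/spec.yaml",
        devices=[0, 1, 2, 3],
    )
    # shlex.join adds the outer single quotes needed when pasting into a shell.
    print(shlex.join(argv))
```

Building an argument list rather than a shell string avoids quoting mistakes; `shlex.join` is only used to render a copy-pasteable command.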