Docker instantiation failed with error

Generate TFRecords for training dataset

!tao model bpnet dataset_convert \
    -m 'train' \
    -o $DATA_DIR/train \
    -r $USER_EXPERIMENT_DIR/ \
    --generate_masks \
    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json

When I ran this command in Jupyter, the following error occurred.

2023-09-04 04:29:43,003 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-09-04 04:29:43,569 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 361: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-09-04 04:29:43,649 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv'
invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead.: unknown")
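For reference, the error means Docker handed GPU access to the NVIDIA Container Runtime Hook directly (the --gpus path) instead of going through the NVIDIA Container Runtime. On a supported host this is usually addressed by registering the NVIDIA runtime and making it the default in /etc/docker/daemon.json. The sketch below is the standard, generic Docker configuration (not specific to this thread, and as discussed in the replies below it does not make TAO training work on Jetson):

# Minimal sketch: make the NVIDIA runtime the default Docker runtime
# so containers are started with --runtime=nvidia semantics.
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
sudo systemctl restart docker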

Please tell me how to fix this error.
I downloaded the notebooks from here:
https://docs-nvidia-com.translate.goog/tao/tao-toolkit/text/tao_toolkit_quick_start_guide.html?_x_tr_sl=auto&_x_tr_tl=ja&_x_tr_hl=ja&_x_tr_pto=wapp#getting-started

• Hardware (Xavier)
• Network Type (bpnet)
• TLT Version (Please run "tlt info --verbose" and share "docker_tag" here)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

Please trigger the TAO docker on a dGPU machine instead.

Sorry, I don't have enough knowledge to understand your reply.
Please explain in more detail.

Please trigger the TAO docker on a dGPU, such as a V100, A100, T4, etc.
Jetson devices are not expected to run the TAO docker.

Does that mean I can't run this notebook with only my Jetson device?

Correct. You can run TAO training on a local dGPU machine or on cloud dGPU machines.
For running inference, both dGPU machines and Jetson devices work.

I don't have a local dGPU machine. Does NVIDIA offer cloud dGPU machines?

https://docs-nvidia-com.translate.goog/metropolis/deepstream/dev-guide/text/DS_docker_containers.html?_x_tr_sl=en&_x_tr_tl=ja&_x_tr_hl=ja&_x_tr_pto=sc
Does this website tell me how to use cloud dGPU machines?

Please refer to Running TAO Toolkit in the Cloud - NVIDIA Docs
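Once you have a cloud dGPU machine set up per that guide, you can roughly sanity-check the environment before rerunning the notebook. This is only a sketch: the image tag is the one from your log above, and "tao info --verbose" assumes the TAO launcher is installed as in the quick start guide.

# Check that the dGPU (e.g. T4/V100/A100) is visible to the driver
nvidia-smi
# Check that Docker can start a GPU container via the NVIDIA runtime
docker run --rm --runtime=nvidia nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 nvidia-smi
# Check the TAO launcher and its docker_tag
tao info --verbose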

