Error in Create the tfrecords using the bpnet dataset_convert tool

Please provide the following information when requesting support.

• Hardware: Orin Nano Development Kit
• Network Type: error not related to the network type yet
• TLT Version: tlt command not found
• Training spec file (if you have one, please share here)
• How to reproduce the issue? See below.

I’m running the bpnet Jupyter notebook in Tao_launcher_starter_kit. Things go fine until I reach the cell that does "Create the tfrecords using the bpnet dataset_convert tool". This is exactly the same error reported on this forum on April 12, 2022, but that thread was closed without a resolution that I can understand.

It said that Morganh solved the problem with "Please trigger the tao docker in dgpu, such as V100, A100, T4, etc. Jetson devices is not expected to trigger tao docker." The person asking didn’t understand that solution, and I don’t either. I would think that the sequence of actions in the notebook would handle that issue, but of course I will follow whatever instructions I need to get past this. How does one trigger the tao docker on a dGPU in the context of this notebook, if that is also my solution? Thanks in advance.

Here is the printout, the same as in the previous forum entry:
2023-12-04 17:47:44,422 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-12-04 17:47:44,798 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-12-04 17:47:44,844 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv'
invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime (e.g. specify the --runtime=nvidia flag) instead.: unknown")
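For reference, a diagnostic sketch (not an official fix; paths assume a standard Docker installation) to inspect which runtimes the Docker daemon has registered and which one is the default, which is what the error above is complaining about:

```shell
# List the runtimes Docker knows about and the current default.
# (grep may match nothing on hosts without the NVIDIA runtime;
# '|| true' keeps the pipeline from failing in that case.)
docker info 2>/dev/null | grep -i 'runtime' || true

# The registered runtimes are usually configured here as well:
cat /etc/docker/daemon.json 2>/dev/null || echo "no /etc/docker/daemon.json found"
```

As the replies below explain, though, the underlying limitation here is not the runtime flag itself.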

You are running on a Jetson Orin Nano. The TAO docker is expected to be triggered on dGPU machines.

Thanks for your very quick response! So what should I do to fix this?

Do you have a local dGPU machine? For training, you can trigger the TAO docker on a local machine or on cloud machines (https://docs.nvidia.com/tao/tao-toolkit/text/running_in_cloud/running_tao_toolkit_on_aws.html).
For deployment, you can run on dGPU machines, cloud machines, or Jetson devices.
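As a side note, a quick hedged check for whether the current host actually has an NVIDIA dGPU visible to the driver (assuming `nvidia-smi`, which ships with the NVIDIA driver, is on the PATH):

```shell
# Print the name and total memory of each visible NVIDIA GPU;
# on a machine without the NVIDIA driver this falls through to the echo.
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null \
  || echo "No NVIDIA dGPU/driver detected on this host"
```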

I bought the Orin Nano to do this development. Are you saying that I can’t use Tao with it?

As mentioned above, if you are going to run training with TAO, a dGPU machine or a cloud machine is needed. You can use a cloud machine if you do not have a dGPU machine.
If you are just going to run inference, you can run on any machine.

Thanks for your patience. I need to clarify requirements, since I’ve invested in the Orin Nano, and the Tao quickstart guide lists the following requirements:

Minimum System Configuration:

  • 8 GB system RAM

  • 4 GB of GPU RAM

  • 8 core CPU

  • 1 NVIDIA GPU

  • 100 GB of SSD space

  • TAO Toolkit is not supported on GPUs before the Pascal generation.

It looks like the Orin Nano meets those requirements. Isn’t that correct? Thanks again.

For training, TAO Toolkit is supported on discrete GPUs such as H100, A100, A40, A30, A2, A16, A100x, A30x, V100, T4, Titan-RTX, and Quadro-RTX.

The TAO docker is designed to run on x86 systems with discrete GPUs. This was mentioned in an older document: NVIDIA TAO - NVIDIA Docs, Overview - NVIDIA Docs.

The TAO docker cannot yet run on ARM-based devices.

Since Jetson devices are ARM-based, the tao docker in TAO Toolkit | NVIDIA NGC does not support running on them.
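A one-line way to confirm this on your own board (a simple sketch; `uname` is standard on any Linux system):

```shell
# The TAO training containers target x86_64 hosts;
# Jetson boards report aarch64 here.
uname -m
```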

But for deployment, users can run on any device. On Jetson devices, you can generate an engine and run inference with it. More can be found in TAO Deploy Installation - NVIDIA Docs,
GitHub - NVIDIA/tao_deploy: Package for deploying deep learning models from TAO Toolkit, and Tao-deploy on Orin AGX CLI Error - #14.

Thanks for being perfectly clear on this. It does seem to me that the current Tao "getting started" description is misleading, since it does not describe those requirements; rather, it gives requirements that are met by the Jetson Orin Nano, with no suggestion that any of the Tao functions are exceptions. I very much appreciate your quick and patient responses to my questions!

Thanks for catching this. We will improve the document.