Please help! Problems accessing TAO Toolkit with Docker on Jetson

Please provide the following information when requesting support.

• Jetson Xavier AGX

• BPNET

Configuration of the TAO Toolkit Instance

dockers:
    nvidia/tao/tao-toolkit-tf:
        v3.22.05-tf1.15.5-py3:
            docker_registry: nvcr.io
            tasks:
                1. augment
                2. bpnet
                3. classification
                4. dssd
                5. faster_rcnn
                6. emotionnet
                7. efficientdet
                8. fpenet
                9. gazenet
                10. gesturenet
                11. heartratenet
                12. lprnet
                13. mask_rcnn
                14. multitask_classification
                15. retinanet
                16. ssd
                17. unet
                18. yolo_v3
                19. yolo_v4
                20. yolo_v4_tiny
                21. converter
        v3.22.05-tf1.15.4-py3:
            docker_registry: nvcr.io
            tasks:
                1. detectnet_v2
    nvidia/tao/tao-toolkit-pyt:
        v3.22.05-py3:
            docker_registry: nvcr.io
            tasks:
                1. speech_to_text
                2. speech_to_text_citrinet
                3. speech_to_text_conformer
                4. action_recognition
                5. pointpillars
                6. pose_classification
                7. spectro_gen
                8. vocoder
        v3.21.11-py3:
            docker_registry: nvcr.io
            tasks:
                1. text_classification
                2. question_answering
                3. token_classification
                4. intent_slot_classification
                5. punctuation_and_capitalization
    nvidia/tao/tao-toolkit-lm:
        v3.22.05-py3:
            docker_registry: nvcr.io
            tasks:
                1. n_gram
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022

Issue:

(launcher) jetson@ubuntu:~$ tao bpnet --help
2022-10-26 20:24:22,096 [INFO] root: Registry: ['nvcr.io']
2022-10-26 20:24:22,337 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv'
invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime instead.: unknown")
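For what it's worth, this particular error usually means Docker is invoking the NVIDIA Container Runtime Hook directly instead of going through the NVIDIA Container Runtime. On Jetson, where JetPack ships the runtime, a common remedy (a sketch under the assumption of a standard JetPack install; it does not change the fact, noted in the reply below, that the TAO training containers are x86-only) is to register nvidia as the default runtime and restart Docker:

```shell
# Hypothetical sketch: make the NVIDIA Container Runtime the default runtime.
# Assumes nvidia-container-runtime is already installed (it ships with JetPack).
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
# Restart the Docker daemon so the new default runtime takes effect.
sudo systemctl restart docker
```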

I have been stuck here for two weeks now and have installed the Xavier three times with the SDK from a local host.

Who can give me a hint?

thx!

Hi @rwakker, the tao-toolkit container is for x86. The model training occurs on x86 using TAO, and you can deploy the models trained with TAO to Jetson using DeepStream or Triton Inference Server. For DeepStream on Jetson, you can run the deepstream-l4t container:

TAO is designed to run on x86 systems with an NVIDIA GPU (e.g., a GPU-powered workstation, a DGX system, etc.), or it can be run in any cloud with an NVIDIA GPU. For inference, models can be deployed on any edge device, such as an embedded Jetson platform, or in a data center with GPUs such as T4 or A100.
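To make the deployment path concrete, the DeepStream L4T container can be pulled from NGC and run on the Jetson with the NVIDIA runtime. A minimal sketch; the image tag is an assumption, so check NGC for a version matching your JetPack release:

```shell
# Sketch: run the DeepStream L4T container on a Jetson device.
# The tag "6.1.1-samples" is an assumption; pick a current tag from NGC.
sudo docker pull nvcr.io/nvidia/deepstream-l4t:6.1.1-samples
sudo docker run -it --rm --runtime nvidia --network host \
    nvcr.io/nvidia/deepstream-l4t:6.1.1-samples
```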

Hi, thanks very much for the reply. I'm not there yet :(

I'd like to start learning human pose estimation on swimmers, training models etc., with a limited background in AI…

Swimmer example:

I'm using this hardware:

I started training according to the following example:

But I finally got stuck in the training part of the provided Jupyter notebook:

!tao bpnet train -e $SPECS_DIR/bpnet_train_m1_coco.yaml \
                 -r $USER_EXPERIMENT_DIR/models/exp_m1_unpruned \
                 -k $KEY \
                 --gpus $NUM_GPUS

This gives the error:
2022-10-31 20:32:00,786 [INFO] root: Registry: ['nvcr.io']
2022-10-31 20:32:00,989 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv'
invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime instead.: unknown")

I would really like to train on my own images of swimmers, annotating keypoints on the partly visible bodies (swimmers are only seen from the top, front, or side; arms lifted above the water are not visible in the image, etc.).

I gave up and have now tried this:

This partly works, but I need a method to train on my data without in-depth knowledge of setting up the complete pipeline.

Any suggestion to continue my research?

Which device did you run the training on: a dGPU or a Jetson device?

I think

Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022
I managed to make a big step forward; we can close the topic. It was about the l4t DeepStream Docker container.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks

OK. For training, please run TAO training on x86 systems with a dGPU; training cannot be run on a Jetson device.
For inference, either a dGPU or a Jetson device is fine.
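The typical TAO 3.x flow for this split is: train and export on the x86 machine, then convert the exported model to a TensorRT engine on the Jetson with tao-converter. A hedged sketch only; the file names reuse the notebook's variables, and the exact export/converter arguments are assumptions that depend on the model:

```shell
# On the x86 training machine: export the trained model to an .etlt file.
# (Paths, spec file, and $KEY are placeholders from the notebook above.)
tao bpnet export -m $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/bpnet_model.tlt \
                 -e $SPECS_DIR/bpnet_train_m1_coco.yaml \
                 -k $KEY \
                 -o $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/bpnet_model.etlt

# On the Jetson: build a TensorRT engine from the .etlt using the
# aarch64 build of tao-converter (downloaded separately from NVIDIA).
# Flags shown (-k key, -t precision, -e engine output) are illustrative.
./tao-converter bpnet_model.etlt -k $KEY -t fp16 -e bpnet_model.engine
```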

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.