Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc) : T4
• Network Type : OCDNET
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file (if you have one, please share it here)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)
We have installed TAO Toolkit 5.2 and can train models with it. However, our IT team found vulnerabilities on the server caused by an older Pillow version.
One example is
But these Pillow versions come from the docker containers, so how can we get them upgraded to version 10.0.1?
Use docker run to get inside the docker and update Pillow, then
use docker commit to save the custom docker.
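As a sketch, the two steps above could look like the following. The container name `tao-pillow-fix` and the custom tag `tao-toolkit:5.2.0-pyt2.1.0-pillow10.0.1` are placeholders of my own choosing, not official names:

```shell
# Step 1: start an interactive shell in the official TAO container
# (no --rm, so the stopped container survives for the commit step).
docker run -it --name tao-pillow-fix \
    nvcr.io/nvidia/tao/tao-toolkit:5.2.0-pyt2.1.0 /bin/bash

# Inside the container: upgrade Pillow to the patched release, then exit.
pip install --upgrade pillow==10.0.1
exit

# Step 2: from the host, commit the stopped container under a new tag.
docker commit tao-pillow-fix tao-toolkit:5.2.0-pyt2.1.0-pillow10.0.1
```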
Thanks for your reply. But won't that break any existing features, functionality, or dependencies of the TAO Toolkit?
You can have a try. I expect it will not, since only a minor version is updated.
We could find two docker images:
We updated Pillow inside the docker container (tag: 5.2.0-pyt2.1.0) and ran docker commit, which created a new docker image with a new tag. Now, how can the tao commands launch this new, updated docker image?
Also, the IT team identified multiple different Pillow versions under the path ‘/var/lib/docker/overlay2’; will all of those get updated if we update only the nvcr docker image used for TAO?
Also, is there any NVIDIA document on managing this kind of Pillow version issue?
After docker commit saves the new docker with a new tag, you can use
docker run to launch this custom docker.
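For example, assuming a custom tag of `tao-toolkit:5.2.0-pyt2.1.0-pillow10.0.1` (a placeholder name, not an official one), launching the committed image directly might look like:

```shell
# Run the custom committed image instead of the stock nvcr.io image.
docker run --gpus all -it --rm --ipc=host --shm-size=1g \
    --ulimit memlock=-1 --ulimit stack=67108864 \
    tao-toolkit:5.2.0-pyt2.1.0-pillow10.0.1 /bin/bash

# Inside the container, verify the upgraded Pillow version.
python -c "import PIL; print(PIL.__version__)"
```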
The official way to build a new docker is described in GitHub - NVIDIA/tao_pytorch_backend: TAO Toolkit deep learning networks with PyTorch backend, and GitHub - NVIDIA/tao_tensorflow1_backend: TAO Toolkit deep learning networks with TensorFlow 1.x backend.
Thanks for your reply.
But I am unable to start the docker using docker run.
(base) azure_devops@training-1:~$ docker run --gpus all --rm --ipc=host --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 9ac9ab90565e
=== TAO Toolkit PyTorch ===
NVIDIA Release 5.2.0-PyT2.1.0 (build 69180607)
TAO Toolkit Version 5.2.0
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
Note: If I execute tao commands, it automatically starts the default docker container.
Please add -it and /bin/bash to the command, for example:
docker run --gpus all -it --rm --ipc=host --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/tao/tao-toolkit:5.2.0-pyt2.1.0 /bin/bash
Thanks for your reply. I was able to get inside the custom committed docker using docker run. But now, when I run tao commands within the docker, it does not recognise the tao command. The tao commands run fine from the host terminal. How can I use the tao commands within the custom docker?
When you run inside the docker, there is no need to add
tao model at the beginning.
You can just run something like
$ ocdnet xxx
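As an illustrative sketch only: the TAO launcher prefixes network entrypoints with `tao model`, while inside the container the entrypoint is invoked directly. The spec and results paths below are hypothetical placeholders:

```shell
# On the host, via the TAO launcher:
#   tao model ocdnet train -e /workspace/specs/train.yaml -r /workspace/results
#
# Inside the container, drop the "tao model" prefix and call the
# entrypoint directly (paths shown are placeholders):
ocdnet train -e /workspace/specs/train.yaml -r /workspace/results
```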
Thanks a lot. I could train a model using ocdnet. But when I use detectnet_v2 inside the docker, it cannot recognise the command.
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
Different networks may run in different dockers. You can confirm this via
$ tao info --verbose
For detectnet_v2, it can be
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.