Mismatch in python environment on AWS EC2 image

Hello,

I’m having one hell of a time trying to get TAO Toolkit working on an AWS EC2 image following the instructions at:

https://docs.nvidia.com/tao/tao-toolkit/text/running_in_cloud/running_tao_toolkit_on_aws.html

I’m at:

5. Start an EC2 Virtual Machine Instance. For running TAO Toolkit , use the NVIDIA Deep Learning Amazon Machine Instance (AMI). To use this AMI, select the **AWS Marketplace** and search for the **NVIDIA Deep Learning AMI**.

**Note** The Amazon EC2 P3 and G4 instances are optimized for the NVIDIA Volta/Turing GPUs.

6. Select one of the Amazon EC2 P3 and G4 instance types according to your P3 and G4 instance types.

I have to select one from the following available list:

so my guess is to use:

Q : is this guess correct?

If I copy that AMI across to my own account and run it, it has a version of Python installed (Python 3.8.10) that is not supported by the TAO Toolkit (python >=3.6.9<3.7).
Q: Is this correct? What do I need to do to get this working?

Small note: step 2 says “Once you have logged in, select **Compute** under EC2.”
There is no Compute option in EC2 (anymore?).

Q: If I do use that AMI, which of the prerequisites are already installed and which aren’t?

Q: Is there a single Docker image available that I can run on that AMI so that all prerequisites are already installed?

Refer to TAO Toolkit Quick Start Guide - NVIDIA Docs
Once you have installed miniconda, create a new environment by setting the Python version to 3.6.

Is there no single end-to-end description of the steps needed to get a working TAO Toolkit?
I tried pausing the video, but some of the commands that were entered were on screen for only a few frames. Even so, I tried to follow them and failed.
Can the exact steps be added to a script, or at least provided as a list of commands, please?

Even the TAO Toolkit Quick Start Guide - NVIDIA Docs

first has a list of Software Requirements:

| **Software** | **Version** | **Comment** |
|---|---|---|
| Ubuntu LTS | 20.04 | |
| python | >=3.6.9<3.7 | Not needed if you use TAO toolkit API |
| docker-ce | >19.03.5 | Not needed if you use TAO toolkit API |
| docker-API | 1.40 | Not needed if you use TAO toolkit API |
| nvidia-container-toolkit | >1.3.0-1 | Not needed if you use TAO toolkit API |
| nvidia-container-runtime | 3.4.0-1 | Not needed if you use TAO toolkit API |
| nvidia-docker2 | 2.5.0-1 | Not needed if you use TAO toolkit API |
| nvidia-driver | >520 | Not needed if you use TAO toolkit API |
| python-pip | >21.06 | Not needed if you use TAO toolkit API |

I don’t see miniconda in the Software Requirements.
Is it a conditional requirement if Python is not the right version?

Do I need to manually find where to install these from, or should I ignore the Software Requirements table and follow the “Getting Started” instructions

wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-getting-started/versions/5.0.0/zip -O getting_started_v5.0.0.zip
unzip -u getting_started_v5.0.0.zip  -d ./getting_started_v5.0.0 && rm -rf getting_started_v5.0.0.zip && cd ./getting_started_v5.0.0

and then run the setup/quickstart_launcher.sh in that ./getting_started_v5.0.0 directory?

Or both? And how does the setup/quickstart_launcher.sh know which of the

|–> quickstart_api_bare_metal
|–> quickstart_api_aws_eks
|–> quickstart_api_azure_aks
|–> quickstart_api_gcp_gke

to run ?

Yes, it is a conditional requirement.

For installing miniconda, please refer to TAO Toolkit Quick Start Guide - NVIDIA Docs

NVIDIA recommends setting up a Python environment using miniconda. The following instructions show how to set up a Python conda environment.

  1. Follow the instructions in this link to set up a conda environment using a miniconda.
  2. Once you have installed miniconda, create a new environment by setting the Python version to 3.6.
    conda create -n launcher python=3.6
  3. Activate the conda environment that you have just created.
    conda activate launcher
  4. Once you have activated your conda environment, the command prompt should show the name of your conda environment.
    (launcher) py-3.6.9 desktop:
  5. When you are done with your session, you may deactivate your conda environment using the deactivate command:
    conda deactivate
  6. You may re-instantiate this created conda environment using the following command.
    conda activate launcher
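
As a quick sanity check after activating the environment, a small script along these lines could confirm that the interpreter falls inside the range from the Software Requirements table (python >=3.6.9<3.7). This is only a sketch: the function name `in_supported_range` is made up for illustration and is not part of TAO Toolkit.

```python
import sys

# Range copied from the Software Requirements table: python >=3.6.9<3.7
MIN_VERSION = (3, 6, 9)
MAX_EXCLUSIVE = (3, 7)


def in_supported_range(version):
    """Return True if a (major, minor, micro) tuple is >=3.6.9 and <3.7."""
    version = tuple(version[:3])
    # Tuple comparison is element-wise, so (3, 7, 0) is not below (3, 7).
    return MIN_VERSION <= version < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    if in_supported_range(current):
        status = "inside the documented range"
    else:
        status = "outside the documented range"
    print("Python %d.%d.%d is %s" % (current + (status,)))
```

Run with the system interpreter on the AMI (Python 3.8.10), this would report the version as outside the range; run inside the `launcher` environment, it should report it as inside.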

Please check the Software Requirements. Things to check:

  • OS version (Ubuntu 18.04 or 20.04).
  • Whether Docker is already available (usually it is on a user’s machine).
  • nvidia-driver version. For TAO 5.0, you can install 525 (sudo apt install nvidia-driver-525).
  • Whether nvidia-docker2 is available. If not, please run sudo apt-get install nvidia-docker2 and sudo systemctl restart docker.service.
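
The checks in this list could be gathered into one small script, roughly as below. It is a sketch under a few assumptions: it reads `/etc/os-release` for the OS version, uses `shutil.which` to see whether `docker` and `nvidia-smi` are on the PATH, and asks `dpkg -s` whether the `nvidia-docker2` package is known to the package manager. The helper names are hypothetical and the script only reports; it does not install anything.

```python
import shutil
import subprocess


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


def ubuntu_version_ok(info, accepted=("18.04", "20.04")):
    """True if the parsed os-release describes an accepted Ubuntu release."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") in accepted


def dpkg_knows(package):
    """True if `dpkg -s <package>` succeeds; False if dpkg is missing or fails."""
    try:
        result = subprocess.run(
            ["dpkg", "-s", package],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False
    return result.returncode == 0


def report():
    """Print one line per prerequisite from the list above."""
    try:
        with open("/etc/os-release") as handle:
            info = parse_os_release(handle.read())
    except OSError:
        info = {}
    print("Ubuntu 18.04 or 20.04:", ubuntu_version_ok(info))
    for tool in ("docker", "nvidia-smi"):
        print(tool, "on PATH:", shutil.which(tool) is not None)
    print("nvidia-docker2 package:", dpkg_knows("nvidia-docker2"))


if __name__ == "__main__":
    report()
```

Anything that prints `False` points at the item in the list to look at first.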

Other tip: New computer install GPU Docker error - #6 by david9xqqb

As for setup/quickstart_launcher.sh, it installs the TAO launcher only.
Then you can run tao info, tao ssd, etc.

Thank you, I appreciate your help a lot !!

Would it be easier for everyone to have all these instructions in one location in the documentation, so that people can follow them step by step and don’t have to ask?

I’m a bit confused about whether to install miniconda or not. It’s not in the requirements, but I guess I should, because that particular NVIDIA AMI has a newer version of Python installed?

So the software requirement for the nvidia-driver is 525 if installing TAO 5.0. Are there more of these version-dependent requirements? And is it correct that the particular AMI I found in the AWS Marketplace has the wrong version of nvidia-driver installed?

How does everyone else get TAO Toolkit 5.0 running? I have a feeling I’m missing something (might be some brain cells on my part).

Please refer to TAO Toolkit Quick Start Guide - NVIDIA Docs. It provides the guidelines. It also mentions that setting up a Python environment using miniconda is recommended when python version >= 3.6.9.

For TAO 5.0, the driver version reported by nvidia-smi is expected to be >520. You can check with $ nvidia-smi. In practice, some networks also work without any issue when the driver is lower than 520. The >520 suggestion is there to make sure that every network works.
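
To script that driver check rather than read the nvidia-smi banner by eye, something like the following sketch could be used. It relies on `nvidia-smi --query-gpu=driver_version --format=csv,noheader` to print just the version string. One assumption to flag: “>520” is read here as “major version number strictly greater than 520”, so adjust the comparison if a 520.x driver should count. The function names are illustrative only.

```python
import subprocess


def driver_major(version):
    """Return the major number of a driver string such as '525.85.12'."""
    return int(version.strip().split(".")[0])


def meets_recommendation(version, minimum_exclusive=520):
    """Assumption: '>520' means the major number is strictly above 520."""
    return driver_major(version) > minimum_exclusive


def query_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    try:
        output = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = output.strip().splitlines()
    # One line is printed per GPU; the driver is the same for all of them.
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("nvidia-smi not available or it returned an error")
    else:
        print("driver", version, "meets >520:", meets_recommendation(version))
```

For example, `meets_recommendation("525.85.12")` is true, while a 470-series driver would be flagged as below the recommendation even though, as noted above, some networks may still work with it.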
