NGC failed to download the pre-trained weights in tlt model training

senbhaskar26 · January 7, 2021, 8:22am

I am trying to build a dockerized version of the Transfer Learning Toolkit: my own Dockerfile uses the NVIDIA NGC image as its base, and I run the training as a .py script.
Base image line in my Dockerfile:
FROM nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
However, I am not able to download the pre-trained model weights into the directory that I need for the later training steps.

I have converted the Jupyter notebook used for training into a Python script (attached for your reference).
detectnet_v2.py (26.0 KB)

I am getting an error like this:

Any help would be much appreciated.
Thanks

Morganh · January 7, 2021, 9:01am

There is an ngc binary inside the 2.0_py3 docker.
Please see details below.

$ docker run --runtime=nvidia -it -v ~/demo_2.0:/workspace/tlt-experiments -p 8888:8888 nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
--2021-01-07 09:00:18-- https://ngc.nvidia.com/downloads/ngccli_reg_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 13.225.97.79, 13.225.97.113, 13.225.97.13, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|13.225.97.79|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20110328 (19M) [application/zip]
Saving to: ‘/opt/ngccli/ngccli_reg_linux.zip’

ngccli_reg_linux.zip 100%[====================================================================================================>] 19.18M 23.3MB/s in 0.8s

2021-01-07 09:00:19 (23.3 MB/s) - ‘/opt/ngccli/ngccli_reg_linux.zip’ saved [20110328/20110328]

Archive: /opt/ngccli/ngccli_reg_linux.zip
inflating: /opt/ngccli/ngc
extracting: /opt/ngccli/ngc.md5
root@02c4f89b270d:/workspace# which ngc
/opt/ngccli/ngc
root@02c4f89b270d:/workspace# ngc --version
NGC Registry CLI 1.24.0

senbhaskar26 · January 8, 2021, 5:43am

Hi Morganh, thanks for your reply. What you describe works correctly when I pull the docker and start training from the Jupyter notebook like this.

But in my case I am building my own docker image, with tlt-streamanalytics as the base:
FROM nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
I then call all my files inside that image and run the resulting container.
In that setup I do not get the pre-trained models.
Any solution for this would be much appreciated.
Thanks

Morganh · January 8, 2021, 6:28am

According to the screenshot you originally attached, ngc is not found:

/bin/sh: 1: ngc: not found

See the log in my previous comment.
You can try downloading the ngc tool when you build your own docker image.

$ wget https://ngc.nvidia.com/downloads/ngccli_reg_linux.zip

senbhaskar26 · January 8, 2021, 7:23am

@Morganh Hi,
I have done that, but I am still getting the error:
/bin/sh: 1: ngc: not found
I am sharing the logs.

Morganh · January 8, 2021, 7:29am

Reference:
mkdir -p /opt/ngccli &&
wget "https://ngc.nvidia.com/downloads/ngccli_reg_linux.zip" -P /opt/ngccli &&
unzip -u "/opt/ngccli/ngccli_reg_linux.zip" -d /opt/ngccli/ &&
rm /opt/ngccli/*.zip &&
chmod u+x /opt/ngccli/ngc
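
After running the reference commands, it can help to confirm that the shell can actually resolve ngc before the training script calls it. The following is only a minimal sketch: the function name check_ngc and the NGC_DIR variable are hypothetical, /opt/ngccli is the install directory used in the commands above, and whether that directory is already on PATH in a custom image is an assumption worth checking.

```shell
#!/bin/sh
# Install directory taken from the reference commands in this thread.
# NGC_DIR is a hypothetical variable name used only for this sketch.
NGC_DIR="${NGC_DIR:-/opt/ngccli}"

# Print where ngc resolves from, or explain why the shell cannot find it.
# Returns 0 when ngc is on PATH, 1 otherwise.
check_ngc() {
    if command -v ngc >/dev/null 2>&1; then
        echo "ngc found at $(command -v ngc)"
        return 0
    fi
    if [ -x "$NGC_DIR/ngc" ]; then
        # The binary exists, so the "not found" error is a PATH problem.
        echo "ngc exists in $NGC_DIR but that directory is not on PATH" >&2
    else
        # The download or unzip step did not produce an executable.
        echo "ngc is not installed in $NGC_DIR" >&2
    fi
    return 1
}
```

Calling check_ngc as the last build step, or at the top of the training entry point, separates a failed download from a PATH problem; both produce the same "ngc: not found" message.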

senbhaskar26 · January 8, 2021, 7:47am

@Morganh Hi,
If I am not wrong, this is how my Dockerfile looks.

This is my terminal output.
I am getting this error at the unzip step.
Thanks

Morganh · January 8, 2021, 7:52am

Please change

 “/opt/ngccli/ngccli_reg_linux.zip”

to

 "/opt/ngccli/ngccli_reg_linux.zip"

One tip: please verify the command on your local PC before writing it into the Dockerfile.

senbhaskar26 · January 8, 2021, 8:05am

@Morganh From what I can see, the two commands you mentioned are identical. What exactly do I need to replace, and with what?

“/opt/ngccli/ngccli_reg_linux.zip”

to
“/opt/ngccli/ngccli_reg_linux.zip”

Morganh · January 8, 2021, 8:30am

Please check the quotation marks.
Please use the straight quote " instead of the curly quote “
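
Curly quotes usually come from copying commands out of a web page or document editor, and the shell treats them as ordinary characters rather than quoting syntax, so they end up inside the file name. A small sketch for spotting and replacing them in a Dockerfile or script follows; the function names find_curly_quotes and fix_curly_quotes are hypothetical, and the file is assumed to be UTF-8 encoded.

```shell
#!/bin/sh
# List lines (with line numbers) that contain typographic quotes.
# Each character is its own pattern so the match does not depend on
# locale-specific handling of bracket expressions.
# Returns 0 if any curly quote is found, 1 if none is found.
find_curly_quotes() {
    grep -n -e '“' -e '”' -e '‘' -e '’' "$1"
}

# Write a copy of the file to stdout with curly quotes replaced by
# their plain ASCII equivalents. The input file is not modified.
fix_curly_quotes() {
    sed -e 's/“/"/g' -e 's/”/"/g' -e "s/‘/'/g" -e "s/’/'/g" "$1"
}
```

For example, `find_curly_quotes Dockerfile` shows the offending lines, and `fix_curly_quotes Dockerfile > Dockerfile.fixed` writes a corrected copy to review before replacing the original.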

senbhaskar26 · January 8, 2021, 11:01am

@Morganh Hey, thanks a lot. I am now able to run it successfully.
Thanks for your efforts.
