The tlt CLI uses Docker containers under the hood to train and prune a model. Our compute nodes are DGX boxes that are part of a larger HPC infrastructure, with the caveat that using Docker is forbidden there. I have previously used the TF-TRT and Triton containers from NGC as Singularity images, and they have always worked fine. For some reason, there is no documentation on running TLT with Singularity containers. The post-installation steps for non-root usage still point to enabling Docker's non-root feature, which itself has some root (sudo *) dependencies. Is there a way to make tlt work with Singularity containers?
Reference: Tlt-streamanalytics training in Singularity - #4 by Morganh
Please check whether it helps.
One more tip: please try pulling the TLT 3.0 Docker image directly instead of using tlt-launcher.
docker pull nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3
Example:
morganh@dl:~$ docker pull nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3
v3.0-dp-py3: Pulling from nvidia/tlt-streamanalytics
Digest: sha256:3e20634106145588534caf2887fdc1093e0e167a0933b0a993e5a077684bd89e
Status: Image is up to date for nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3
nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3
morganh@dl:~$ docker run --runtime=nvidia -it -v /home/morganh/demo:/workspace/demo nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3 /bin/bash
--2021-04-23 10:05:27-- https://ngc.nvidia.com/downloads/ngccli_reg_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 13.225.93.33, 13.225.93.84, 13.225.93.94, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|13.225.93.33|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 24976582 (24M) [application/zip]
Saving to: ‘/opt/ngccli/ngccli_reg_linux.zip’
ngccli_reg_linux.zip 100%[====================================================================================================>] 23.82M 32.0MB/s in 0.7s
2021-04-23 10:05:28 (32.0 MB/s) - ‘/opt/ngccli/ngccli_reg_linux.zip’ saved [24976582/24976582]
Archive: /opt/ngccli/ngccli_reg_linux.zip
inflating: /opt/ngccli/ngc
extracting: /opt/ngccli/ngc.md5
root@9f4979ebd897:/workspace# ls
EULA.pdf README.md demo examples
root@9f4979ebd897:/workspace# cd demo/
root@9f4979ebd897:/workspace/demo# mask_rcnn train -e spec.txt -d /workspace/demo/result -k nvidia_tlt
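On a cluster where Docker is not allowed, the same image can probably be pulled and run with Singularity instead of docker pull / docker run. Below is a rough, untested sketch of what that might look like. The file name tlt.sif, the HOST_DIR default, the NGC_API_KEY variable and the run helper are my own assumptions for illustration, not something from the TLT docs. By default the script only prints the commands; set RUN=1 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch only: a possible Singularity equivalent of the docker session above.
# Assumed (hypothetical) names: tlt.sif, HOST_DIR, NGC_API_KEY, run().

IMAGE_URI="docker://nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3"
SIF_FILE="tlt.sif"                   # hypothetical output file name
HOST_DIR="${HOST_DIR:-$HOME/demo}"   # host folder holding spec.txt and data
CONTAINER_DIR="/workspace/demo"      # same mount point as in the docker run

# Print the command unless RUN=1 is set, so the script is safe to try first.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# If the registry asks for a login, Singularity reads these two variables.
# For NGC the user name is the literal string $oauthtoken.
if [ -n "${NGC_API_KEY:-}" ]; then
    export SINGULARITY_DOCKER_USERNAME='$oauthtoken'
    export SINGULARITY_DOCKER_PASSWORD="$NGC_API_KEY"
fi

# Convert the Docker image into a SIF file; no Docker daemon is involved.
run singularity pull "$SIF_FILE" "$IMAGE_URI"

# --nv exposes the host GPUs; --bind takes the place of docker's -v mount.
run singularity exec --nv --bind "$HOST_DIR:$CONTAINER_DIR" "$SIF_FILE" \
    mask_rcnn train -e "$CONTAINER_DIR/spec.txt" \
    -d "$CONTAINER_DIR/result" -k nvidia_tlt
```

Note that singularity exec does not run the image's Docker entrypoint, so the NGC CLI download seen at container start in the log above would not happen, and the image file system is read-only, so results should go to the bound folder.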