Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc) A100
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) UNET
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
format_version: 1.0
tlt_version: 3.0
docker_tag: nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
Hi, I have an old TLT UNET model that was trained with the TLT version listed above. When I try to rerun the training today, every tlt command (tlt unet train / tlt unet evaluate) gives me the output below. The container exits immediately without any error message.
For multi-GPU, change --gpus based on your machine.
2022-09-26 19:30:56,152 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the ~/.tlt_mounts.json file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2022-09-26 19:30:58,492 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
Do you have any idea what might be causing this issue?