TAO Toolkit - Limit GPU memory usage

Hello, I am following the NVIDIA Technical Blog post "Creating a Real-Time License Plate Detection and Recognition App". While training, I always hit the error enqueue.cc:118 NCCL WARN Cuda failure 'out of memory', which refers to the dGPU memory. My GPU is a GeForce RTX 3060 Max-Q with 6 GB. Is there a way to lower or limit memory usage in TAO?

Alternatively, is there a way to run TAO directly on a Jetson?

Thank you