TLT train maskrcnn model with Mapillary Vistas Dataset failed on CUDA_ERROR_OUT_OF_MEMORY: out of memory

To narrow this down, please check the following.

  1. Does your training data meet the requirements below?
  • Input size: C * W * H (where C = 3, W >= 128, H >= 128, and W and H are multiples of 32)
  • Image format: JPG
  • Label format: COCO detection
  2. Can you try training again with the public dataset mentioned in the Jupyter notebook?
  3. Try rebooting the machine.
  4. Try training with a smaller network.
  5. Try training with a smaller image_size.
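
As a quick way to check the input-size requirement above and to pick a smaller candidate size, here is a minimal Python sketch. The function names `check_input_size` and `round_down_to_multiple` are hypothetical helpers written for this post and are not part of TLT. The sketch only encodes the stated rules (C = 3, W and H at least 128 and multiples of 32).

```python
def check_input_size(channels, width, height, min_side=128, multiple=32):
    """Return a list of problems; an empty list means the size meets the stated rules."""
    problems = []
    if channels != 3:
        problems.append(f"channels must be 3, got {channels}")
    for name, value in (("width", width), ("height", height)):
        if value < min_side:
            problems.append(f"{name} must be >= {min_side}, got {value}")
        if value % multiple != 0:
            problems.append(f"{name} must be a multiple of {multiple}, got {value}")
    return problems


def round_down_to_multiple(value, min_side=128, multiple=32):
    """Largest multiple of `multiple` not exceeding `value`, but never below `min_side`."""
    return max(min_side, (value // multiple) * multiple)


if __name__ == "__main__":
    # Example sizes chosen for illustration only (assumed, not from the original post).
    print(check_input_size(3, 1024, 768))
    print(check_input_size(3, 1000, 768))
    print(round_down_to_multiple(1000))
```

If the check reports a problem, or if you simply want to reduce memory use, `round_down_to_multiple` gives a valid smaller side length to try in the image_size setting of your training spec.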

More references on OOM issues:
MaskRCNN:

Other networks