Jetson Nano: RuntimeError: CUDA error: the launch timed out and was terminated

Hi there!
I want to predict instance segmentations using FAIR’s Detectron2. Running the prediction script on Google Colab works perfectly. When I run it on my Jetson Nano Developer Kit, I get the following error: RuntimeError: CUDA error: the launch timed out and was terminated. I already tried running the Jetson Nano in headless mode, but the error still occurred.

The full error:

Traceback (most recent call last):
  File "predict.py", line 21, in <module>
    outputs = predictor(im)
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/detectron2/engine/defaults.py", line 317, in __call__
    predictions = self.model([inputs])[0]
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 146, in forward
    return self.inference(batched_inputs)
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 204, in inference
    proposals, _ = self.proposal_generator(images, features, None)
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/detectron2/modeling/proposal_generator/rpn.py", line 478, in forward
    anchors, pred_objectness_logits, pred_anchor_deltas, images.image_sizes
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/detectron2/modeling/proposal_generator/rpn.py", line 511, in predict_proposals
    self.training,
  File "/home/lukas/Desktop/repositorys/TrashTron2/env/lib/python3.6/site-packages/detectron2/modeling/proposal_generator/proposal_utils.py", line 97, in find_top_rpn_proposals
    if not valid_mask.all():
RuntimeError: CUDA error: the launch timed out and was terminated

Hi,

Please note that the Nano limits each kernel to less than 5 seconds of execution time.
If a kernel runs for more than 5 seconds, the watchdog kills it and a timeout error is returned.
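
To check whether a call is anywhere near that 5-second limit, you could time it with a small helper like the sketch below. The names time_call, near_watchdog_limit and WATCHDOG_LIMIT_S are made up for this example and are not part of PyTorch or Detectron2. Note that it measures the whole call, not a single kernel: a total well under 5 seconds rules out the watchdog for that call, while a larger total only tells you that a closer look is needed.

```python
import time

# Limit quoted in the reply above; treat it as an assumption for your setup.
WATCHDOG_LIMIT_S = 5.0


def time_call(fn, *args, sync=None, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds).

    sync is an optional callable that blocks until queued GPU work has
    finished (for PyTorch, torch.cuda.synchronize). Without it, the timer
    may stop before asynchronously launched kernels are done.
    """
    if sync is not None:
        sync()  # drain earlier GPU work so it is not counted
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()  # wait for the kernels launched by fn
    elapsed = time.perf_counter() - start
    return result, elapsed


def near_watchdog_limit(elapsed_s, limit_s=WATCHDOG_LIMIT_S, margin=0.8):
    """Return True if elapsed_s is at or above margin * limit_s."""
    return elapsed_s >= margin * limit_s


# Hypothetical usage with the predictor from the traceback above:
#   import torch
#   outputs, elapsed = time_call(predictor, im, sync=torch.cuda.synchronize)
#   if near_watchdog_limit(elapsed):
#       print(f"Inference took {elapsed:.2f} s, close to the watchdog limit")
```

If the total is large, a profiler that reports per-kernel durations would be the next step, since only a single kernel running too long triggers the watchdog.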

Thanks.

Is there a way to disable the watchdog?

Hi,

Yes, but please note that disabling it might cause problems if a kernel stops responding because of a long-running task.

Please find the instructions below:

Thanks.
