Object Detection with DetectNetv2: error when evaluating the model

Hi,

I am trying to follow the instructions here to train the DetectNetv2 model for object detection on my PC, which has an Nvidia RTX 2060 SUPER 8GB GPU.

Training works fine. However, when I run the evaluation using the following command from the provided notebook:

!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-32250.tlt \
                           -k $KEY

I get the following error:

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node resnet18_nopool_bn_detectnet_v2/conv1/convolution (defined at /opt/nvidia/third_party/keras/tensorflow_backend.py:93) ]]
	 [[node strided_slice_90 (defined at ./detectnet_v2/model/utilities.py:57) ]]

What I have tried that did not work:

  • Restarting my PC
  • Using a smaller batch size (although I use the same batch size for training, so it should not be the problem)

What I have not tried yet:

In the beginning of the training notebook I see the following sentence:

When using the purpose-built pretrained models from NGC, please make sure to set the $KEY environment variable to the key as mentioned in the model overview. Failing to do so, can lead to errors when trying to load them as pretrained models.

Maybe this is the cause of the issue. Currently I have set $KEY to a random string, but I wonder whether I should get a proper key from somewhere. From the text above I cannot figure out how to get or generate the key.

Any help would be appreciated.

Add this line to the notebook before running the evaluation cell:

%env TF_FORCE_GPU_ALLOW_GROWTH=true
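
For context, a likely explanation (not confirmed in this thread): by default TensorFlow reserves almost all GPU memory up front, which can leave cuDNN unable to initialize on an 8GB card. TF_FORCE_GPU_ALLOW_GROWTH=true tells TensorFlow to allocate GPU memory on demand instead. The `%env` magic sets the variable in the notebook kernel, and commands started with `!` inherit it. A minimal Python sketch of the same mechanism follows; the helper name `child_sees_allow_growth` is made up for illustration and is not part of TLT or TensorFlow.

```python
import os
import subprocess
import sys

# Equivalent of `%env TF_FORCE_GPU_ALLOW_GROWTH=true`: set the variable in
# the current process so that any child process started later inherits it.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def child_sees_allow_growth() -> str:
    """Return TF_FORCE_GPU_ALLOW_GROWTH as seen by a freshly spawned child.

    Hypothetical helper, used only to show that child processes (such as a
    command launched with `!` in a notebook) inherit the environment.
    """
    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import os; print(os.environ.get('TF_FORCE_GPU_ALLOW_GROWTH', ''))",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees_allow_growth())
```

The variable has to be set before the process that uses TensorFlow starts, so the `%env` line belongs in a cell that runs before the `!tlt-evaluate` cell.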

This solution worked. Thank you!