I have trained a YoloV4 model then try to convert to the engine file on my Jetson Nano. The full command is
tlt-converter -k tlt-encode \
-d 3,608,608 \
-o BatchedNMS \
-e trt1.engine \
-m 2 \
-t fp16 \
-i nchw \
-p Input,1x3x608x608,1x3x608x608,2x3x608x608 \
-w 1610612736 \
yolov4_cspdarknet19_epoch_055.etlt
The error message is like,
[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[ERROR] ../builder/cudnnBuilderUtils.cpp (414) - Cuda Error in findFastestTactic: 98 (invalid device function)
[WARNING] GPU memory allocation error during getBestTactic: BatchedNMS_N
[ERROR] ../builder/cudnnBuilderUtils.cpp (414) - Cuda Error in findFastestTactic: 98 (invalid device function)
[WARNING] GPU memory allocation error during getBestTactic: BatchedNMS_N
[ERROR] Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
[ERROR] ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node BatchedNMS_N.)
[ERROR] ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node BatchedNMS_N.)
[ERROR] Unable to create engine