(Optical Flow SDK Help) Building TensorRT Detector Engine Fails

I am following the documentation on how to install the Optical Flow SDK and run the sample applications. I got to the step of building the TensorRT detector engine. When I run the specified command

trtexec --onnx=/home/Downloads/OpticalFlowSDK/NvOFTracker/NvOFTSample/detector/models/yolov3.onnx --saveEngine="yolov3.trt"

I get a lot of output, but it ends with this error:

Cuda failure: unknown error
Aborted (core dumped)
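For reference, re-running with trtexec's --verbose flag (a standard trtexec option) prints far more detail before the failure and may show whether the CUDA context even initializes:

trtexec --onnx=/home/Downloads/OpticalFlowSDK/NvOFTracker/NvOFTSample/detector/models/yolov3.onnx --saveEngine="yolov3.trt" --verbose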

System and dependencies:

  • M1 Mac running Ubuntu in Parallels Desktop.
  • CUDA 11.1
  • CuDNN 8.1
  • TensorRT 8.2.1.8 (I have also tried 7.2.1.6 with the same result). The documentation said to use 7.2.3, but that version doesn’t seem to have a package for the ARM architecture.

Does anyone have any insight? The core dump isn’t giving me much to go on.

Hello @hzwus and welcome to the NVIDIA developer forums!

Which NVIDIA GPU are you using and how do you connect it to the Mac?
If you are in a terminal in Ubuntu, can you execute “nvidia-smi” and share that output here?

The ARM support in CUDA is “sbsa”, for Server Base System Architecture, which might not be fully compatible with the M1 ARM architecture.
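If you want to gather everything in one go, these standard commands would show the driver state, the CUDA toolkit version (assuming nvcc is on your PATH), and the CPU architecture your VM reports:

nvidia-smi
nvcc --version
uname -m

On an ARM-based Ubuntu install, uname -m should report aarch64.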

Hi @MarkusHoHo , thanks for the reply. I definitely made a rookie mistake here and neglected to consider the NVIDIA GPU requirement. “nvidia-smi” gives me “NVIDIA-SMI has failed because it couldn’t connect with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
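(For anyone checking the same thing: lsmod | grep nvidia, a standard way to see whether the NVIDIA kernel module is loaded, would come back empty in a setup like this, since Parallels on an M1 does not pass an NVIDIA GPU through to the VM.)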

Do you have any suggestions for the most practical/affordable way for a student to utilize the Optical Flow SDK?

If you are unable to afford a dedicated PC with an NVIDIA GPU, you could consider adding an external GPU to your setup. I personally use a Razer Core X with an RTX GPU, but there are also more affordable enclosures available. They work with MacBooks as well, so this should be a viable option. But keep in mind that there is currently no dedicated support for the M1 ARM architecture: all the CPU-bound parts of the SDK still need to run on the host machine; only the CUDA parts are executed on the GPU.

Another option is of course to look into renting server time from cloud providers like AWS or Google Cloud. They offer dedicated resources specifically for GPU-intensive workloads.

In terms of capabilities, you won’t need the latest high-end cards for good results. An RTX 20 series card should still suit you well, as would the RTX 3050/3060 GPUs, which are also coming down in price right now.

I hope that helps.
