I have been playing around with the camera, generic CUDA, and the hardware image and video features of the TX1. Great fun so far :)
I’d like to get some numbers showing how well OpenCV is accelerated by the TX1. I know that there is a build of OpenCV that comes with JetPack, and I have that all set up. Unfortunately, tools such as opencv_perf_gpu do not seem to be packaged anywhere with the JetPack-installed OpenCV.
I’ve compiled OpenCV 2.4.12 with CUDA support in the hope of getting at opencv_perf_gpu and the metrics it can provide. However, running it with --perf_impl=cuda gives slightly slower performance than --perf_impl=plain (i.e., CPU). At the start of a run I see the GPU INFO line repeated twice, once for the GM20B and once for a “Run on OS Linux x32”. The latter message makes me suspect that the CUDA code path is being executed on the CPU instead of the GPU.
FWIW the full line to execute is:
$ ./opencv_perf_gpu --gtest_filter=Sz_KernelSz_Filters_Filter2D.Filters_Filter2D/69 --perf_impl=cuda
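To compare the two runs side by side rather than eyeballing them, something like the following rough Python sketch could help. It assumes the output of each run was saved to a text file (the names plain.log and cuda.log in the note below are made up) and that the standard gtest result lines of the form "[       OK ] Name (12 ms)" are present. The helper names parse_times and compare are my own, not part of OpenCV or gtest. Note that the gtest time is the wall time of the whole test, including any warm-up and repeated iterations, so it is only a coarse indicator.

```python
import re

# Matches standard gtest result lines such as:
#   [       OK ] Suite.Test/69 (12 ms)
RESULT_LINE = re.compile(r"^\[\s+OK\s+\]\s+(\S+)\s+\((\d+) ms\)\s*$")


def parse_times(log_text):
    """Return {test_name: elapsed_ms} for every passing test in a gtest log."""
    times = {}
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            times[match.group(1)] = int(match.group(2))
    return times


def compare(plain_log, cuda_log):
    """Return {test_name: (plain_ms, cuda_ms, speedup)} for tests in both logs.

    speedup is plain_ms / cuda_ms, so a value below 1.0 means the CUDA run
    was slower than the CPU run.
    """
    plain = parse_times(plain_log)
    cuda = parse_times(cuda_log)
    result = {}
    for name in sorted(plain.keys() & cuda.keys()):
        p, c = plain[name], cuda[name]
        speedup = p / c if c else float("inf")
        result[name] = (p, c, speedup)
    return result
```

With the two logs on disk, compare(open("plain.log").read(), open("cuda.log").read()) returns one entry per test that passed in both runs.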
I have also tried the nasty approach of hand-compiling opencv_perf_gpu from source while forcing it to use the system-provided headers and libraries. I might try that again. That build only allowed one of the solvePnPRansac tests to run, since I had to fiddle around quite a bit picking out the right source files to get an implementation into the executable.
Any hints on this would be great! It is probably best to try to get understandable numbers at the raw OpenCV level before trying to get ROS to take advantage of the hardware.