Real-time Object detection on Jetson


I am a novice in deep learning and am currently working on object detection on a Jetson TK1 with real-time video from a webcam. The two frameworks or applications I have tried are:

Darknet YOLO: got 5-6 frames per second with the Tiny YOLO configuration and tiny-yolo weights
Single Shot MultiBox Detector (runs on Caffe): got 5 fps

Has anyone used these or any other framework for object detection (classification and localization of objects in an image) for real-time video on the Jetsons? If yes, then what was the maximum frame rate achieved?
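To make frame-rate numbers comparable between frameworks, it helps to time the detector the same way each time. Below is a minimal sketch of how that could be done, not what either framework does internally. The names `measure_fps`, `frames`, `detect` and `warmup` are hypothetical: `frames` is any iterable of images and `detect` is whatever function runs one detection pass.

```python
import time


def measure_fps(frames, detect, warmup=5):
    """Return the average frames per second of detect() over frames.

    The first `warmup` frames are processed but not timed, since the
    first few passes often include one-off setup cost.
    (All names here are illustrative, not part of any framework.)
    """
    count = 0
    start = None
    for i, frame in enumerate(frames):
        if i == warmup:
            # Start the clock just before the first timed frame.
            start = time.perf_counter()
        detect(frame)
        if i >= warmup:
            count += 1
    if start is None or count == 0:
        raise ValueError("need more frames than the warmup count")
    elapsed = time.perf_counter() - start
    return count / elapsed
```

With a webcam, `frames` could be a generator that yields images read from the camera, so the measured rate includes capture time as well as detection time.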

Are there better frameworks for this? I will be using a Jetson TX1 in the near future for better performance.

Any help would be greatly appreciated.


The Multimedia API package is included in the R24.1 release (mentioned in the release notes). You can take a look at the object identification sample included.

Hi, the page you linked was not found. Can you give another link?

The URL just has a closing parenthesis on the end that needs to be removed. Try:

Thank you very much!

I think you are familiar with the Jetson. Do you know the difference between native compiling and cross compiling on the Jetson? I want to use the TX1 for simple ball detection from Kinect video frames. Many people cross compile; would compiling natively lower the TX1’s performance?

Any complications in the difference between native or cross compile are related to the fact that the native architecture is arm64/aarch64, but support exists in a compatibility mode for arm32/armhf. To build the kernel, you need compilers for both. To build in user space, you only need one compiler.

Having two cross compilers on a host is easy. Adding both a native and foreign compiler to a running Jetson poses many possible support issues. If you need two compilers, meaning kernel compiles, then you are better off cross compiling. If you only need a purely aarch64 or armhf user space executable, then I’d recommend just compiling directly on the Jetson.
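If you are not sure which user space a particular Jetson is running, a quick check from Python can tell you. This is only a sketch under stated assumptions: `describe_userspace` is a hypothetical helper, the machine name comes from `platform.machine()` (what the kernel reports), and the pointer size from `struct.calcsize("P")` reflects the Python interpreter itself, which is taken here as representative of the installed user space.

```python
import platform
import struct


def describe_userspace(machine, pointer_bits):
    """Classify kernel/user-space architecture from two observations.

    machine:      architecture string reported by the kernel
    pointer_bits: pointer width of the running process (32 or 64)
    (Hypothetical helper for illustration only.)
    """
    if machine in ("aarch64", "arm64"):
        if pointer_bits == 64:
            return "64-bit aarch64 user space"
        # 64-bit kernel running a 32-bit process in compatibility mode.
        return "32-bit user space on a 64-bit kernel (compatibility mode)"
    if machine.startswith("arm"):
        return "32-bit ARM kernel and user space"
    return "not an ARM system: " + machine


if __name__ == "__main__":
    # struct.calcsize("P") is the size of a C pointer in bytes.
    print(describe_userspace(platform.machine(), struct.calcsize("P") * 8))
```

Whichever case it reports is the single compiler target you need for user space programs built directly on that Jetson.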

Note that kernel code itself has built in configuration to aid cross compile, and that the kernel is essentially bare metal…which implies the kernel does not need any form of linker or system library support. Once you cross compile user space you need to add linker support and all of the libraries used by what you are building. Cross compile of a kernel has few requirements; cross compile of user space has a lot of requirements.

Summary: Kernel compile via cross compile is simple and has few requirements. User space program compile via native compile is simple…despite having more requirements to build user space programs, those requirements are already present by default if you use native compile directly on the Jetson.

I know very little about Kinect, but cross compile versus native compile won’t affect performance; the libraries linked in will have an effect, but if the libraries in a cross compile environment are identical to the running Jetson (the “sysroot” files), then there will be no difference at all.

Hi Bharat,

You can try this object localization sample for the Jetson TX1.

It works well on 24.2 with JetPack 2.3.
This sample uses TensorRT and DetectNet, which may be helpful for your use case.

Thank you @linuxdev!