Decrease lag while capturing video from webcam with opencv

Hi, I am a total newbie to the NVIDIA Jetson TK1. I am trying to do stereo vision on the Jetson, but when I just try to get video from 2 webcams, without any processing for stereo vision, I get a lot of lag. Is there any way to reduce the lag?

Lag can be from many places. To start, have you flashed yet to R21.4? What is the exact camera setup, including type of cable and a URL to the camera specs? Last, what command are you using to process the camera(s)?

I have R21.4, I am using these webcams, and I am using this code to get frames:

[code]VideoCapture cap(0); // open the default camera
if (!cap.isOpened())     // check if we succeeded
    return -1;

Mat frame;
for (;;) {
    cap >> frame; // get a new frame from the camera
    imshow("frame", frame);
    if (waitKey(1) >= 0)
        break;
}[/code]

The web page does not give any details on that camera, but it looks like it is USB2, so USB3 would not improve anything. The first thing I’d try is just setting USB to make sure it doesn’t sleep, along with using performance mode (at least while testing). See:

Next I’d try to find the process which is running your code, and renice it to priority -1 (default is 0, -1 has higher priority).

If this does not help you can describe performance changes after those new settings; this would help narrow down any performance issues still remaining. One particular issue which can’t be answered without knowing more is what program is using that code and how it is linked to use GPU or not…but with the other simpler changes to test this may not matter.

I already have performance mode enabled, along with the no-sleep setting. Apart from that, I have a USB hub connected to the Jetson with the 2 cameras connected to it, and I have USB 3.0 enabled. I am not using the GPU because I couldn't find anything in OpenCV to capture video with the GPU instead of the CPU. Is it possible to do that? And before I set the Jetson to performance mode it would take over 2 seconds to capture an image from each camera.

If the camera is USB2 then USB3 will have no effect. Having two cameras on the same USB connector (including two cameras on a HUB going to the same connector) might cause lag, but it’s hard to say without camera specs…even then a camera with only 0.3 megapixels won’t consume much bandwidth, so a pair of them wouldn’t be a noticeable performance hit on average (although latency would increase).
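On the software side, one thing worth trying with two cameras is to grab() both frames before retrieving either one, so the two exposures stay close in time and the decode/copy work doesn't serialize the captures. A minimal sketch, assuming OpenCV 2.4-era APIs (as shipped for L4T R21.x) and that the cameras enumerate as devices 0 and 1 (those indices are an assumption):

```cpp
// Sketch: capture a stereo pair with minimal time skew between the cameras.
// Device indices 0 and 1 are assumptions; adjust for your setup.
#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture capL(0), capR(1);
    if (!capL.isOpened() || !capR.isOpened())
        return -1;

    // 640x480 keeps two USB2 cameras within bus bandwidth
    capL.set(CV_CAP_PROP_FRAME_WIDTH, 640);
    capL.set(CV_CAP_PROP_FRAME_HEIGHT, 480);
    capR.set(CV_CAP_PROP_FRAME_WIDTH, 640);
    capR.set(CV_CAP_PROP_FRAME_HEIGHT, 480);

    cv::Mat left, right;
    for (;;) {
        // grab() both frames first so the exposures are close in time,
        // then do the slower decode/copy in retrieve()
        capL.grab();
        capR.grab();
        capL.retrieve(left);
        capR.retrieve(right);

        cv::imshow("left", left);
        cv::imshow("right", right);
        if (cv::waitKey(1) >= 0)
            break;
    }
    return 0;
}
```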

Have you tried changing the process priority? This was the “renice” command I mentioned (renice to -1 priority will result in lower latency if there is user space competition for resources…everything in user space normally has priority 0). What program is that code in?

For GPU/CUDA use, pipelines through gstreamer can be used. If there is lag before data reaches the application processing the data then faster GPU data processing won’t matter much…more information is needed on the complete pipeline of cameras and programs working on the cameras. From cables, hubs, connectors, on through any programs used in the camera process, more details would help.
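As a sketch of the gstreamer route: OpenCV's VideoCapture can take a pipeline string instead of a device index, but only if your OpenCV build was compiled with GStreamer support. The pipeline below is an assumption for a plain USB webcam, and the exact element/caps names differ between GStreamer 0.10 and 1.0 (1.0 syntax shown):

```cpp
// Sketch: feeding OpenCV from a gstreamer pipeline instead of the default
// V4L2 backend. Requires OpenCV built with GStreamer support; the pipeline
// string (device path, caps) is an assumption for a generic USB webcam.
#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap(
        "v4l2src device=/dev/video0 ! "
        "video/x-raw,width=640,height=480 ! "
        "videoconvert ! appsink");
    if (!cap.isOpened())
        return -1;

    cv::Mat frame;
    for (;;) {
        cap >> frame; // frames are pulled from the appsink element
        cv::imshow("gst", frame);
        if (cv::waitKey(1) >= 0)
            break;
    }
    return 0;
}
```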

I am capturing images at 640x480 resolution. My code is in C++; all it does is capture an image and display it in a loop. I wouldn't mind buying a webcam with H.264 encoding and using gstreamer to decode it if it's faster that way. And I am sorry, but how do I use renice? I have never used it before.

Ok, I got it. I set the nice value to -1. It was taking about 60% of the CPU and it looks a lot better, but there is still about half a second of lag. Is there any way I can get rid of that too?

In the *nix world (including linux and L4T variants) the priority of a process is known as the “nice level”. Level 0 is where all user apps are set by default, -20 is the highest priority and should only be used for certain very critical processes…using this could cause problems if for example you run into something like priority inversion. A level of +20 makes a very low priority, and might be used on something like an index creation for a hard disk in the middle of the night…it’d be something which nothing depends on, and which you wouldn’t care if it takes all night. Because so much runs at a nice level of 0, changing it to -1 can have a dramatic effect without causing any kind of system issue…useful for latency when other “ordinary” processes are delaying yours from starting.

You can find your process ID via many ways, e.g., command line “ps” (man ps), top, or htop (htop is my favorite). Sometimes the process viewing commands offer a means to renice as well, e.g., htop. Without anything special, the command line will literally have a command “renice” (man renice). Or you can start a command directly with the “nice” command (man nice). A nice level of -1 should be quite significant if you are getting latency via competition with other user space threads.

Other latency might just mean you need to profile your viewer software, e.g., gprof (used with the right compiler flags, a file can be created showing time in each function…“man gprof”). Without this you won’t really know if the issue is the system providing data with latency, or if it is an issue of the program efficiency (e.g., constant open/close instead of continuous open of a data source). The nice thing (pun intended) is that the other tests above on priority require very little effort to explore.

You may wish to check out CUDA 6.5 for 32-bit Jetson, or simply putting as much as you can through gstreamer (which NVIDIA supports for some multimedia acceleration through CUDA on Jetson/L4T).