I have a problem that I have been trying to solve for days now, with no success.
My setup is as follows:
- Cheap Logitech web camera
- Jetson TX1 on the Auvidea J120 interface board
The pipeline goes like this: the camera image is read with OpenCV 3, the image
is sent to the neural network, and the resulting detections are plotted on the
original image (it is an object detection problem).
I have timed the network inference and it is around 0.4 s. The problem is the
delay between the moment an image is captured and the moment the processed
result is displayed.
Here's a concrete example to make it clearer:
- Image is read from the camera at time t=0s.
- Image goes through the Neural Network. This takes 0.4s as mentioned above.
- So the whole step of vid.read(), result = predict(frame), show(result) takes around 0.4 s
So although it takes only 0.4 s to grab an image and run the prediction, the output image that I see on the screen is delayed by about 2 seconds.
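For reference, this is roughly how I measure those numbers (a simplified sketch of the loop, where network stands for my loaded detection model):

import time
import cv2

vid = cv2.VideoCapture(0)
while True:
    t0 = time.time()
    ret, frame = vid.read()            # grab a frame from the camera
    t1 = time.time()
    result = network.inference(frame)  # detection model, ~0.4 s on the TX1
    t2 = time.time()
    cv2.imshow("result", result)
    cv2.waitKey(1)
    print("read: %.3f s, inference: %.3f s" % (t1 - t0, t2 - t1))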
My first thought was that OpenCV was the bottleneck, so I wrote a script that
only reads from the camera and displays the image. There was NO delay
whatsoever, so the problem is not in OpenCV itself.
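The test script was essentially just this (a minimal sketch: capture and display, no inference):

import cv2

vid = cv2.VideoCapture(0)
while True:
    ret, frame = vid.read()            # capture only, no inference
    if not ret:
        break
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
vid.release()
cv2.destroyAllWindows()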
Another guess is that because the Jetson is overloaded with the network
computations (GPU utilization is at 99%), it cannot read images fast enough and
the delay builds up.
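If that is true, frames would be piling up in the capture buffer while the GPU is busy, and every read would return an old frame. One way to check this (just a sketch, I have not verified it on the TX1) would be a reader thread that always keeps only the newest frame:

import threading
import cv2

class LatestFrameReader:
    # Read frames in a background thread and keep only the most recent one,
    # so the main loop never sees stale, buffered frames.
    def __init__(self, src=0):
        self.cap = cv2.VideoCapture(src)
        self.lock = threading.Lock()
        self.frame = None
        self.running = True
        t = threading.Thread(target=self._update)
        t.daemon = True
        t.start()

    def _update(self):
        while self.running:
            ret, frame = self.cap.read()   # drain the camera as fast as possible
            if ret:
                with self.lock:
                    self.frame = frame

    def read(self):
        with self.lock:
            return self.frame

    def stop(self):
        self.running = False
        self.cap.release()

The main loop would then call reader.read() right before inference instead of vid.read(), so it always works on the most recent frame.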
CPU is not the bottleneck either, because CPU utilization doesn't go over 30% on any of the cores.
The final test was run on my MacBook 2013 (Intel i7 CPU, 16 GB RAM,
Nvidia GeForce 650M).
The interesting part is that there everything worked as expected (no extra time
difference between the captured and the displayed image).
The image reading and prediction code looks like the following:
import cv2

vid = cv2.VideoCapture(0)
if not vid.isOpened():
    raise IOError("Error opening camera feed!")

# Skip frames until reaching start_frame
if start_frame > 0:
    for _ in range(start_frame):
        vid.read()              # discard frames before start_frame

ret, image = vid.read()         # vid.read() returns (status, frame)

# Network processing
result = network.inference(image)
I hope you can help me out because I am all out of ideas :(