Hello,
My team and I are working on a computer vision system that runs on a Jetson Xavier NX. Low latency from the moment an image is captured to the moment it is available in our code is very important for us.
The system uses a Raspberry Pi Camera v2 connected to the board over CSI. At first our code used OpenCV with GStreamer to launch the video pipeline, roughly as below:
pipeline = "nvarguscamerasrc sensor_mode=2 ! nvvidconv flip-method=0 ! video/x-raw, width=1920, height=1080 ! nvvidconv ! appsink";
VideoCapture cap;
cap.open(pipeline, CAP_GSTREAMER);
We conducted glass-to-glass latency tests to establish how long it takes for a frame to travel from the camera to our application, and the results were unsatisfying: about 90 ms per frame.
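For comparison with the libargus measurements below, here is a minimal sketch of how the per-frame read time can be measured on the OpenCV side; the pipeline string and loop count are only illustrative, not our production code:

#include <chrono>
#include <iostream>
#include <string>
#include <opencv2/opencv.hpp>

int main()
{
    // Same kind of nvarguscamerasrc pipeline as above; the exact caps are illustrative.
    std::string pipeline =
        "nvarguscamerasrc sensor_mode=2 ! nvvidconv flip-method=0 ! "
        "video/x-raw, width=1920, height=1080 ! nvvidconv ! appsink";

    cv::VideoCapture cap;
    if (!cap.open(pipeline, cv::CAP_GSTREAMER))
        return 1;

    cv::Mat frame;
    for (int i = 0; i < 30; ++i)
    {
        auto t1 = std::chrono::high_resolution_clock::now();
        if (!cap.read(frame))   // blocks until appsink hands over the next frame
            break;
        auto t2 = std::chrono::high_resolution_clock::now();
        double ms = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count() / 1000.0;
        std::cout << "cap.read() took " << ms << " ms" << std::endl;
    }
    return 0;
}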
Therefore we decided to check what we could gain by using pure libargus from the Jetson Multimedia API. As far as we understand, talking to the camera through libargus should let us benefit from NVIDIA hardware acceleration. We took the argus_oneshot sample that ships with the Jetson Multimedia API samples and modified it to measure (as we believe) the time this solution needs to get a frame from the camera, i.e. the data transfer. Below is the relevant piece of the modified source code:
for (size_t i = 0; i < 30; i ++) {
typedef std::chrono::high_resolution_clock Clock;
auto t1 = Clock::now();
uint32_t requestId = iSession->capture(request.get());
EXIT_IF_NULL(requestId, "Failed to submit capture request");
/*
* Acquire a frame generated by the capture request, get the image from the frame
* and create a .JPG file of the captured image
*/
Argus::UniqueObj<EGLStream::Frame> frame(
iFrameConsumer->acquireFrame(FIVE_SECONDS_IN_NANOSECONDS, &status));
auto t2 = Clock::now();
float currentDurration = float(std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count()) / 1000.0f;
std::cout << "Time for the frame to arrive: " << currentDurration << " ms" << std::endl;
EGLStream::IFrame *iFrame = Argus::interface_cast<EGLStream::IFrame>(frame);
EXIT_IF_NULL(iFrame, "Failed to get IFrame interface");
EGLStream::Image *image = iFrame->getImage();
EXIT_IF_NULL(image, "Failed to get Image from iFrame->getImage()");
t2 = Clock::now();
currentDurration = float(std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count()) / 1000.0f;
std::cout << "Time taken for one iteration: " << currentDurration << " ms" << std::endl;
EGLStream::IImageJPEG *iImageJPEG = Argus::interface_cast<EGLStream::IImageJPEG>(image);
EXIT_IF_NULL(iImageJPEG, "Failed to get ImageJPEG Interface");
status = iImageJPEG->writeJPEG(FILE_PREFIX "argus_oneShot.jpg");
EXIT_IF_NOT_OK(status, "Failed to write JPEG");
printf("Wrote file: " FILE_PREFIX "argus_oneShot.jpg\n");
}
We ran it in a loop to see whether the results improve after a while. We measured the time in two places:
- from capture() to acquiring a frame
- from capture() to getImage()
This let us see not only the overall time but also how long the getImage() call itself takes.
Below are the results:
Time for the frame to arrive: 237.22 ms
Time taken for one iteration: 237.303 ms
These numbers are even worse than what we achieved with the OpenCV solution. Here are our questions:
- Does using libargus with a CSI camera such as the Raspberry Pi Camera v2 let us benefit from NVIDIA hardware acceleration for video processing? What is the lowest latency achievable for the data transfer over a 2-lane camera connection?
- If the answer to the first question is yes, what are we doing wrong to get such slow times?
- Is the way we measured the data transfer with libargus correct? (One alternative we have in mind is sketched at the end of this post.)
- Can we benefit from hardware acceleration in OpenCV by launching a proper GStreamer pipeline? (The sketch right after this list shows the kind of pipeline we mean.)
- What does the getImage() method actually do? We could not find its source code.
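To make the OpenCV question more concrete, this is the sort of fuller pipeline we have in mind; the NVMM caps, the BGRx/BGR conversion stages and the appsink options are our assumptions about what keeps the early processing on the hardware path, not a recipe we have verified:

// Hypothetical variant keeping the early stages in NVMM (device) memory.
// Caps and element properties here are assumptions, not a confirmed low-latency setup.
std::string pipeline =
    "nvarguscamerasrc sensor_mode=2 ! "
    "video/x-raw(memory:NVMM), width=1920, height=1080, framerate=30/1, format=NV12 ! "
    "nvvidconv flip-method=0 ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! "
    "appsink drop=true max-buffers=1";
cv::VideoCapture cap(pipeline, cv::CAP_GSTREAMER);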
By the way, we are aware that moving to 4-lane communication could bring some improvement, but that is not possible for us right now.
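Finally, to make the question about our measurement method more concrete: one alternative we have been considering (but have not tried yet) is to compare the CPU clock at acquireFrame() time against the sensor timestamp carried in the capture metadata, instead of timing from capture(). A rough sketch, assuming the metadata interfaces behave the way we read them and that the sensor timestamp shares a time base with CLOCK_MONOTONIC (we have not verified either assumption):

// Inside the loop above, right after acquireFrame() returns a valid frame.
// IArgusCaptureMetadata / getSensorTimestamp() usage is based on our reading of the headers.
EGLStream::IArgusCaptureMetadata *iArgusMeta =
    Argus::interface_cast<EGLStream::IArgusCaptureMetadata>(frame);
if (iArgusMeta)
{
    Argus::ICaptureMetadata *iMeta =
        Argus::interface_cast<Argus::ICaptureMetadata>(iArgusMeta->getMetadata());
    if (iMeta)
    {
        // Sensor timestamp in nanoseconds (start of frame readout), per the header comments.
        uint64_t sensorTimeNs = iMeta->getSensorTimestamp();

        // Assumption: comparable with CLOCK_MONOTONIC "now"; we have not confirmed the time base.
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        uint64_t nowNs = uint64_t(now.tv_sec) * 1000000000ull + uint64_t(now.tv_nsec);

        printf("Sensor-to-acquire latency: %.3f ms\n", double(nowNs - sensorTimeNs) / 1.0e6);
    }
}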