Argus: cuEGLStreamConsumerAcquireFrame takes too long to return

Hi,

I’m using a synchronized stereo pair and ran the Argus syncSensor sample, but the performance is low. The cameras are capable of 60 fps at 752x480.

I modified the sample and timed the calls to ScopedCudaEGLStreamFrameAcquire. I found that for one camera it takes about 60 ms to return from cuEGLStreamConsumerAcquireFrame, while for the other it takes only about 1 ms. Can you explain why this happens?

*** Loop ***: 66.28 ms
ScopedCudaEGLStreamFrameAcquire: 59.35 ms
ScopedCudaEGLStreamFrameAcquire: 1.09 ms
*** Loop ***: 65.08 ms
ScopedCudaEGLStreamFrameAcquire: 62.92 ms
ScopedCudaEGLStreamFrameAcquire: 1.21 ms
*** Loop ***: 69.17 ms
ScopedCudaEGLStreamFrameAcquire: 58.50 ms
ScopedCudaEGLStreamFrameAcquire: 1.36 ms
*** Loop ***: 66.15 ms
ScopedCudaEGLStreamFrameAcquire: 60.45 ms
ScopedCudaEGLStreamFrameAcquire: 0.15 ms
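
For reference, the loop times above can be converted into an effective frame rate. The snippet below is only a rough sketch: meanMs and fpsFromMs are hypothetical helper names, not part of the Argus samples. Fed the four loop times from the log, it gives a mean of roughly 66.7 ms, or about 15 fps, against a budget of about 16.7 ms per frame at 60 fps.

```cpp
#include <vector>

// Hypothetical helper: arithmetic mean of loop times given in milliseconds.
// Returns 0.0 for an empty list.
double meanMs(const std::vector<double>& samples)
{
    if (samples.empty())
        return 0.0;
    double sum = 0.0;
    for (double s : samples)
        sum += s;
    return sum / static_cast<double>(samples.size());
}

// Hypothetical helper: frames per second for a given per-frame time in ms.
// Returns 0.0 if the time is not positive.
double fpsFromMs(double ms)
{
    return ms > 0.0 ? 1000.0 / ms : 0.0;
}
```

Usage: fpsFromMs(meanMs({66.28, 65.08, 69.17, 66.15})) for the loop times logged above.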

Thank you

@nzkspdr
Could you check whether a non-CUDA sample has the same problem? That would help narrow down the root cause.

Hi ShaneCCC,

I added the timer to the Multisensor sample on both the PreviewConsumer and JPEGConsumer threads. I get the following results:

JPEGConsumerThread: 120.47 ms
PreviewConsumerThread: 1.90 ms
JPEGConsumerThread: 150.67 ms
PreviewConsumerThread: 0.76 ms
JPEGConsumerThread: 115.27 ms
PreviewConsumerThread: 0.46 ms

This happens even when I comment out JPEG->writeJPEG to rule out disk I/O time.

My ScopedTimer looks like this, in case you want to try it on your end:

#include <sys/time.h>
#include <cstdio>
#include <string>

// Prints the wall-clock time elapsed between construction and destruction.
class ScopedTimer
{
public:
    inline ScopedTimer(const std::string& name) :
        name(name)
    {
        gettimeofday(&begin, NULL);
    }

    inline ~ScopedTimer()
    {
        gettimeofday(&end, NULL);
        double elapsedTime;
        elapsedTime = (end.tv_sec - begin.tv_sec);
        elapsedTime += (end.tv_usec - begin.tv_usec) / 1000000.0; // us to s

        fprintf(stderr, "%s: %.2f ms\n", name.c_str(), elapsedTime*1000.0);
    }

private:
    timeval begin;
    timeval end;
    std::string name;
};
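
If wall-clock adjustments are a concern, the same idea can be built on std::chrono::steady_clock, which is monotonic. This is only a sketch under that assumption: the class name SteadyScopedTimer and the optional outMs pointer are hypothetical and not part of the Argus samples.

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical variant of ScopedTimer that uses a monotonic clock.
class SteadyScopedTimer
{
public:
    // outMs is optional; if non-null it receives the elapsed time on destruction.
    explicit SteadyScopedTimer(std::string name, double* outMs = nullptr)
        : m_name(std::move(name)),
          m_outMs(outMs),
          m_begin(std::chrono::steady_clock::now())
    {
    }

    ~SteadyScopedTimer()
    {
        const double ms = elapsedMs();
        if (m_outMs)
            *m_outMs = ms;
        std::fprintf(stderr, "%s: %.2f ms\n", m_name.c_str(), ms);
    }

    // Milliseconds elapsed since construction.
    double elapsedMs() const
    {
        const std::chrono::duration<double, std::milli> d =
            std::chrono::steady_clock::now() - m_begin;
        return d.count();
    }

private:
    std::string m_name;
    double* m_outMs;
    std::chrono::steady_clock::time_point m_begin;
};
```

Usage is the same as above: declare SteadyScopedTimer t("acquire"); at the top of the scope to be measured, and the time is printed when the scope exits.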

Thank you

@nzkspdr
So your concern is why ScopedCudaEGLStreamFrameAcquire needs more time than the others?
That is because the frame is sent to the GPU for post-processing. I believe the post-processing is the most time-consuming part.