NvJPEGDecoder ultra slow (Xavier & Nano)

Howdy,

Just trying out NvJPEGDecoder with the intent to get h/w accelerated magic to work. The throughput is a bit on the low side, at around 66 MP/s, whereas Xavier’s specs call for 840 MP/s for JPEG decoding.

The image is 2448x2048 JPEG at 85% quality.

nvpmodel -m 0 and jetson_clocks were run beforehand.

What could be the culprit?

The code snippet:

// Read the whole JPEG file into memory.
std::ifstream fs("image.jpg", std::ios::in | std::ios::binary);
fs.seekg(0, std::ios::end);
const auto length = static_cast<unsigned long>(fs.tellg());
fs.seekg(0, std::ios::beg);
auto bytes = new unsigned char[length];
fs.read(reinterpret_cast<char*>(bytes), length);

NvBuffer* buffer = nullptr;
uint32_t pf = 0, w = 0, h = 0;
auto decoder = NvJPEGDecoder::createJPEGDecoder("nvjpegdec");
decoder->enableProfiling();
const auto startAt = std::chrono::steady_clock::now();
const auto count = 1000;
// Decode the same image repeatedly; each call hands back a new NvBuffer.
for (auto i = 0; i < count; i++) {
	decoder->decodeToBuffer(&buffer, bytes, length, &pf, &w, &h);
	delete buffer;
}
const auto endAt = std::chrono::steady_clock::now();
const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(endAt - startAt);
// Note: this divides by 1024*1024, not 1e6, so "MP" here means 1048576 pixels.
const auto mpx = (double)w * h * count / 1024.0 / 1024.0;
const auto s = elapsed.count() / 1000.0 / 1000.0;
const auto rate = mpx / s;
std::cout << "Decoded " << w << "x" << h << "x" << count << "=" << mpx
	<< " MP in " << s << " sec.; " << rate << " MP/s" << std::endl;
decoder->printProfilingStats(std::cout);
delete decoder;
delete[] bytes;

Xavier output:

Decoded 2448x2048x1000=4781.25 MP in 71.4654 sec.; 66.903 MP/s
----------- Element = nvjpegdec -----------
Total units processed = 1000
Average latency(usec) = 70848
Minimum latency(usec) = 45165
Maximum latency(usec) = 90700

Nano output (note that it is faster, but still falls short of the advertised 600 MP/s throughput):

Decoded 2448x2048x1000=4781.25 MP in 43.128 sec.; 110.862 MP/s
----------- Element = nvjpegdec -----------
Total units processed = 1000
Average latency(usec) = 42002
Minimum latency(usec) = 36461
Maximum latency(usec) = 1496280

Thanks.

Hi,
Please try decodeToFd() instead of decodeToBuffer(). It returns a DMA buffer fd.