segfault when using libnvjpeg to decode

I’m trying to use libnvjpeg directly to decode a JPEG image.

I don’t want to use gstreamer because neither jpegdec nor nvjpegdec is able to give me the decoded image in the original pixel format.

I have a JPEG image that is 1920*1080, encoded as YUV422 planar.
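
For reference, the planar 4:2:2 buffer layout this implies can be sketched as follows. This is only an illustration: PlaneLayout and yuv422_planar_layout are hypothetical names, and the sketch assumes tightly packed planes with no row padding.

```cpp
#include <cstddef>

// Hypothetical helper type: byte strides and offsets of the Y, U, V planes.
struct PlaneLayout {
    std::size_t stride[3];  // bytes per row of Y, U, V
    std::size_t offset[3];  // start of each plane inside one contiguous buffer
    std::size_t total;      // total buffer size in bytes
};

// YUV 4:2:2 planar: chroma is halved horizontally and keeps full height.
// Assumes tightly packed planes (no alignment padding).
PlaneLayout yuv422_planar_layout(std::size_t width, std::size_t height) {
    PlaneLayout p{};
    const std::size_t chroma_width = (width + 1) / 2;
    p.stride[0] = width;
    p.stride[1] = chroma_width;
    p.stride[2] = chroma_width;
    p.offset[0] = 0;
    p.offset[1] = width * height;
    p.offset[2] = p.offset[1] + chroma_width * height;
    p.total = p.offset[2] + chroma_width * height;
    return p;
}
```

For a 1920*1080 image this gives strides of 1920, 960 and 960 bytes and a total of 1920 * 1080 * 2 bytes.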

So I wrote a test program that decodes the image with jpeg_read_raw_data() calls. It works fine when I compile and link it against the regular libjpeg, but it segfaults when I compile and link it against libnvjpeg.

Here is the test image:

And here is the source code:

/*
 * To compile:
 * 1. Download the gstjpeg source tar ball:
 *    $ cd && wget
 * 2. Untar it:
 *    $ tar -xjf gstjpeg_src.tbz2
 * 3. Compile the test program:
 *    $ g++-4.8 -std=c++11 -o tegrajpeg_raw_accelerated tegrajpeg_raw.cpp -I/home/ubuntu/gstjpeg_src/nv_headers -L/usr/lib/arm-linux-gnueabihf/tegra/ -lnvjpeg -O3
 *    or
 *    $ g++-4.8 -std=c++11 -o tegrajpeg_raw tegrajpeg_raw.cpp -ljpeg -O3
 */

/* Standard library */
#include <chrono>
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

/* libjpeg */
#include "jpeglib.h"

using namespace std;
using namespace ::std::chrono;

vector<char> read_jpg_file() {
    string filepath{"/home/ubuntu/yuv422_planar.jpg"};
    vector<char> jpg_buffer;

    ifstream jpg_file(filepath, ios::binary | ios::ate);

    if (jpg_file.is_open()) {

        jpg_file.seekg(0, ios::beg);

        jpg_buffer.assign(istreambuf_iterator<char>(jpg_file), istreambuf_iterator<char>());
    } else {
        cerr << "Error reading yuv422_planar.jpg!" << endl;
    }

    return jpg_buffer;
}

int main(int argc, char** argv) {
    // read a jpeg file
    vector<char> jpg_buffer = read_jpg_file();  //<char> because ifstream works with char, not unsigned char. :(
    unsigned char* output_buffer = new unsigned char[1920 * 1080 * 2];

    int i, j;

    unsigned char**lines[3];
    unsigned char*y[4 * DCTSIZE] = {NULL, };
    unsigned char*u[4 * DCTSIZE] = {NULL, };
    unsigned char*v[4 * DCTSIZE] = {NULL, };
    int v_samp[3];
    unsigned char *base[3], *last[3];

    // How many bytes per row of Y, U, and V data.
    const int stride[3] = {1920, 960, 960}; 

    const unsigned int height{1080};

    lines[0] = y;
    lines[1] = u;
    lines[2] = v;

    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;

    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);  // must be called before any other decompress call

    jpeg_mem_src(&cinfo, reinterpret_cast<unsigned char*>(jpg_buffer.data()), jpg_buffer.size());

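    jpeg_read_header(&cinfo, TRUE);
    cinfo.raw_data_out = TRUE;       // request raw (still subsampled) YUV output
    jpeg_start_decompress(&cinfo);   // required before jpeg_read_raw_data()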


    v_samp[0] = cinfo.comp_info[0].v_samp_factor;
    v_samp[1] = cinfo.comp_info[1].v_samp_factor;
    v_samp[2] = cinfo.comp_info[2].v_samp_factor;

    // Starting positions of the Y, U, and V components in the output_buffer
    base[0] = output_buffer;            // Y
    base[1] = base[0] + 1920 * 1080;    // U
    base[2] = base[1] + 960 * 1080;     // V
    for (i = 0; i < height; i += v_samp[0] * DCTSIZE) {
        // Point the row tables at the next block of scanlines in each plane.
        for (j = 0; j < (v_samp[0] * DCTSIZE); j++) {
            lines[0][j] = base[0] + (i + j) * stride[0];
            lines[1][j] = base[1] + (i + j) * stride[1];
            lines[2][j] = base[2] + (i + j) * stride[2];
        }

        // Decode v_samp[0] * DCTSIZE scanlines per call.
        jpeg_read_raw_data(&cinfo, lines, v_samp[0] * DCTSIZE);
    }


    ofstream out_yuv{"/home/ubuntu/out.yuv", ::std::ios::out | ::std::ios::binary};
    out_yuv.write(reinterpret_cast<const char*>(output_buffer), 1920 * 1080 * 2);

    // Release the decoder state once all scanlines have been read.
    jpeg_finish_decompress(&cinfo);
    jpeg_destroy_decompress(&cinfo);


    delete[] output_buffer;

    return 0;
}

Could somebody take a look and tell me why the libnvjpeg version would segfault? I looked at the gstreamer plug-in source code and I don’t see anything special there, so I don’t understand why it segfaults when I try to use libnvjpeg in my own code.

This is not a use case we support on TK1. Other users, please share your experience.

In addition, I know that the gstreamer nvjpegdec (and hence libnvjpeg) can decode this image. I have tried using gst-launch-1.0 with nvjpegdec to decode the image and it works, although it returns the image data downsampled to I420 instead of Y42B (YUV422 planar).

I got my code working.

It looks like the implementation of jpeg_mem_src() is a bit broken. It creates a jpeg_source_mgr that is invalid for use with jpeg_read_raw_data(), which causes a segfault when jpeg_read_raw_data() tries to copy the JPEG data to be decoded.

So I had to manually create my own jpeg_source_mgr and implement my own init_source(), fill_input_buffer(), skip_input_data(), resync_to_restart(), and term_source() functions.

Unfortunately, the performance is still not great. It still takes 30-35 ms to decode a 1920*1080 image, whereas I know that the gstreamer plugin can decode much faster than that by leveraging NVMM.
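
In case it helps anyone reproduce the measurement, a per-decode time like this can be taken with std::chrono along the lines of the sketch below. average_ms is a hypothetical helper, and the callable passed in is assumed to wrap one complete decode of the image.

```cpp
#include <chrono>
#include <functional>

// Runs `work` `iterations` times and returns the mean wall-clock time per
// run in milliseconds. steady_clock is used because it never jumps backwards.
double average_ms(const std::function<void()>& work, int iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}
```

Averaging over many iterations smooths out one-off costs such as the first file read or cache warm-up.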