NvJpeg Encoder in CudaSamples failed with a simple picture

Hi all,
I recently found an image that would cause nvjpegEncodeImage to fail with NVJPEG_STATUS_EXECUTION_FAILED in my code. I tried to isolate the issue to the official example and found that it was still reproducible. This image has a width of 853 and a height of 360 and is cropped from a medical image.
That is interesting.
CUDA version: 12.0.0
Sample: cuda-samples/Samples/4_CUDA_Libraries/nvJPEG_encoder at v12.0 · NVIDIA/cuda-samples · GitHub
The picture that fails:
pic.zip (15.3 KB)
Another picture (852*360) that works:
aWorkedPic.zip (13.5 KB)
I have also uploaded their raw data in [[r,g,b][r,g,b]…] style:
1440pCTMip_subRaw.zip (491.3 KB)
The two images differ by only one column.
I am not sure whether there is a problem with the image or whether this is a real bug. Please feel free to comment :D

Has anyone tried or experienced this? I’m afraid I’ve made some fundamental mistakes T_T

I solved the problem: the pitch of nvjpegImage_t needs to be aligned. This does not seem to be noted in the documentation.

The code in the sample:

            nvjpegImage_t imgdesc = 
            {
                {
                    pBuffer,
                    pBuffer + widths[0]*heights[0],
                    pBuffer + widths[0]*heights[0]*2,
                    pBuffer + widths[0]*heights[0]*3
                },
                {
                    (unsigned int)(is_interleaved(output_format) ? widths[0] * 3 : widths[0]),
                    (unsigned int)widths[0],
                    (unsigned int)widths[0],
                    (unsigned int)widths[0]
                }
            };

The fixed code:

            nvjpegImage_t imgdesc = 
            {
                {
                    pBuffer,
                    pBuffer + (widths[0] + widths[0]%4)*(heights[0] + heights[0]%4),
                    pBuffer + (widths[0] + widths[0]%4)*(heights[0] + heights[0]%4)*2,
                    pBuffer + (widths[0] + widths[0]%4)*(heights[0] + heights[0]%4)*3
                },
                {
                    (unsigned int)(is_interleaved(output_format)? (widths[0] + widths[0]%4)*3:(widths[0] + widths[0]%4)),
                    (unsigned int)(widths[0] + widths[0]%4),
                    (unsigned int)(widths[0] + widths[0]%4),
                    (unsigned int)(widths[0] + widths[0]%4),
                }
            };

I have shown only the most important part of the code; it of course requires a larger input buffer. This may still be common knowledge that I am not aware of, and I’ll keep looking until I find someone who knows.
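
For reference, the padding arithmetic can be sketched on the host side as below. This is only an illustration under assumptions: `align_up`, `PlanarLayout` and `make_planar_layout` are hypothetical names, not part of nvJPEG or the sample, and the alignment of 4 is a guess, since nothing in this thread establishes what the 12.0 encoder actually requires. Note that `w + w % 4` is not a general round-up to a multiple of 4: for a width of 853 it yields 854, whereas rounding up to a multiple of 4 yields 856.

```cpp
#include <cassert>
#include <cstddef>

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
inline std::size_t align_up(std::size_t value, std::size_t alignment) {
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical helper type: byte layout of a planar image with padded rows.
struct PlanarLayout {
    std::size_t pitch;        // bytes per row, including padding
    std::size_t plane_bytes;  // pitch * height
    std::size_t total_bytes;  // plane_bytes * channels
};

// Compute the layout for a planar 8-bit image. Only the row pitch is padded;
// the height is left unchanged (an assumption of this sketch).
inline PlanarLayout make_planar_layout(std::size_t width, std::size_t height,
                                       std::size_t channels,
                                       std::size_t alignment) {
    PlanarLayout layout;
    layout.pitch = align_up(width, alignment);
    layout.plane_bytes = layout.pitch * height;
    layout.total_bytes = layout.plane_bytes * channels;
    return layout;
}
```

With these helpers, plane `c` would start at `pBuffer + c * layout.plane_bytes` and every planar pitch entry would be `layout.pitch`. For interleaved data the same rounding would be applied to `width * 3` instead of `width`.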

This is not the best solution, and I still don’t know why it works. I have also not tried any other operations that involve the pitch of nvjpegImage_t.

I have experienced problems similar to the one you described with 12.0, where it seems that nvJPEG reads data outside the image (but inside the pitch region). However, these issues seem to be resolved from 12.1 onward. Maybe you can check whether updating CUDA/nvJPEG resolves your problem?

Another problem I encountered (but on 12.2) is also related to this, see Encode to chroma-subsampled JPEG fails with RGB data. Whether they are the same depends on which command-line options you used when running the example from the CUDA samples. If you can share them, we could figure out whether it’s the same problem.

Thanks for the reply, and sorry for replying so late. (So many bugs to deal with T_T.)
Because of how this project is configured, I have no plans to change the CUDA version; it is not for me to decide.

The chroma subsampling with which I found the bug is 422, and I reproduced it in the samples with 420. 444 passes. Thank you for pointing that out. They look like the same problem, but it needs more testing to pin down.

I will try more parameters once my other bugs are cleared. OTZ

If you can only use 12.0, the errors that I encountered on that version were solved by rounding up the pitch to a multiple of 4 or 8 (like you did in your code example). However, it was also necessary to set the memory outside the image (but inside the pitch) to zero to get consistent results. Even then, a horizontal green and purple line could still appear at the bottom of the image after encoding, for which I found no fix (other than updating to 12.1).
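
A minimal host-side sketch of the zero-padding step described above, assuming tightly packed 8-bit source rows. `pad_rows_zeroed` is a hypothetical helper name, not an nvJPEG or CUDA function, and the sketch only prepares the buffer; it does not claim to be what nvJPEG itself expects beyond what is reported in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy a tightly packed plane (`row_bytes` bytes per row) into a new buffer
// that uses `pitch` bytes per row. The padding bytes at the end of each row
// stay zero because the vector is zero-initialised before the copy.
inline std::vector<unsigned char> pad_rows_zeroed(const unsigned char* src,
                                                  std::size_t row_bytes,
                                                  std::size_t height,
                                                  std::size_t pitch) {
    assert(src != nullptr);
    assert(pitch >= row_bytes);
    std::vector<unsigned char> dst(pitch * height, 0);
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(dst.data() + y * pitch, src + y * row_bytes, row_bytes);
    }
    return dst;
}
```

The padded buffer could then be uploaded to the device in one piece, for example with `cudaMemcpy`, and the same `pitch` value passed in the `nvjpegImage_t` pitch entries.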