NVJPEG issues and inconsistencies with transcoding

I have followed the NVJPEG transcoder example from the NVIDIA documentation, and I am observing some strange results with CUDA 11.6.

Specifically, the function nvjpegEncoderParamsCopyHuffmanTables doesn’t seem to actually copy the Huffman tables into the encoded image. I have verified with the JPEGsnoop tool that the Huffman tables in the encoded image are not only reordered (DC->AC->DC->AC instead of DC->DC, AC->AC) but also different from those in the source image.
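
For reference, this is roughly the table-copy sequence I am referring to, following the transcoder sample. It is only a sketch: all handles are assumed to be created earlier exactly as in the sample, and the argument order is reproduced from memory of the CUDA 11.x nvjpeg.h, so double-check it against your header.

#include <nvjpeg.h>
#include <cuda_runtime.h>

// Sketch of the table-copy step from the transcoder sample. All handles are
// assumed to have been created earlier (nvjpegCreateSimple,
// nvjpegEncoderStateCreate, nvjpegEncoderParamsCreate, nvjpegJpegStreamCreate).
void copy_tables_from_source(nvjpegHandle_t handle,
                             nvjpegEncoderState_t enc_state,
                             nvjpegEncoderParams_t enc_params,
                             nvjpegJpegStream_t jpeg_stream,
                             const unsigned char* src, size_t src_size,
                             cudaStream_t stream)
{
    // Parse the source bitstream so its DQT/DHT segments are available.
    nvjpegJpegStreamParse(handle, src, src_size,
                          0 /* save_metadata */, 0 /* save_stream */, jpeg_stream);

    // Copy the quantization tables from the parsed stream into the encoder params.
    nvjpegEncoderParamsCopyQuantizationTables(enc_params, jpeg_stream, stream);

    // Copy the Huffman tables as well -- on CUDA 11.6 this is the call that does
    // not produce the expected tables in the encoded output.
    nvjpegEncoderParamsCopyHuffmanTables(enc_state, enc_params, jpeg_stream, stream);
}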

The source image I used for testing was created in Photoshop (dimensions 1155 x 1140), and the same image was saved as Baseline Standard, Baseline Optimized, and Progressive with 3, 4, and 5 scans, all of them at Quality 12.

Where Photoshop produces a 107.50 KB baseline standard image and a 31.61 KB baseline optimized image, NVJPEG transcoding of those images with copied quantization and Huffman tables results in a 62.77 KB output image.

Where Photoshop produces a 24.38 KB progressive JPEG with 3 scans, NVJPEG transcoding of that image with copied quantization and Huffman tables results in a 131.68 KB output image.

Furthermore, I have encountered one image which, when transcoded this way, has corruption in the output. If I re-save that image with Photoshop and then transcode it, the result is fine, just like with all the other images I tested.

Moreover, transcoding CMYK JPEG images is not possible using the NVIDIA transcoder sample code, because the NVJPEG library incorrectly returns NVJPEG_CSS_UNKNOWN subsampling when parsing a CMYK JPEG image. Here we get to an NVJPEG API design issue: the chroma subsampling enum has no values for 4-component streams such as YCCK. I am not sure which subsampling factors are supported for CMYK JPEG (YCCK), but at least 4:4:4:4 should be supported (which would mean no subsampling on any component).

If you call nvjpegEncoderParamsSetSamplingFactors() with the NVJPEG_CSS_UNKNOWN value you get from nvjpegGetImageInfo() or nvjpegJpegStreamGetChromaSubsampling(), it will of course error out, and I am not sure that even copying the quantization and Huffman tables would work in that case.
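
To illustrate the dead end: below is a sketch of the guard the sample would need before configuring the encoder. It assumes the CUDA 11.x headers and the usual handle setup; the helper name set_sampling_from_source is hypothetical, not part of the sample.

#include <nvjpeg.h>
#include <cuda_runtime.h>

// Returns true if the source subsampling can be forwarded to the encoder.
// A CMYK/YCCK JPEG currently comes back as NVJPEG_CSS_UNKNOWN, which
// nvjpegEncoderParamsSetSamplingFactors() rejects, so we have to bail out here.
bool set_sampling_from_source(nvjpegHandle_t handle,
                              nvjpegEncoderParams_t enc_params,
                              const unsigned char* src, size_t src_size,
                              cudaStream_t stream)
{
    int n_components = 0;
    nvjpegChromaSubsampling_t subsampling;
    int widths[NVJPEG_MAX_COMPONENT];
    int heights[NVJPEG_MAX_COMPONENT];

    if (nvjpegGetImageInfo(handle, src, src_size, &n_components,
                           &subsampling, widths, heights) != NVJPEG_STATUS_SUCCESS)
        return false;

    if (subsampling == NVJPEG_CSS_UNKNOWN)
        return false; // 4-component (YCCK) input -- no matching enum value exists

    return nvjpegEncoderParamsSetSamplingFactors(enc_params, subsampling, stream)
           == NVJPEG_STATUS_SUCCESS;
}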

Finally, NVJPEG still does not decode CMYK JPEGs correctly (the colors are completely wrong), even though I reported it here a long time ago.

If someone from the NVIDIA team responds, I can send them the CMYK image and the image that gets corrupted on transcoding, along with a Visual Studio project, in a private message.

It’s probably best to file a bug. The bug handling team will likely want a full, complete example code that demonstrates what you are doing. They will also be able to provide you with a method to send file attachments and whatnot if needed.

@Robert_Crovella Ok, I’ll do that.

@Robert_Crovella I have submitted a bug report with a reproducible test case as you suggested. The behavior I observed has been confirmed. The bug ID is 3632945 in case anyone else hits any of the issues I mentioned here.

Any updates?

They said it should be fixed in the next major CUDA release. I guess that will be when the 4000 series is launched.

Is it fixed?

That depends on which part of my complaint you mean.

For example, nvjpegEncoderParamsCopyHuffmanTables has been “fixed” by being deprecated, because it really doesn’t make sense. If you want optimized Huffman tables, there is a way to request them from the encoder.
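
For anyone landing here later, the replacement is a single call on the encoder parameters; a minimal sketch, assuming enc_params and stream were created as usual:

// Instead of copying Huffman tables from the source image, ask the encoder to
// compute optimized tables itself; a non-zero second argument enables it.
nvjpegEncoderParamsSetOptimizedHuffman(enc_params, 1, stream);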

The progressive decoding corruption issue was fixed in 12.0 if I am not mistaken (check the release notes to be sure).

CMYK to RGB color conversion will hopefully become public in a 12.x release.

Can you help me understand why this JPEG is not hardware decoded in ffmpeg? #10386 (CUVID decoder fails to process specific JPGs) – FFmpeg

@val.zapod.vz Your err.jpg image from that ffmpeg ticket is decoded just fine by my own image viewer, which uses CUDA hardware-accelerated JPEG decoding. Note that I compiled it with CUDA Toolkit 12.1, which contains some fixes for progressive image decoding.
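
In case it helps with comparing against the ffmpeg/NVDEC path discussed below: the viewer essentially does a plain single-image NVJPEG decode. This is only a rough sketch of that path (error handling stripped, default backend assumed), not the viewer’s actual code:

#include <nvjpeg.h>
#include <cuda_runtime.h>
#include <vector>

// Minimal single-image NVJPEG decode to interleaved RGB.
bool decode_rgb(const unsigned char* jpeg, size_t size,
                std::vector<unsigned char>& rgb, int& width, int& height)
{
    nvjpegHandle_t handle;
    nvjpegJpegState_t state;
    nvjpegCreateSimple(&handle);
    nvjpegJpegStateCreate(handle, &state);

    int n_comp = 0;
    nvjpegChromaSubsampling_t css;
    int widths[NVJPEG_MAX_COMPONENT], heights[NVJPEG_MAX_COMPONENT];
    nvjpegGetImageInfo(handle, jpeg, size, &n_comp, &css, widths, heights);
    width = widths[0];
    height = heights[0];

    // Interleaved RGB output: one device buffer, pitch = 3 * width.
    nvjpegImage_t out = {};
    cudaMalloc(reinterpret_cast<void**>(&out.channel[0]), size_t(3) * width * height);
    out.pitch[0] = 3 * width;

    nvjpegStatus_t st = nvjpegDecode(handle, state, jpeg, size,
                                     NVJPEG_OUTPUT_RGBI, &out, nullptr);

    rgb.resize(size_t(3) * width * height);
    cudaMemcpy(rgb.data(), out.channel[0], rgb.size(), cudaMemcpyDeviceToHost);

    cudaFree(out.channel[0]);
    nvjpegJpegStateDestroy(state);
    nvjpegDestroy(handle);
    return st == NVJPEG_STATUS_SUCCESS;
}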

Progressive image decoding! Yep, THAT is the thing. Thanks.

@umermian057 Thanks for the effort, but you are a year late with your suggestions. If you had bothered to read my other posts in this thread, you could have saved yourself the trouble of typing the response.

@val.zapod.vz Here’s what happens with progressive JPEG:

[mjpeg @ 000002489b800080] NVDEC capabilities:
[mjpeg @ 000002489b800080] format supported: yes, max_mb_count: 67108864
[mjpeg @ 000002489b800080] min_width: 64, max_width: 32768
[mjpeg @ 000002489b800080] min_height: 64, max_height: 16384
[mjpeg @ 000002489b800080] marker parser used 17 bytes (136 bits)
[mjpeg @ 000002489b800080] marker=c4 avail_size_in_buf=405467
[mjpeg @ 000002489b800080] class=0 index=0 nb_codes=9
[mjpeg @ 000002489b800080] class=0 index=1 nb_codes=7
[mjpeg @ 000002489b800080] class=1 index=0 nb_codes=35
[mjpeg @ 000002489b800080] class=1 index=1 nb_codes=73
[mjpeg @ 000002489b800080] class=1 index=2 nb_codes=35
[mjpeg @ 000002489b800080] marker parser used 246 bytes (1968 bits)
[mjpeg @ 000002489b800080] escaping removed 345352 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=405219
[mjpeg @ 000002489b800080] component: 1
[mjpeg @ 000002489b800080] component: 2
[mjpeg @ 000002489b800080] component: 3
[mjpeg @ 000002489b800080] marker parser used 12 bytes (96 bits)
[mjpeg @ 000002489b800080] escaping removed 322521 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=345328
[mjpeg @ 000002489b800080] component: 2
[mjpeg @ 000002489b800080] marker parser used 8 bytes (64 bits)
[mjpeg @ 000002489b800080] escaping removed 292408 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=322411
[mjpeg @ 000002489b800080] component: 3
[mjpeg @ 000002489b800080] marker parser used 8 bytes (64 bits)
[mjpeg @ 000002489b800080] escaping removed 185675 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=292321
[mjpeg @ 000002489b800080] component: 1
[mjpeg @ 000002489b800080] marker parser used 8 bytes (64 bits)
[mjpeg @ 000002489b800080] escaping removed 183648 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=185350
[mjpeg @ 000002489b800080] component: 2
[mjpeg @ 000002489b800080] marker parser used 8 bytes (64 bits)
[mjpeg @ 000002489b800080] escaping removed 181078 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=183644
[mjpeg @ 000002489b800080] component: 3
[mjpeg @ 000002489b800080] marker parser used 8 bytes (64 bits)
[mjpeg @ 000002489b800080] escaping removed 763 bytes
[mjpeg @ 000002489b800080] marker=da avail_size_in_buf=181059
[mjpeg @ 000002489b800080] component: 1
[mjpeg @ 000002489b800080] marker parser used 8 bytes (64 bits)
[mjpeg @ 000002489b800080] marker=d9 avail_size_in_buf=0
[mjpeg @ 000002489b800080] decoder->cvdl->cuvidDecodePicture(decoder->decoder, &ctx->pic_params) failed -> CUDA_ERROR_INVALID_IMAGE: device kernel image is invalid

By looking at those “escaping removed X bytes” messages in the debug log above, it’s pretty clear that the decoder is skipping over the image components because it doesn’t understand progressive scan. It is then left with marker=d9 avail_size_in_buf=0, which means it reached the EOI (End Of Image) marker without actually decoding anything.

Note, however, that ffmpeg here for some reason assumes the JPEG input file is a video in MJPEG format and treats it accordingly: it tries to decode it with the CUVID library (NVDEC in particular). It does not use the NVJPEG library, which is what this thread was about.

I would say there are three distinct issues here:

  1. ffmpeg (IMO incorrectly) deciding that a single file with a .jpg extension is a video in MJPEG format and passing it to the mjpeg decoder.

  2. NVDEC failing to decode progressive JPEG (which is understandable, because as far as I know progressive scan is not used for MJPEG encoding).

  3. ffmpeg (particularly the image2 muxer) printing “Use a pattern such as %03d for an image sequence or use the -update option (with -frames:v 1 if needed) to write a single image” when you passed a single image, not a video, as input.

So if you ask me, I wouldn’t expect NVIDIA to fix this, because it’s not really a bug. On the other hand, ffmpeg should probably fix it by properly detecting a single .jpg image and either decoding it with NVJPEG (which works with progressive scan) or falling back to software decoding.
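
For completeness, detecting progressive scan up front is cheap: it only requires walking the JPEG markers until the start-of-frame marker (SOF0 = baseline, SOF2 = progressive). The sketch below is a hypothetical helper to illustrate the idea, not code from ffmpeg or NVJPEG:

#include <cstddef>

// Walks the JPEG marker segments after SOI until a start-of-frame marker is
// found and reports whether the image is progressive. Fill bytes and other
// rare corner cases are ignored for brevity.
bool is_progressive_jpeg(const unsigned char* data, size_t size)
{
    size_t i = 2; // skip the SOI marker (0xFFD8)
    while (i + 3 < size && data[i] == 0xFF) {
        unsigned char marker = data[i + 1];
        if (marker >= 0xC0 && marker <= 0xCF &&
            marker != 0xC4 && marker != 0xC8 && marker != 0xCC) {
            // SOF2 = progressive DCT (Huffman), SOF10 = progressive DCT (arithmetic)
            return marker == 0xC2 || marker == 0xCA;
        }
        size_t seg_len = (size_t(data[i + 2]) << 8) | data[i + 3];
        i += 2 + seg_len; // advance past the marker (2 bytes) plus segment payload
    }
    return false;
}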

We do not have a decoder called jpeg. JPEG is decoded using the [motion] mjpeg decoder. It decodes fine without -hwaccel cuda.

@val.zapod.vz Well good luck convincing NVIDIA to support progressive scan decoding in NVDEC then.
