Are nppi_compression functions synchronous?

I tried nppiEncodeHuffmanScan_JPEG_8u16s_P3R and it seems too slow to be async. Is it synchronous? Is there a way to do Huffman encoding asynchronously?


Looking at the code I can’t see any calls that would make those functions synchronous.

Thank you for your reply!

I haven’t tested it much; I just measured the execution time in the jpegNPP example. I assumed that this function is synchronous because it took more than 2 ms (async functions like cudaMemcpyAsync usually take much less on my machine).

Also, there is the nLength return parameter (the byte length of the Huffman-encoded JPEG scan). I assumed it cannot be known before the encoding is complete. Is that false?

(Btw, by “async” I mean that it returns before completion, like cudaMemcpyAsync or kernel invocations)
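One way to make the check described above more precise is to record two times for the call: how long it takes to return, and how long until the work it started has actually finished. The following is only a minimal sketch of that idea in plain C++. The names `CallTiming`, `time_call`, `demo_async` and `demo_sync` are made up for illustration, and the "device work" is simulated with a 50 ms delay rather than a real NPP call.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Timings for one call, in milliseconds (hypothetical helper type).
struct CallTiming {
    double return_ms;    // time until the call handed control back to the host
    double complete_ms;  // time until the work it started was finished
};

// Times `launch()` (the call under test), then `wait()` (whatever blocks
// until the launched work is done). Both timings start at the same instant.
template <typename Launch, typename Wait>
CallTiming time_call(Launch launch, Wait wait) {
    using Ms = std::chrono::duration<double, std::milli>;
    const auto t0 = Clock::now();
    launch();
    const auto t1 = Clock::now();
    wait();
    const auto t2 = Clock::now();
    return {Ms(t1 - t0).count(), Ms(t2 - t0).count()};
}

// Stand-in for an asynchronous call: it only notes when the pretend device
// work will be finished (50 ms from now) and returns at once.
CallTiming demo_async() {
    Clock::time_point done_at;
    return time_call(
        [&] { done_at = Clock::now() + std::chrono::milliseconds(50); },
        [&] { std::this_thread::sleep_until(done_at); });
}

// Stand-in for a synchronous call: it does not return until the pretend
// work is finished, so there is nothing left to wait for afterwards.
CallTiming demo_sync() {
    return time_call(
        [] { std::this_thread::sleep_for(std::chrono::milliseconds(50)); },
        [] {});
}
```

For the real case, `launch` would wrap the NPP call and `wait` would be cudaStreamSynchronize on the stream in use (or cudaDeviceSynchronize). It also helps to call cudaDeviceSynchronize before starting the timer, so that work queued earlier is not counted against the call being measured. If `return_ms` is close to `complete_ms`, the call behaved synchronously in that run.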

Thank you!

It looks like nppiEncodeHuffmanScan_JPEG_8u16s_P3R is synchronous.

I put it into code that renders frames using DX9, then scales, color-converts, compresses, and copies them to the host, and this function blocks until all previous GPU work is done. I normally use two CUDA streams (to overlap work on one frame with the copy of the previous frame to the host), so I tried nppSetStream, but it still blocks.

I also tried allocating the nScanLength parameter on the GPU, but nppiEncodeHuffmanScan doesn’t accept it and returns an error.

Am I doing something wrong, or is this the correct behavior of this function?