Use hardware to compress/decompress memory blocks in a CUDA kernel


Does anyone know whether it is possible to use the GPU hardware to losslessly compress and decompress memory blocks from within a CUDA kernel? My idea is to use the video or JPEG functions/hardware for the compression and decompression. I know the results may not be great, but even a reduction of a few percent would be a great result. Or is there already something available for compressing and decompressing general data within a kernel?

If not, is it somehow possible to call the functions of the nvJPEG library from within a CUDA kernel?