Turing Memory Compression question

The Turing architecture write-up at https://devblogs.nvidia.com/nvidia-turing-architecture-in-depth/ mentions something called “Turing Memory Compression”.

Is it possible to use it from my CUDA code to optimize the memory footprint of my


Bump :)

My interpretation of that description is that the compression is applied while data moves between different parts of the memory subsystem, not to data just “sitting” in memory, so to speak. That’s why they highlight the bandwidth achieved by the new compression method rather than something like “now you can fit 4GB worth of data while using just 3GB of memory”, which I believe is the direction of your question?
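If you want to check the footprint side of this on your own card, a rough sketch like the one below might help. It reads free device memory with cudaMemGetInfo before and after allocating a buffer and filling it with zeros (about as compressible as data gets). The 256 MiB size, the CUDA_CHECK macro and the consumed_bytes helper are just my own illustrative choices, not anything from the NVIDIA article, and the free-memory reading can be disturbed by other processes using the GPU, so treat the numbers as approximate.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative macro (not part of CUDA): abort with a message if a runtime call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Hypothetical helper: bytes of free device memory consumed between two readings.
static size_t consumed_bytes(size_t free_before, size_t free_after) {
    return free_before > free_after ? free_before - free_after : 0;
}

int main() {
    const size_t request = size_t(256) << 20;  // 256 MiB, an arbitrary test size

    size_t free_before = 0, free_after = 0, total = 0;
    // The first runtime call also creates the context, so its overhead
    // is already accounted for in free_before.
    CUDA_CHECK(cudaMemGetInfo(&free_before, &total));

    void* buf = nullptr;
    CUDA_CHECK(cudaMalloc(&buf, request));
    // Fill the buffer with zeros, i.e. highly compressible content.
    CUDA_CHECK(cudaMemset(buf, 0, request));
    CUDA_CHECK(cudaDeviceSynchronize());

    CUDA_CHECK(cudaMemGetInfo(&free_after, &total));

    std::printf("requested: %zu bytes\n", request);
    std::printf("consumed:  %zu bytes\n", consumed_bytes(free_before, free_after));

    CUDA_CHECK(cudaFree(buf));
    return 0;
}
```

If my reading is right, the consumed figure should come out no smaller than the requested size even though the contents are trivially compressible, which would match the idea that the compression helps bandwidth rather than capacity.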

Yup, I wondered whether there’s some sort of compression hardware that I could use from my CUDA app to compress my data.