How is it possible that DLSS is so fast?

DLSS uses tensor cores and can do 4x upscaling from 1080p to 4K in just 1.5 ms, so how is that possible? And it runs this fast even on a lower-tier GPU like the RTX 3060.
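To put the quoted numbers in perspective, here is a rough back-of-envelope calculation of the implied pixel throughput. It only uses the figures stated above (4K output, 1.5 ms per frame); it is an illustration of scale, not an NVIDIA-published benchmark.

```python
# Back-of-envelope: output-pixel throughput implied by the quoted DLSS cost.
out_pixels = 3840 * 2160      # 4K output resolution
frame_time_s = 1.5e-3         # quoted DLSS cost per frame

pixels_per_second = out_pixels / frame_time_s
print(f"{pixels_per_second:.2e} output pixels per second")
```

That works out to roughly 5.5 billion output pixels per second, which is the kind of rate that only makes sense with dedicated matrix hardware running a small, heavily optimized network.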

Please check the links below, as they might answer your concerns.


Is DLA supported on RTX 30XX GPUs and the Quadro A6000?
When I try to convert a simple test model, I get an error on a Quadro A6000 from Docker:

trtexec --onnx=/weights/onnx/model.onnx --saveEngine=/weights/onnx/model-rt.trt --explicitBatch --fp16 --optShapes=input:0:8x256x256x3 --workspace=35000 --threads --dumpProfile --noBuilderCache --useDLACore=0 --allowGPUFallback --verbose
[07/27/2021-23:02:51] [E] Cannot create DLA engine, 0 not available
[07/27/2021-23:02:51] [E] Engine creation failed
[07/27/2021-23:02:51] [E] Engine set up failed

Hi @kirpasaccessory,

Please refer to Support Matrix :: NVIDIA Deep Learning TensorRT Documentation to check DLA support.

Thank you.

So only the Jetson AGX Xavier has a DLA block, not desktop RTX GPUs. And my question is still not answered: how is it possible that DLSS is so fast?


Please refer to NVIDIA DLSS Technology for Incredible Performance for more details. If you still need further assistance, we request you to raise your concern on a GeForce-related forum.

Thank you.

But I am looking at it from an AI perspective too, and I am curious how it works under the hood. Basically, speed is king for DLSS, but how do you optimize for it?
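One way to reason about the optimization question is to check how much tensor-core compute actually fits into a 1.5 ms frame budget. The 50 TFLOPS FP16 figure below is an assumed ballpark for an RTX 3060 class GPU, not a measured number; the point is only the order of magnitude per pixel.

```python
# Hedged estimate: FP16 tensor-core compute available per output pixel
# within the quoted 1.5 ms DLSS budget. 50 TFLOPS is an assumption.
tensor_flops_per_s = 50e12    # assumed FP16 tensor-core throughput
budget_s = 1.5e-3             # quoted DLSS frame cost
out_pixels = 3840 * 2160      # 4K output

flops_per_pixel = (tensor_flops_per_s * budget_s) / out_pixels
print(f"~{flops_per_pixel:.0f} FLOPs available per output pixel")
```

Under these assumptions there are only on the order of ten thousand FLOPs per output pixel, so the network has to be small and run almost entirely in low precision on the tensor cores; that compute ceiling is why such aggressive optimization is needed.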


DLSS is not related to DLA at all. We do not have much information about DLSS.
As recommended previously, please post your concern on a GeForce-related forum to get better help.

Thank you.