cuda sample using __ldg()..?

Hello,

Does one of the cuda samples use __ldg() somewhere in the code…?

I did a quick find on Windows and it doesn’t seem like any of the CUDA 5.5 or 6.0 samples use it.

Here are some past threads that talk about that intrinsic, in case they happen to answer your question:
https://devtalk.nvidia.com/default/topic/527670/why-l1-cache-hit-ratio-become-zero-on-k20-/
https://devtalk.nvidia.com/default/topic/673271/const-restrict-read-faster-than-constant-/
https://devtalk.nvidia.com/default/topic/638031/ldg-versus-textures/

Here’s a slideshow with a small example:
http://on-demand.gputechconf.com/gtc/2013/presentations/S3011-CUDA-Optimization-With-Nsight-VSE.pdf
This next deck covers pretty much the same material, possibly on different hardware; it looked identical at first glance:
http://calcul.math.cnrs.fr/IMG/pdf/CUDA-Optimization-Julien-Demouth.pdf

A library that wraps the __ldg() intrinsic in templates so it also works for non-native types:
https://github.com/BryanCatanzaro/generics
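
Since none of the samples seem to use it, here's a minimal sketch of how __ldg() is typically called, based on the usage shown in the slides above. This assumes a device of compute capability 3.5 or higher (e.g. K20), where __ldg() routes the load through the read-only data cache; the kernel name and parameters are just made up for illustration:

```cuda
// Hypothetical kernel: scales an input array by a constant factor.
// __ldg() takes a pointer and performs a cached read-only load,
// similar to marking the pointer const __restrict__.
__global__ void scale(const float *in, float *out, float s, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = s * __ldg(&in[i]);  // read-only cached load of in[i]
}
```

Note that the compiler can often generate the same LDG instruction on its own if you declare the input as const float* __restrict__, which is what a couple of the threads above compare.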