How to use texture memory and the texture cache in CUDA

Global memory is too slow, so I want to use any cache I can.

There are plenty of CUDA code samples that demonstrate the use of texture memory and the texture cache.
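For a rough idea of what such code looks like, here is a minimal sketch. It assumes the texture object API (compute capability 3.0 or later) and binds a plain linear device buffer to a texture, then reads it in a kernel with tex1Dfetch. The names copyViaTexture and textureCopyMatches are made up for this illustration and are not taken from any particular sample.

```cuda
#include <cuda_runtime.h>
#include <cstring>
#include <vector>

// Each thread reads one element through the texture path and writes it out.
__global__ void copyViaTexture(cudaTextureObject_t tex, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = tex1Dfetch<float>(tex, i);  // read-only fetch via the texture cache
    }
}

// Hypothetical helper: returns true if the data read back through the
// texture matches the input, false on a mismatch or a CUDA error.
bool textureCopyMatches(int n)
{
    std::vector<float> in(n), out(n, -1.0f);
    for (int i = 0; i < n; ++i) in[i] = 0.5f * i;

    size_t bytes = n * sizeof(float);
    float *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return false; }
    cudaMemcpy(dIn, in.data(), bytes, cudaMemcpyHostToDevice);

    // Describe the linear buffer that backs the texture.
    cudaResourceDesc resDesc;
    std::memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = dIn;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = bytes;

    // Return raw element values (no normalization, no filtering).
    cudaTextureDesc texDesc;
    std::memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    bool ok = cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr) == cudaSuccess;
    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        copyViaTexture<<<blocks, threads>>>(tex, dOut, n);
        // cudaMemcpy waits for the kernel and reports any launch/runtime error.
        ok = cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
        cudaDestroyTextureObject(tex);
    }
    cudaFree(dIn);
    cudaFree(dOut);

    for (int i = 0; ok && i < n; ++i) ok = (out[i] == in[i]);
    return ok;
}
```

Calling textureCopyMatches(1 << 20) from main and compiling with nvcc should be enough to try it. Whether the texture path is actually faster than plain global loads depends on the access pattern and the GPU generation, so it is worth profiling both.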