The main caches are L2 and, depending on which GPU you are running on, possibly L1.
The L2 cache cannot be disabled in any way.
The L1 cache, if it would normally be enabled, can be disabled at compile time by adding a particular PTXAS compile switch to the compile command line.
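As a rough sketch of what that looks like in practice, assuming the switch meant here is the ptxas default load cache modifier `-dlcm=cg` (cache global loads in L2 only), passed through nvcc with `-Xptxas`. The file name `copy.cu` and the kernel are hypothetical, and whether the switch changes anything depends on whether your GPU caches global loads in L1 by default:

```cuda
// Default build:                        nvcc -o copy copy.cu
// Build with global loads bypassing L1 (assumed switch, see above):
//                                       nvcc -Xptxas -dlcm=cg -o copy copy.cu
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: one ordinary global load and one store per thread.
__global__ void copy(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];  // plain global load; uses the default cache modifier ptxas was given
}

int main()
{
    const int n = 1 << 20;
    float *in = nullptr, *out = nullptr;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i)
        in[i] = 1.0f;

    copy<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s, out[0] = %f\n", cudaGetErrorString(err), out[0]);

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

The source is identical in both builds; only the command line differs, which is why this counts as a compile-time setting rather than a code change.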
Other caches on the device, such as the constant cache and the read-only cache, also cannot be globally disabled. Code that is written to use these features will use them, and the only way to stop that is to rewrite the code.
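To make that last point concrete, here is a minimal sketch of the same computation written twice: once so that it explicitly goes through the constant cache and the read-only cache, and once using only ordinary global loads. The names `scale`, `scaled_cached` and `scaled_plain` are made up for illustration, and `__ldg()` assumes a device of compute capability 3.5 or higher:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical coefficient placed in constant memory (read via the constant cache).
__constant__ float scale[1];

// Written to use the constant cache (scale) and the read-only cache (__ldg).
__global__ void scaled_cached(const float * __restrict__ in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = scale[0] * __ldg(&in[i]);
}

// Rewritten to avoid those features: both operands come from ordinary global loads.
__global__ void scaled_plain(const float *in, const float *s, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = s[0] * in[i];
}

int main()
{
    const int n = 1 << 20;
    const float h_scale = 2.0f;
    float *in = nullptr, *out = nullptr, *s = nullptr;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    cudaMallocManaged(&s, sizeof(float));
    for (int i = 0; i < n; ++i)
        in[i] = 1.0f;
    s[0] = h_scale;
    cudaMemcpyToSymbol(scale, &h_scale, sizeof(float));

    scaled_cached<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("cached version: out[0] = %f\n", out[0]);

    scaled_plain<<<(n + 255) / 256, 256>>>(in, s, out, n);
    cudaError_t err = cudaDeviceSynchronize();
    printf("plain version:  out[0] = %f (%s)\n", out[0], cudaGetErrorString(err));

    cudaFree(in);
    cudaFree(out);
    cudaFree(s);
    return 0;
}
```

There is no switch that turns the first kernel into the second; moving between them means editing the source. Note also that the compiler is free to pick the read-only path on its own when it can prove the data is not modified, so the plain version is a sketch of the intent rather than a guarantee about the generated instructions.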