How to force the GPU to drop its cache?

I am new to CUDA programming. My application has two parts, A and B, and B has data dependencies on A. I want to measure how well B performs without any caching effect, i.e. B should not benefit from data that A previously brought into the cache. Is there a way to achieve this? I know that one possible approach is to write a kernel that pollutes the cache, but this is not very elegant. Do CUDA or the NVIDIA tools provide any mechanism that allows me to do this?
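
For reference, this is roughly the cache-polluting approach I have in mind. It is only a sketch under my own assumptions: the names `pollute_cache` and `flush_l2_sketch`, the choice of a scratch buffer twice the size of L2, and the launch configuration are all hypothetical, and I have not verified that this evicts every line that A left behind.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper kernel: read-modify-write every element of a scratch
// buffer with a grid-stride loop, so the buffer's lines pass through L2.
__global__ void pollute_cache(unsigned int *scratch, size_t n)
{
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        scratch[i] += 1u;
}

// Hypothetical host wrapper: size the scratch buffer from the reported L2
// size and run the polluting kernel once. Returns the first CUDA error seen.
cudaError_t flush_l2_sketch()
{
    int device = 0, l2_bytes = 0;
    cudaError_t err = cudaGetDevice(&device);
    if (err != cudaSuccess) return err;
    err = cudaDeviceGetAttribute(&l2_bytes, cudaDevAttrL2CacheSize, device);
    if (err != cudaSuccess) return err;

    // Assumption: touching 2x the L2 size is enough to displace older lines.
    size_t bytes = 2 * (size_t)l2_bytes;
    if (bytes == 0) bytes = (size_t)64 << 20;  // arbitrary fallback: 64 MiB
    size_t n = bytes / sizeof(unsigned int);

    unsigned int *scratch = nullptr;
    err = cudaMalloc(&scratch, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemset(scratch, 0, bytes);
    if (err == cudaSuccess) {
        pollute_cache<<<256, 256>>>(scratch, n);
        err = cudaGetLastError();  // catches launch-time errors
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    cudaFree(scratch);
    return err;
}

int main()
{
    // kernel_A<<<...>>>(...);   // part A would run here (not shown)
    cudaError_t err = flush_l2_sketch();
    // kernel_B<<<...>>>(...);   // part B would be timed here (not shown)
    printf("flush: %s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

This allocates and frees the scratch buffer on every call, which keeps the sketch short but adds overhead outside the region I would time. It is the extra kernel and buffer that I would like to avoid if a built-in mechanism exists.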