Flushing Instruction Cache on GPU

tugrul · June 2, 2010, 5:48pm

Hello everyone,

Does anyone know how to flush the instruction cache, either using the debugger interface provided by libcuda or through some operation inside the kernel function? I realize it may be impossible, but I want to know if there is a way.

I know I am not supposed to mess with the code on the device, but I want to experiment with modifying CUDA code after kernel launch. I did find a way to write to code space, but the instruction cache prevents me from actually executing the modified instructions.

I appreciate any help you can provide.

tmurray · June 2, 2010, 5:53pm

Self-modifying code is not supported and I really don’t think there’s any way for a user to flush the icache…

Keldor314 · June 3, 2010, 7:44am

Well, you could always try to overflow the cache by running a long routine consisting of a million NOPs or so. That would presumably evict whatever code was in the cache, forcing it to be reloaded from your modified instruction space once you actually try to execute that part of the code.
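To make the eviction argument concrete, here is a toy host-side model in C++. It is only a sketch: it assumes a fully associative LRU cache, and the names ToyICache and run_patched, the opcode values, and the cache sizes are all made up for illustration. Nothing in this thread establishes how the real GPU instruction cache is organized or what its replacement policy is.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy model only: a fully associative LRU "instruction cache" in front of a
// vector that stands in for code memory. Real GPU hardware may differ.
class ToyICache {
public:
    ToyICache(std::size_t capacity_lines, const std::vector<int>& code_memory)
        : capacity_(capacity_lines), memory_(code_memory) {}

    // Returns the instruction word that would actually execute at addr.
    int fetch(std::size_t addr) {
        if (capacity_ == 0) return memory_[addr];  // no cache at all
        auto it = index_.find(addr);
        if (it != index_.end()) {
            // Hit: mark as most recently used and return the cached copy,
            // which may be stale if code memory was overwritten.
            lru_.splice(lru_.begin(), lru_, it->second);
            return it->second->second;
        }
        // Miss: evict the least recently used line if the cache is full,
        // then load the current contents of code memory.
        if (lru_.size() == capacity_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        lru_.emplace_front(addr, memory_[addr]);
        index_[addr] = lru_.begin();
        return memory_[addr];
    }

private:
    using Line = std::pair<std::size_t, int>;  // (address, cached word)
    std::size_t capacity_;
    const std::vector<int>& memory_;
    std::list<Line> lru_;  // front = most recently used
    std::unordered_map<std::size_t, std::list<Line>::iterator> index_;
};

// Caches one instruction, overwrites it in code memory, executes
// padding_lines filler lines, then reports what executes at address 0.
// Returns 1 for the original instruction, 2 for the patched one.
int run_patched(std::size_t cache_lines, std::size_t padding_lines) {
    const int kOld = 1, kNew = 2, kNop = 0;  // hypothetical opcodes
    std::vector<int> code(1 + padding_lines, kNop);
    code[0] = kOld;
    ToyICache icache(cache_lines, code);
    icache.fetch(0);   // warm the cache with the original instruction
    code[0] = kNew;    // overwrite code memory behind the cache's back
    for (std::size_t a = 1; a <= padding_lines; ++a) icache.fetch(a);
    return icache.fetch(0);
}
```

In this model, run_patched(8, 4) returns 1 because the stale line is still cached, while run_patched(8, 8) returns 2 because the filler has pushed the old line out. The filler has to cover at least the whole cache before the patched instruction is picked up.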

There’s also a small chance that kernel invocations flush the cache. The actual internals of kernel calls would largely determine if this would work or not, since an obvious optimization would be to not flush the cache if it can be avoided.

tugrul · June 3, 2010, 2:56pm

Thank you guys.

Executing a bunch of NOPs would work, but the same problem remains: how will I insert NOPs without flushing the instruction cache first? The device will still execute whatever is in the cache, and I don’t want that. NOPs would only work if I inserted them before launching the kernel. Am I missing something?

Sarnath · June 3, 2010, 3:34pm

tugrul,

How did you even find the global memory location where instructions are stored??? That looks like a good effort! (scanning for instruction patterns in memory???)

tugrul · June 4, 2010, 9:21pm

Well, I had to read portions of libcuda assembly.

I downloaded the cuda-gdb code from ftp://download.nvidia.com/CUDAOpen64/. It comes with a header file called cudadebugger.h. There is a function called readCodeMemory declared in this header file, and the actual implementation resides in libcuda.so. This readCodeMemory function eventually calls memcpy to copy the code from device memory to host. I changed the parameters to this memcpy operation, and voila! I copied from host to device, overwriting code on device. Since I could not flush the instruction cache it didn’t really matter much. I tried modifying some code that did not fit in the cache, and the output changed as I expected. So, now I am trying to flush the instruction cache.

wwa · June 4, 2010, 10:11pm

See Reverse engineering of the CUDA communication with the driver
I scanned it, and there’s nothing even remotely close to any on-device cache management, which means either hardware or driver manages it. My guess would be that it is completely hardware-managed.

Anyway, you need to look into the driver itself it seems.

Edit: also see this if you haven’t already. It seems to suggest that the icache is unified with the constant cache at some level, but it lacks any details. Maybe it will help you.
