How can I write machine code directly and run it on the GPU?

Hi all,

I'm working on Windows. An executable compiled with nvcc can be dumped to SASS and PTX. However, the SASS output is only a text listing, and I can't edit it by hand and feed it back into the executable.

How can I directly write code that runs on the GPU?

or

How can I implement step (*) in the example below?

  • I write a CUDA program and compile it into prog.exe.
  • I use cuobjdump to get prog.SASS.
  • The kernel in prog.SASS contains these two instructions, so I know both of them will be executed:

    IADD R1, R2, R3
    IADD R3, R4, R5
  • By some means, I modify only prog.exe so that IADD R3, R4, R5 is no longer executed (perhaps not by editing the SASS text; the point is that I change what eventually runs). (*)
  • Then I run prog.exe and get the result computed without IADD R3, R4, R5.

Thank you!

I believe NVIDIA doesn’t publish a full SASS instruction set specification or the instruction encodings, and both change from architecture to architecture. I suppose you want to replace one instruction with a no-op? I don’t know the encoding for a no-op, but I think there are several open source SASS assemblers out there. Here’s one for Fermi: https://code.google.com/p/asfermi/
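
If you do find the encodings, the no-op replacement itself is just a same-length byte overwrite in the binary. Here is a minimal Python sketch of that idea. The two byte patterns are made-up placeholders, not real SASS encodings, and the function names are my own. You would have to substitute the real bytes for your architecture. I am also assuming the machine code is stored uncompressed in the file you patch, which may be easier to guarantee for a standalone cubin (nvcc -cubin) than for prog.exe.

```python
# Sketch: overwrite one instruction encoding with a same-length no-op encoding.
# ASSUMPTION: both byte patterns below are hypothetical placeholders. Real SASS
# encodings differ per GPU architecture and are not given in this thread.
TARGET_ENCODING = bytes.fromhex("1122334455667788")  # stands in for "IADD R3, R4, R5"
NOP_ENCODING = bytes.fromhex("a1a2a3a4a5a6a7a8")     # stands in for a no-op


def patch_instruction(image: bytes, target: bytes, replacement: bytes) -> bytes:
    """Return a copy of image with the single occurrence of target replaced."""
    if not target:
        raise ValueError("target must not be empty")
    if len(target) != len(replacement):
        # Keeping the length identical leaves every other offset in the file valid.
        raise ValueError("replacement must be the same length as the target")
    matches = image.count(target)
    if matches != 1:
        # Refuse to guess: zero matches means a wrong pattern, and several
        # matches means we cannot tell which instruction to remove.
        raise ValueError(f"expected exactly one match, found {matches}")
    offset = image.find(target)
    return image[:offset] + replacement + image[offset + len(target):]


def patch_file(src_path: str, dst_path: str, target: bytes, replacement: bytes) -> None:
    """Read src_path, patch it in memory, and write the result to dst_path."""
    with open(src_path, "rb") as src:
        image = src.read()
    patched = patch_instruction(image, target, replacement)
    with open(dst_path, "wb") as dst:
        dst.write(patched)
```

Hypothetical usage would be patch_file("prog.cubin", "prog_patched.cubin", TARGET_ENCODING, NOP_ENCODING), followed by disassembling the patched file again with cuobjdump to check that only the intended instruction changed.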

(By the way, runtime patching of kernels is HARD, but possible.)