In the CUDA runtime 4.0 RC2, Nvidia seems to have changed the structure passed to __cudaRegisterFatBinary(void *fatCubin). This isn't unexpected, but it's very different from 3.x, so I'm having difficulty finding the information I need to extract the ELF, PTX, etc. for emulation and disassembly. The parameter no longer seems to be a pointer to a __cudaFatCudaBinary structure. Casting it to a (__cudaFatCudaBinary *) yields a structure that is mostly empty: the magic number is 0x466243b1, the version is 0x00000001, and gpuInfoVersion now seems to be a pointer to the beginning of another structure that does seem to contain the goodies. Does anyone know what the new structure is?
Unfortunately, it seems the structures now differ between Windows (which I work on) and Ubuntu (where Ocelot lives). So the code in Ocelot for interpreting the new format (http://gpuocelot.googlecode.com/svn/trunk/ocelot/ocelot/cuda/implementation/FatBinaryContext.cpp ) does not work on Windows. Sigh.
The fat binary format on Linux seems to be a list of binary objects, each with this header:
typedef struct __cudaFatCudaBinary2EntryRec {
    unsigned int type;
    unsigned int binary;
    unsigned int binarySize;
    unsigned int unknown2;
    unsigned int kindOffset;
    unsigned int unknown3;
    unsigned int unknown4;
    unsigned int unknown5;
    unsigned int name;
    unsigned int nameSize;
    unsigned long long int unknown6;
    unsigned long long int unknown7;
} __cudaFatCudaBinary2Entry;
‘binary’ is an offset from the base of the header to the actual binary. The binaries I have seen are cubins stored in ELF format (you can copy them out and dump them with objdump) and PTX assembly files. On Windows the format may be different, but the way I figured this out was: first, find a binary in a format that I understood, ELF (search for the magic word); then find the header by comparing multiple applications to see where the binary starts; and finally fill in the header fields in the context of the binary that I understood. Strings are also easy to pick out, especially when they correspond to the name of the program you are running.
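For anyone who wants to repeat that search on another platform, here is a rough sketch of the first two steps in C. It is only an illustration under assumptions, not what Ocelot does: FatEntryHeader, find_elf_magic and header_points_at are names I made up, the struct simply restates the header above with fixed-width types, and the "candidate header" test just checks that ‘binary’ and ‘binarySize’ are consistent with an ELF magic found by scanning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Restatement of the entry header posted above, with fixed-width types.
   The name FatEntryHeader is hypothetical; the unknown fields stay unknown. */
typedef struct {
    uint32_t type;
    uint32_t binary;      /* offset from the start of this header to the payload */
    uint32_t binarySize;  /* payload size in bytes */
    uint32_t unknown2;
    uint32_t kindOffset;
    uint32_t unknown3;
    uint32_t unknown4;
    uint32_t unknown5;
    uint32_t name;
    uint32_t nameSize;
    uint64_t unknown6;
    uint64_t unknown7;
} FatEntryHeader;

/* Return the offset of the first ELF magic (0x7f 'E' 'L' 'F') at or after
   `from`, or -1 if the buffer contains none. */
long find_elf_magic(const unsigned char *buf, size_t len, size_t from)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    assert(buf != NULL);
    for (size_t i = from; i + sizeof magic <= len; ++i) {
        if (memcmp(buf + i, magic, sizeof magic) == 0)
            return (long)i;
    }
    return -1;
}

/* Return 1 if a header placed at hdr_off would describe a payload that
   starts exactly at target_off and fits inside the buffer, else 0. */
int header_points_at(const unsigned char *buf, size_t len,
                     size_t hdr_off, size_t target_off)
{
    FatEntryHeader h;
    assert(buf != NULL);
    if (hdr_off > len || len - hdr_off < sizeof h)
        return 0;                        /* header would run off the end */
    if (target_off < hdr_off || target_off > len)
        return 0;                        /* payload must follow the header */
    memcpy(&h, buf + hdr_off, sizeof h); /* memcpy avoids unaligned reads */
    if (h.binary != target_off - hdr_off)
        return 0;                        /* 'binary' does not land on target */
    return h.binarySize <= len - target_off;
}
```

The idea would be to dump the memory behind the fatCubin pointer into a buffer, call find_elf_magic to locate each ELF image, and then try header_points_at on offsets shortly before it to see which one is consistent. Whether this works unchanged on Windows is exactly the open question, so treat a match as a hint rather than proof.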
I really wish there would be some documentation for this…