Hello -

I have been playing a bit with the cuasm compiler written by wumpus. Writing a whole kernel
in PTX (assembler) format is maybe feasible, but really challenging.

It would be interesting if only the really critical part of the kernel could be written in PTX. Something
like this:

// ----- host -----

CUmodule cuModule;
cuModuleLoad(&cuModule, "myFunctionFoo.cubin");
CUfunction cuFunctionFoo;
cuModuleGetFunction(&cuFunctionFoo, cuModule, "foo");

mykernel<<<gs, bs>>>();

// ---- device ------

__global__ void mykernel() {

// call the function foo

}

Is there a way to do that? I read something about the “fatbin” file format.
As far as I understand, it merges different .cubins into one file, which can then be uploaded to the device using
the code above.
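
For reference, here is a fuller sketch of the host-side loading from the snippet above, written against the CUDA driver API. It assumes a file myFunctionFoo.cubin exists and exports an entry point named foo that takes no arguments; the check() helper and the launch dimensions are made up for illustration. Note that this only launches foo as its own kernel from the host. It does not make foo callable from inside mykernel, which is the part I am asking about.

```cuda
// Host-side sketch using the CUDA driver API (link with -lcuda).
// Assumption: "myFunctionFoo.cubin" exports an entry point "foo" with no parameters.
#include <cuda.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: abort with a readable message if a driver call fails.
static void check(CUresult res, const char *what) {
    if (res != CUDA_SUCCESS) {
        const char *msg = nullptr;
        cuGetErrorString(res, &msg);
        fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
        exit(EXIT_FAILURE);
    }
}

int main() {
    CUdevice dev;
    CUcontext ctx;
    CUmodule cuModule;
    CUfunction cuFunctionFoo;

    // Initialize the driver API and make the primary context of device 0 current.
    check(cuInit(0), "cuInit");
    check(cuDeviceGet(&dev, 0), "cuDeviceGet");
    check(cuDevicePrimaryCtxRetain(&ctx, dev), "cuDevicePrimaryCtxRetain");
    check(cuCtxSetCurrent(ctx), "cuCtxSetCurrent");

    // Load the separately assembled cubin and look up the entry point by name.
    check(cuModuleLoad(&cuModule, "myFunctionFoo.cubin"), "cuModuleLoad");
    check(cuModuleGetFunction(&cuFunctionFoo, cuModule, "foo"), "cuModuleGetFunction");

    // Launch foo with 1 block of 32 threads (arbitrary sizes), no shared memory,
    // on the default stream, with no kernel parameters.
    check(cuLaunchKernel(cuFunctionFoo,
                         1, 1, 1,    // grid dimensions
                         32, 1, 1,   // block dimensions
                         0,          // dynamic shared memory in bytes
                         nullptr,    // default stream
                         nullptr,    // kernel parameters
                         nullptr),   // extra launch options
          "cuLaunchKernel");
    check(cuCtxSynchronize(), "cuCtxSynchronize");

    // Clean up.
    check(cuModuleUnload(cuModule), "cuModuleUnload");
    check(cuDevicePrimaryCtxRelease(dev), "cuDevicePrimaryCtxRelease");
    return 0;
}
```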

Thanks guys,