On-the-fly recompilation: how to alter a kernel after launch

I’ve created the BarsWF application, but further improvements & features require on-the-fly kernel modifications. I mean I need to read some parameters and enable/disable some parts of the kernel code. Precompiling several kernels is not an option, as there are hundreds of combinations.

Theoretically CUDA might be clever enough to perform simple optimizations based on constant values. I.e., when you set constants and then run the kernel, it could optimize based on those constants BEFORE compiling the kernel code into the device-dependent kernel that actually executes on the device.

Also, it looks like I can generate .ptx code on the fly and use the driver API to load the resulting cubin. Can I bundle the PTX compiler with my program?
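For what it’s worth, the driver API side of this is straightforward: `cuModuleLoadData()` accepts an in-memory PTX or cubin image, so once you’ve generated the code you can load and launch it without shipping files around. A minimal host-side sketch (error handling elided; `ptx_image` stands for a code string you generated yourself, and `"crack_kernel"` is a made-up entry-point name, not BarsWF’s real one):

```cuda
#include <cuda.h>   /* CUDA driver API */

/* Hypothetical: load a generated PTX/cubin image and launch one of its
 * entry points. All names here are placeholders. */
void run_generated_kernel(const char *ptx_image)
{
    CUdevice   dev;
    CUcontext  ctx;
    CUmodule   mod;
    CUfunction fn;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* Load the module straight from memory -- no temp file needed. */
    cuModuleLoadData(&mod, ptx_image);
    cuModuleGetFunction(&fn, mod, "crack_kernel");

    /* Old-style (CUDA 2.x era) launch interface. */
    cuFuncSetBlockShape(fn, 256, 1, 1);
    cuLaunchGrid(fn, 128, 1);

    cuCtxSynchronize();
    cuModuleUnload(mod);
    cuCtxDestroy(ctx);
}
```

This needs a CUDA-capable device and the driver, so treat it as a shape-of-the-code sketch rather than something to paste in verbatim.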


That’d be the way to go.

Bundling the compiler would be a licensing issue. NVIDIA isn’t too clear on that. The EULA doesn’t say anything directly, I think. NVIDIA people have said you’re allowed to redistribute cudart.dll (which is necessary to run anything). I don’t think they’d really have a problem with you redistributing the compiler, but if you want to do it 100% legally you’d need written permission from NVIDIA, so contact them directly.

Re: optimizing based on constants. This is something I’ve always wanted to see languages do. But it’d take extra facilities to do that from bytecode. For now the way to do it (i.e., unroll loops, mask out if statements) is by recompiling the C.

You could also contact the developer of ‘cudasm’ (a third-party PTX assembler) to see if he would license his code to you for use in your program.

You basically summed up why I am extremely interested in LLVM.

I’d briefly heard about LLVM. I didn’t realize it let you do this. Very cool.

You know, Apple has been doing some interesting fundamental work lately. It also happens to be behind OpenCL.

I am working on something similar to this as well…hopefully I’ll have something (at least a proof-of-concept) working within a couple of weeks.

LLVM itself is very interesting, but does it support CUDA? Or is that planned?

Doesn’t support CUDA (except for that silly project I did in school which should be forgotten about by everyone), but there’s nothing stopping anyone from working on it.

Since CUDA does not allow writing self-modifying code, I don’t believe things like LLVM can simply be ported to CUDA.

I am thinking about creating a medium-level (non-C++, no objects, but with optimization) language which would compile into PTX code, for further on-the-fly compilation into a cubin. Do you think it might be useful for everyone?

You talk big.

LLVM doesn’t need self-modifying code. You’re not suggesting that a CUDA kernel acts as an LLVM virtual machine and JITs itself? No, you’d do it from the host side, and that would be possible.