On-the-fly recompilation: how to alter a kernel after launch

I’ve created the BarsWF application, but further improvements & features require on-the-fly kernel modifications. I mean I need to read some parameters and enable/disable some parts of the kernel code. Precompiling several kernels is not an option, as there are hundreds of combinations.

Theoretically CUDA might be clever enough to perform simple optimizations based on constant values. I.e., when you set constants and then run the kernel, it could optimize based on those constants BEFORE compiling the kernel code into the device-dependent kernel that actually executes on the device.

Also, it looks like I can generate .ptx code on the fly and use the driver API to load the resulting cubin. Can I bundle the PTX compiler with my program?
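For what it’s worth, the driver API side of this is straightforward: `cuModuleLoadData()` accepts an in-memory PTX or cubin image, so once you’ve generated the code you can load and launch it without shipping files around. A minimal host-side sketch (error handling elided; `ptx_image` stands for a code string you generated yourself, and `"crack_kernel"` is a made-up entry-point name, not BarsWF’s real one):

```cuda
#include <cuda.h>   /* CUDA driver API */

/* Hypothetical: load a generated PTX/cubin image and launch one of its
 * entry points. All names here are placeholders. */
void run_generated_kernel(const char *ptx_image)
{
    CUdevice   dev;
    CUcontext  ctx;
    CUmodule   mod;
    CUfunction fn;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* Load the module straight from memory -- no temp file needed. */
    cuModuleLoadData(&mod, ptx_image);
    cuModuleGetFunction(&fn, mod, "crack_kernel");

    /* Old-style (CUDA 2.x era) launch interface. */
    cuFuncSetBlockShape(fn, 256, 1, 1);
    cuLaunchGrid(fn, 128, 1);

    cuCtxSynchronize();
    cuModuleUnload(mod);
    cuCtxDestroy(ctx);
}
```

This needs a CUDA-capable device and the driver, so treat it as a shape-of-the-code sketch rather than something to paste in verbatim.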


That’d be the way to go.

Bundling the compiler would be a licensing issue. NVIDIA isn’t too clear on that. The EULA doesn’t say anything directly, I think. NVIDIA people have said you’re allowed to redistribute cudart.dll (which is necessary to run anything). I don’t think they’d really have a problem with you redistributing the compiler, but if you want to do it 100% legally you’d need written permission from NVIDIA, so contact them directly.

Re: optimizing based on constants. This is something I’ve always wanted to see languages do. But it’d take extra facilities to do that from bytecode. For now the way to do it (i.e., unroll loops, mask out if statements) is by recompiling the C.

You could also contact the developer of ‘cudasm’ (a third-party PTX assembler) to see if he would license his code to you for use in your program.

You basically summed up why I am extremely interested in LLVM.

I’d briefly heard about LLVM. I didn’t realize it let you do this. Very cool.

You know, Apple has been doing some interesting fundamental work lately. It also happens to be behind OpenCL.

I am working on something similar to this as well…hopefully I’ll have something (at least a proof-of-concept) working within a couple of weeks.

LLVM itself is very interesting, but does it support CUDA? Or is that planned?

Doesn’t support CUDA (except for that silly project I did in school which should be forgotten about by everyone), but there’s nothing stopping anyone from working on it.

Since CUDA does not allow writing self-modifying code, I don’t believe things like LLVM can simply be ported to CUDA.

I am thinking about creating a medium-level (non-C++, no objects, but with optimization) language which would compile into PTX code, for further on-the-fly compilation into a cubin. Do you think it might be useful for everyone?

You talk big.

LLVM doesn’t need self-modifying code. You’re not suggesting that a CUDA kernel acts as an LLVM virtual machine and JITs itself? No, you’d do it from the host side, and that would be possible.