Hi
Maybe a silly question ;)
I’m working on a 64 KB intro, and I’m currently using CUDA to generate procedural textures (in real time);
the whole renderer is also in CUDA.
Everything works and the speed is OK, but the cubin ‘bincode’ section is huge
(after compressing everything with Crinkler, the kernels take 90 KB).
I cannot ship compressed source code in the final executable as I would for shaders; I need the compiled cubin.
Is there any way to reduce the size of the ‘bincode’ section?
My current compilation options for nvcc are: -cubin -use_fast_math -arch sm_13 -code sm_13
so the bincode section should only contain machine code for the GF 2xx series, right?
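To see which kernels account for most of that size, something like the rough Python sketch below could help. It assumes the cubin is the text format where each kernel sits in a `code { ... }` block containing a `name = ...` line and a `bincode { 0x... }` list of 32-bit words; the function name `bincode_sizes` is made up for illustration, and the numbers are sizes before Crinkler compression.

```python
import re

def bincode_sizes(cubin_text):
    """Return {kernel_name: machine-code bytes} for a text-format cubin.

    Assumption: every kernel is described by a 'code { ... }' block that
    holds a 'name = <kernel>' line and one 'bincode { 0x... }' word list.
    """
    sizes = {}
    # Split at each 'code {' opener; the text before the first one is the
    # file header (architecture, abiversion, modname) and is skipped.
    chunks = re.split(r"\bcode\s*\{", cubin_text)[1:]
    for chunk in chunks:
        name = re.search(r"name\s*=\s*(\S+)", chunk)
        body = re.search(r"bincode\s*\{(.*?)\}", chunk, re.DOTALL)
        if not name or not body:
            continue  # block without a kernel name or machine code
        # Each hex literal is one 32-bit word, i.e. 4 bytes of machine code.
        words = re.findall(r"0x[0-9a-fA-F]+", body.group(1))
        sizes[name.group(1)] = 4 * len(words)
    return sizes
```

Calling `bincode_sizes(open("kernels.cubin").read())` (the file name is only an example) and sorting the result by value would show which kernels are worth shrinking or merging first.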
I’m using the driver API, and the intro will only target the GF 2xx series (it’s real-time distance field rendering, which needs a looooooot of processing power to be really real time),
so I don’t care about other cards (right now) [and I could live with a one-executable-per-card situation :)].
Thanks for any suggestions.