Is there an easy way to estimate kernel size? I did “export keep=1” and got the *.ptx and *.cubin files. I can probably count the lines of code but I’m not sure that’s going to tell me as much as the size of the *.cubin file. If I don’t have debugging turned on, is it reasonably safe to estimate the kernel size as a direct function of the cubin file size?
The limit is 2 million PTX instructions. I want to know how big my kernel is getting as I add more device subroutines. Is there a way to estimate it from the number of lines of code, the cubin size in bytes, or something else?
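Since the 2-million limit is on PTX instructions rather than bytes, one rough approach is to count instruction lines directly in the `.ptx` file that `nvcc` leaves behind when you keep intermediate files. A minimal sketch (the filter and the demo `.ptx` fragment below are my own assumptions, not anything official; it skips blank lines, comments, directives, labels, braces, and parentheses, which leaves approximately one line per instruction):

```shell
# Rough PTX instruction count for a .ptx file kept by nvcc.
# Assumption: one instruction per remaining line after filtering out
# blank lines, // comments, .directives, labels, braces, and parens.
count_ptx_instructions() {
    grep -vE '^[[:space:]]*($|//|\.|\{|\}|\(|\)|[A-Za-z0-9_$]+:)' "$1" | wc -l
}

# Demo on a tiny hand-written PTX fragment (hypothetical kernel):
cat > /tmp/demo.ptx <<'EOF'
.visible .entry _Z3addPf(
    .param .u64 _Z3addPf_param_0
)
{
    .reg .f32 %f<3>;
    .reg .u64 %rd<3>;
    ld.param.u64 %rd1, [_Z3addPf_param_0];
    ld.global.f32 %f1, [%rd1];
    add.f32 %f2, %f1, %f1;
    st.global.f32 [%rd1], %f2;
    ret;
}
EOF
count_ptx_instructions /tmp/demo.ptx
```

This is only an estimate: predicated instructions, vector loads, and compiler-inserted code mean the count will not match ptxas exactly, but it should track growth as you add subroutines much better than the cubin byte size, which also contains headers, symbol tables, and constant data.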
For future reference, here is a data point: I have a kernel containing 995,368 lines of PTX assembly. After ~20 hours it compiled to a cubin of 9,022,304 bytes. The kernel loaded and ran; I got a segfault trying to read back the results, but that is another story.
Waiting over a weekend is OK to test a point, but I really need to chop things up for better turnaround! Still, 2 million PTX instructions is pretty damn big.
Just a personal experience: the CUDA Tools 3.0 Beta compiles much faster than 2.3 (under Ubuntu 9.10 x64, gcc 4.3). The same moderately complex kernel takes 10 minutes under CUDA 2.3 and only 20-30 s under CUDA 3.0. Also, under 2.3 the PTX compilation step ate every available byte of main memory (8 GB), which made the workstation unusable for the duration.
Has anyone else seen this problem??
On the other hand, both kernels worked as expected, though I never checked whether they compiled to the same size.