Occupancy estimation... at runtime?

Buanderie · May 12, 2009, 5:58pm

Hello…

Working on my own project,
I was wondering… Would it make any sense to code the occupancy estimation formula from the occupancy calculator spreadsheet (XLS) and use it to search for the best block size at runtime, before launching my kernel?
I’m currently using the occupancy calculator by hand, but this question just popped up.

kristleifur · May 12, 2009, 6:05pm

Hm, yeah, at first glance it sounds like a good idea. That particular formula probably isn’t the 100% best arbiter of block sizes, but something along those lines oughtta work.

Buanderie · May 12, 2009, 6:20pm

Ok… Then I’ll get to it now.

theMarix · May 13, 2009, 6:46am

The major problem with that is that you need the kernel’s register usage as an input. You usually only learn it at compile time, and you would then have to put it into your code as a constant. One could write it to a resource file generated during the build and read that file at runtime, but that’s kind of ugly. Is there any way to get the register requirements of a kernel at runtime?

tmurray · May 13, 2009, 6:54am

You can get register usage, etc. via API calls as of CUDA 2.2. I forget what they are, but it’s in the execution configuration section of the reference manual, I think.

theMarix · May 13, 2009, 7:23am

Thanks, that was fast. I have to admit, I only skimmed the guide for changes and obviously overlooked that one, though I have been waiting for it for quite a while. ;)

To anybody else looking for it: the information is in the structure cudaFuncAttributes, which you can retrieve via cudaFuncGetAttributes. It is explained in section 3.7, Execution Control.
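
For anyone who wants to try the runtime search discussed above, here is a rough host-side sketch. It is not the spreadsheet's exact formula: it ignores register and shared memory allocation granularity, so it can overestimate occupancy. The names DeviceLimits, KernelUsage, estimateOccupancy and bestBlockSize are made up for illustration. In a real program the per-kernel numbers would presumably be filled in from cudaFuncAttributes (numRegs, sharedSizeBytes) and the device limits from cudaDeviceProp.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical container for per-multiprocessor limits (not a CUDA type).
// Example values for a compute capability 1.3 device would be
// {32, 32, 8, 16384, 16384, 512}; treat those as an assumption to verify.
struct DeviceLimits {
    int warpSize;
    int maxWarpsPerSM;
    int maxBlocksPerSM;
    int regsPerSM;
    int sharedMemPerSM;      // bytes
    int maxThreadsPerBlock;
};

// Hypothetical container for kernel resource usage. In a CUDA program these
// would come from cudaFuncGetAttributes (numRegs, sharedSizeBytes).
struct KernelUsage {
    int regsPerThread;
    int sharedBytesPerBlock;
};

// Simplified occupancy estimate: active warps per SM / maximum warps per SM.
// Allocation granularity is ignored, so this is an upper bound at best.
double estimateOccupancy(const DeviceLimits& dev, const KernelUsage& k, int blockSize) {
    assert(dev.warpSize > 0 && dev.maxWarpsPerSM > 0);
    if (blockSize <= 0 || blockSize > dev.maxThreadsPerBlock) return 0.0;

    // Threads are scheduled in whole warps, so round up.
    int warpsPerBlock = (blockSize + dev.warpSize - 1) / dev.warpSize;

    // Number of resident blocks allowed by each resource.
    int byWarps = dev.maxWarpsPerSM / warpsPerBlock;
    int regsPerBlock = k.regsPerThread * warpsPerBlock * dev.warpSize;
    int byRegs = regsPerBlock > 0 ? dev.regsPerSM / regsPerBlock : dev.maxBlocksPerSM;
    int byShared = k.sharedBytesPerBlock > 0 ? dev.sharedMemPerSM / k.sharedBytesPerBlock
                                             : dev.maxBlocksPerSM;

    // The tightest limit decides how many blocks fit on one SM.
    int blocks = std::min({dev.maxBlocksPerSM, byWarps, byRegs, byShared});
    return static_cast<double>(blocks * warpsPerBlock) / dev.maxWarpsPerSM;
}

// Try every multiple of the warp size and keep the smallest block size
// that reaches the highest estimated occupancy.
int bestBlockSize(const DeviceLimits& dev, const KernelUsage& k) {
    int best = 0;
    double bestOcc = -1.0;
    for (int bs = dev.warpSize; bs <= dev.maxThreadsPerBlock; bs += dev.warpSize) {
        double occ = estimateOccupancy(dev, k, bs);
        if (occ > bestOcc) {
            bestOcc = occ;
            best = bs;
        }
    }
    return best;
}
```

Call bestBlockSize once per kernel before the first launch and cache the result. As noted earlier in the thread, occupancy alone probably isn't the best arbiter of block size, so it is worth timing the kernel with the top few candidates rather than trusting the estimate blindly.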