I found an article over at RealWorldTech which I have not yet seen in Nvidia's own listings. Among other things it compares the older 1.1 compute model to the newer 1.3 model, with pretty pictures and everything.
It lists as references quite a few prominent people from this forum, but I doubt the truthiness of this sentence:
Instructions? … Me believes that must be a typo, no?
I can’t find anything that agrees with that, but it does raise a question that’s been on my mind: where does the code reside? Constant memory seems as good a place as any, and it certainly works in a way that would suit code execution better than the other memory spaces.
As far as I ever understood from reading these forums, it resides in device memory like all other memory. There is an instruction cache in each SM. As far as I remember there is also a limit on the size of a kernel (number of instructions); I think that is mentioned in the programming guide as well.
Constant memory also resides in device memory, is limited to 64 KB, and there is a cache in each SM (8 KB per multiprocessor, according to the programming guide).
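To make the constant-memory point above concrete, here is a minimal sketch of how that 64 KB space is typically used from CUDA C. The names (`coeffs`, `scale`) and sizes are made up for illustration; the only facts assumed from the discussion are the 64 KB total limit and the per-SM constant cache.

```cuda
#include <cuda_runtime.h>

// __constant__ data is declared at file scope; the total constant
// space is limited to 64 KB. A small table like this one fits
// comfortably in the per-SM constant cache.
__constant__ float coeffs[256];   // hypothetical lookup table

__global__ void scale(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        // Reads of coeffs go through the per-SM constant cache;
        // they are fastest when all threads in a warp read the
        // same address.
        out[i] = in[i] * coeffs[i % 256];
}

int main()
{
    float host_coeffs[256];
    for (int i = 0; i < 256; ++i) host_coeffs[i] = 1.0f;

    // Host side: cudaMemcpyToSymbol uploads into the __constant__ symbol.
    cudaMemcpyToSymbol(coeffs, host_coeffs, sizeof(host_coeffs));

    // ... allocate in/out with cudaMalloc and launch scale<<<...>>>(...)
    return 0;
}
```

Whether the hardware also fetches *instructions* through this path is exactly the question being debated here; the sketch only shows the documented data-side use of constant memory.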
Oops. This is interesting, and it would explain why big “for” loops (like ones encompassing a whole big fat kernel) can cause latencies: the instructions might not be in the SM’s cache and so would result in cache misses… Just my guess based on the discussion going on here.