I'm doing some analysis for a paper I'm writing, and I've stumbled upon a question I've never asked myself.
Assuming non-coalesced memory reads (which is my case), are all memory fetches still AT LEAST 64 bytes on compute capability 1.1?
As per the programming guide:
So, I'm wondering.
I'm trying to figure out if I'm bandwidth limited… Well, I know that I am, but I'm trying to put it into words.
If I use 64 bytes as the smallest possible memory transaction, I actually get a “higher than theoretical” bandwidth figure, which of course means I've got something wrong.
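To put rough numbers on that comparison, here is a minimal Python sketch of the arithmetic. Only the 256-bit bus width comes from this post; the 900 MHz double-data-rate memory clock, the read count, and the elapsed time are assumptions made up for illustration, and the function names are hypothetical.

```python
def theoretical_bandwidth_gbs(bus_width_bits, mem_clock_mhz, transfers_per_clock=2):
    """Peak bandwidth in GB/s: bytes per transfer * transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = mem_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


def effective_bandwidth_gbs(num_reads, bytes_per_transaction, elapsed_s):
    """Bandwidth in GB/s if every read is charged as one full transaction."""
    return num_reads * bytes_per_transaction / elapsed_s / 1e9


# Assumption: 256-bit bus (from the post), 900 MHz DDR memory clock (not from the post).
peak = theoretical_bandwidth_gbs(256, 900)

# Made-up measurement: 16 Mi non-coalesced 4-byte reads taking 15 ms.
num_reads = 16 * 1024 * 1024
elapsed_s = 0.015

for size in (4, 32, 64):
    bw = effective_bandwidth_gbs(num_reads, size, elapsed_s)
    verdict = "exceeds peak" if bw > peak else "within peak"
    print(f"{size:2d} bytes/read: {bw:6.2f} GB/s ({verdict}, peak {peak:.1f} GB/s)")
```

With these made-up numbers, charging 64 bytes per read lands above the computed peak while 32 bytes per read stays below it, which is the kind of discrepancy described above. It does not say which transaction size the hardware actually uses.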
So, does anyone know?
Well, now I see that the memory bus on my 8800 GT is 256 bits wide. So why would 32-byte reads not be supported on 1.1?