Local Memory Per Thread?

Hello Everyone,

  1. Section G.1 of the programming guide 3.0 lists “local memory per thread” as 16 KB.

My question is: why is local memory specified per thread?

How can the amount of local memory depend on the number of threads?
The amount of local memory must be fixed, so it should be specified per SM or per SP.

  2. Register memory is listed as "Number of 32-bit registers per SM", which is 16 K.
    Since register memory is local to each SP, 2 K registers are available per SP.

So when a thread executes on an SP, it has 2 K registers available to it. Is that correct?

  3. Another question: is there one local memory per SM, or one local memory per SP, like the register file?

Thanks and Regards

Local memory is dynamically allocated in global memory (DRAM). It is not on-chip memory.
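One way to see why the limit is quoted per thread: each thread gets its own private slice of local memory in device DRAM, so the total grows with the number of resident threads. The sketch below is only back-of-the-envelope arithmetic and does not query any hardware. The 16 KB figure comes from the guide; the 30 SMs and 1024 resident threads per SM are assumed values for a Tesla C1060-class GPU, and the function name is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Worst-case DRAM footprint of local memory, in bytes, if every resident
// thread used its full per-thread limit. All inputs are supplied by the
// caller (assumed values); nothing here talks to a GPU.
std::uint64_t worst_case_local_bytes(std::uint64_t bytes_per_thread,
                                     std::uint64_t threads_per_sm,
                                     std::uint64_t sm_count) {
    assert(bytes_per_thread > 0 && threads_per_sm > 0 && sm_count > 0);
    return bytes_per_thread * threads_per_sm * sm_count;
}
```

With the assumed values, worst_case_local_bytes(16 * 1024, 1024, 30) gives 503316480 bytes, i.e. 480 MB, which is far more than on-chip storage could plausibly hold.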

Registers are not local to each SP. Each active thread on an SM is statically allocated the registers it requires. (nvcc --ptxas-options=-v will show you how many registers per thread your kernel uses.)

Since you should be running far more threads per block than there are SPs, the register storage available to each thread is much less than 2K registers.
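The register arithmetic can be sketched as follows. It assumes the 16K registers per SM quoted in the question are divided evenly among all threads resident on the SM, and it ignores allocation granularity and any per-thread cap the hardware imposes. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on the 32-bit registers each thread can get when an SM's
// register file is divided evenly among the threads resident on it.
// Simplified model: real hardware allocates in coarser units.
std::uint32_t max_registers_per_thread(std::uint32_t registers_per_sm,
                                       std::uint32_t resident_threads) {
    assert(resident_threads > 0);
    return registers_per_sm / resident_threads;  // integer division rounds down
}
```

For a 16384-register SM, 256 resident threads leave at most 64 registers per thread, 512 leave 32, and 1024 leave 16, all far below the 2K figure.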

I understand now!

Actually, I was confused by Figure 4.2 in the programming guide version 2.3.1, which shows one register file per SP.

That figure has been removed from version 3.0. A new figure in the white paper also depicts what you describe.

Thanks

Thanks Gregory!

I am just trying to understand this.

Do you mean that there is no separate “local memory” DRAM?

What about texture memory and constant memory? I know that they are in DRAM.

But are they in separate DRAMs? Or is there just one 4 GB DRAM (on, say, a Tesla C1060) that implements global, local, texture and constant memory,

such that, at any given time, global + local + texture + constant memory = 4 GB?

Yes

No. At a minimum there is also instruction memory and (if the card is used for video) the video buffer. Memory might be used for other purposes as well (uploadable firmware, etc.). Only NVIDIA knows, I guess.