Simple memory questions


I have read the Programming Guide and parts of the Best Practices Guide, but I still have some questions about CUDA memory management:

  1. What is the maximum size of texture memory? It's only limited by the size of totalGlobalMem (DRAM), isn't it?

  2. Is the texture cache size 8 KB per multiprocessor?

  3. Is the existence of texture cache data bound not to the lifetime of a block, but to the execution time of a kernel?

I hope you can help me.

Thanks in advance!

There should be no limit, since it's only a logical binding. However, I found that beyond 1.5-2 GB I get faulty results.

I opened a bug report with NVIDIA, still pending (582591): Problem accessing textures bound to huge arrays (>2GB)

Yes, as far as I remember.

It's there as long as your context lives and you don't call cudaUnbindTexture on that texture.

You can copy new values into the bound device memory (if you want) and the texture will reflect those new values.

Hope that helped…


Thank you very much! Yes, that helps a lot! I need a bit more help, though.

  1. I don't understand the meaning of cudaDeviceProp.textureAlignment. I can't find any information about it in the guides. In my case it's 256.

Some questions about shared memory:
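For what it's worth, the CUDA runtime documentation describes textureAlignment as an alignment requirement for texture base addresses: a base address that is a multiple of that many bytes needs no offset applied to texture fetches. Below is a minimal host-side C++ sketch of that check, with no CUDA calls involved. The function name `offset_past_alignment` is hypothetical, and 256 is simply the value reported above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the CUDA API): returns how many bytes
// `byte_address` lies past the previous multiple of `alignment`.
// A result of 0 means the address is aligned.
std::size_t offset_past_alignment(std::size_t byte_address, std::size_t alignment) {
    // Assumption: the alignment is a nonzero power of two (e.g. 256).
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return byte_address & (alignment - 1);
}
```

With an alignment of 256, an address such as 512 gives 0 (aligned), while 784 gives 16, meaning the data starts 16 bytes past an aligned boundary.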

  1. How can I calculate the bank size? Is it always 32 bit? I would like to calculate it in this way:

__shared__ float array[32]; --> bank size: 32/16 float = 2 float --> 64 bit (16 = number of banks)


__shared__ char array[32]; --> bank size: 32/16 char = 2 char --> 16 bit (16 = number of banks)

  2. There is (in the best case) one shared memory request per half-warp. The Programming Guide says:

"As a consequence, there can be no bank conflict between a thread belonging to the first half of a warp and a thread belonging to the second half of the same warp."

Does that mean that requests from different warps are always serialized?

Thanks in advance again!

OK, I think the bank size is always 32 bit. But the number of banks is constant, too: it's 16! How can a shared memory array like this exist?

__shared__ float array[32];

float --> 32 bit --> the array would need 32 banks, right?

Thanks in advance!
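As the Programming Guide describes it, successive 32-bit words are assigned to successive banks, and the assignment wraps around after the last bank. An array can therefore be larger than the number of banks: with 16 banks, array[16] simply lands in the same bank as array[0]. Below is a minimal host-side C++ sketch of that mapping, not actual device code. The name `bank_of` is hypothetical, the constants are the values discussed in this thread (16 banks, 32 bits wide), and the array is assumed to start at byte offset 0 of shared memory.

```cpp
#include <cassert>
#include <cstddef>

// Values discussed in this thread: 16 banks, each 32 bits (4 bytes) wide.
constexpr std::size_t kNumBanks = 16;
constexpr std::size_t kBankWidthBytes = 4;

// Hypothetical helper: bank holding element `index` of an array whose
// elements are `elem_size` bytes each.
// Assumption: the array starts at byte offset 0 of shared memory.
std::size_t bank_of(std::size_t index, std::size_t elem_size) {
    assert(elem_size > 0);
    std::size_t byte_offset = index * elem_size;
    // Consecutive 32-bit words go to consecutive banks, wrapping after 16.
    return (byte_offset / kBankWidthBytes) % kNumBanks;
}
```

For a float array (4-byte elements), indices 0 to 15 map to banks 0 to 15 and indices 16 to 31 map to banks 0 to 15 again. For a char array (1-byte elements), four consecutive elements share one bank, so a 32-element char array touches only banks 0 to 7.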