limits on number of textures?

Hi,

Are there limits on:

(1) The total number of texture references my program can have? I mean “texture<>” declarations, defined at file scope.

(2) The number of texture references a given kernel can use?

I seem to be encountering a very low limit (12 texture references total) on a GTX 285. Surprisingly, there is very little information about this in these forums or on the web in general, and what’s out there is inconsistent. I’ve heard everything from 4 to 32 to 512 to no limit.

Thanks in advance,
Jim

The hardware limit is 128 texture references per kernel. You should certainly be able to use more than 12; can you post some example code?
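To make the question concrete, here is a minimal sketch of what multiple file-scope texture references look like with the legacy texture reference API (the identifiers `texA`, `texB`, and the kernel name are illustrative, not from the original post):

```cuda
// Two file-scope texture references (legacy CUDA texture reference API).
// The hardware limit discussed above is 128 such references per kernel.
texture<float, 1, cudaReadModeElementType> texA;
texture<float, 1, cudaReadModeElementType> texB;

__global__ void addFromTextures(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex1Dfetch(texA, i) + tex1Dfetch(texB, i);
}

// Host side: bind linear device memory to each reference before launch, e.g.
//   cudaBindTexture(0, texA, d_a, n * sizeof(float));
//   cudaBindTexture(0, texB, d_b, n * sizeof(float));
//   addFromTextures<<<blocks, threads>>>(d_out, n);
```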

Simon,
Are there any plans to support dynamically binding textures to kernels? Currently the texture declaration always has to be at file scope.

That would certainly make life easier, but without introducing a device-side linker of some sort I'm not sure how feasible it would be.

Hmm… how about loading a CUBIN via the driver API? Are textures given a place in the CUBIN? I'm just so confused about the whole thing. And what about the multi-GPU case?

I think one of the key problems at the moment is the lack of function pointer/subroutine support in kernels. Texture access winds up being translated into inline assembly during compilation, so everything needed to make a texture fetch happen must be available to the compiler in the same compilation object. If it weren't inlined, it might be possible to leave a dangling symbol and have the driver match everything up JIT at runtime, much the way a modern shared-library runtime linker works.

That would have side effects, though: program launch times could be much longer than they are now, especially for complex applications, and you would have a new failure mode where a CUDA app that compiles without error refuses to run, returning a pile of symbol or object errors. In many ways that is a harder and more complex set of problems to debug than what we have today. It would also add functionality, complexity, and overhead to the driver, which is already a large and complex piece of code.

With the arrival of Fermi, it will be interesting to see how the tool chain develops, but as it is now I don’t see how it could be done.

Current hardware can’t dynamically index into an array of texture references (samplers). Note that it is possible to bind different arrays (textures) to your texture references at kernel launch time.
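A sketch of that launch-time rebinding, using the runtime API's `cudaBindTextureToArray` (the kernel and variable names here are illustrative):

```cuda
// One file-scope texture reference, rebound to different cudaArrays
// between kernel launches. The reference itself is fixed at compile
// time; only the backing array changes.
texture<float, 2, cudaReadModeElementType> texImg;

__global__ void process(float *out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        out[y * w + x] = tex2D(texImg, x + 0.5f, y + 0.5f);
}

void runOnBoth(cudaArray *imgA, cudaArray *imgB, float *out, int w, int h)
{
    dim3 block(16, 16);
    dim3 grid((w + 15) / 16, (h + 15) / 16);

    cudaBindTextureToArray(texImg, imgA);
    process<<<grid, block>>>(out, w, h);

    cudaBindTextureToArray(texImg, imgB);   // rebind; no recompilation needed
    process<<<grid, block>>>(out, w, h);
}
```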

We are thinking about adding support for texture arrays, which let you dynamically index into an array of identically-sized images. Note that you can already do this today using 3D textures in CUDA, but there are size restrictions.

http://developer.download.nvidia.com/openg…xture_array.txt
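The 3D-texture workaround mentioned above can be sketched like this: treat the z dimension of a 3D texture as a slice index, so each thread can pick a slice dynamically. Names and the slice-selection logic are illustrative assumptions, not from the original posts:

```cuda
// Emulating a texture array with a 3D texture: the z coordinate selects
// one of nSlices identically-sized 2D images, chosen per thread at runtime.
texture<float, 3, cudaReadModeElementType> texStack;

__global__ void sampleSlices(float *out, int w, int h, int nSlices)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int slice = blockIdx.z;          // dynamic slice index, e.g. one per z-block
    if (x < w && y < h && slice < nSlices)
        out[(slice * h + y) * w + x] =
            tex3D(texStack, x + 0.5f, y + 0.5f, slice + 0.5f);
}

// Host side: allocate a 3D cudaArray of extent (w, h, nSlices), copy the
// stacked images into it with cudaMemcpy3D, then
//   cudaBindTextureToArray(texStack, stackArray);
```

The size restrictions Simon mentions come from the 3D texture extent limits, which are smaller per dimension than the 2D limits.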

I can’t say anything about Fermi graphics features, but DirectX 11 does include the ability to index into arrays of samplers.

Simon,

Thanks for the good explanation.

At this point, can you tell me something about the portability of CUBIN code that contains kernels with texture references?

Say I create a library of CUBINs and use them with different CUDA programs. How do I go about handling the texture references? Any input?

Can you say that Fermi is a DX11 card?

No. Fermi is the code-name for our next-generation CUDA architecture. We haven’t announced any products based on it.

Sorry to be so evasive!

No problem ;)

Although the Fermi fun fact of the week is that the first graphics card based on Fermi is codenamed GF100 (http://twitter.com/NVIDIAGeForce)