Multi-GPU and texture references


I have code that uses some texture references. It will be executed on up to 4 GPUs, but the runs are staggered in time, so the GPUs are asynchronous to each other.

My problem is that every thread uses the same texture references, but texture references have to be global and cannot be declared as arrays.

I saw there are functions for creating a new context. A context is, according to the programming manual, like a process on the CPU, with its own distinct 32-bit address space.

But how does this help? I output the address of my

but the address is always the same in all of the threads.

Am I doing something wrong? Or have I misunderstood how to use a context correctly?

Additionally: can the CUDA FFT library be used from multiple threads?

Currently I’m using a rather complicated solution that forks the main application and synchronizes the processes via Linux shared memory.

Any hints?



Just to let you know that I found a solution I can live with …

First of all, I got rid of my multi-process implementation and moved to a multi-threaded one. The underlying issue (texture references must be global and cannot be arrays) couldn’t be solved, because the CUDA programming manual says that modules and texture references don’t have distinct address spaces in different contexts.

What works quite well is to guard every function that uses texture references with a POSIX mutex. Inside the lock, cudaBindTexture and cudaUnbindTexture are called, so a texture reference can only be used by one thread at a time and there are no conflicts.
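
For anyone who wants to see the shape of that locking pattern, here is a minimal host-side sketch. It is not my actual code: bind_texture, unbind_texture and sum_through_texture are hypothetical stand-ins for cudaBindTexture, cudaUnbindTexture and a kernel launch, so it compiles without the CUDA toolkit, and it uses std::mutex and std::thread in place of the POSIX calls. The point is only that the whole bind, use, unbind sequence sits inside one lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for the single global texture reference: it only records
// which buffer is currently bound (nullptr means unbound).
static const float* g_bound = nullptr;
static std::mutex g_tex_mutex;  // guards g_bound

// Hypothetical stand-ins for cudaBindTexture / cudaUnbindTexture.
static void bind_texture(const float* data) { g_bound = data; }
static void unbind_texture() { g_bound = nullptr; }

// Hypothetical stand-in for a kernel that reads through the texture.
static float sum_through_texture(std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += g_bound[i];
    return s;
}

// The whole bind -> use -> unbind sequence runs under the lock,
// so only one thread at a time touches the shared reference.
float locked_texture_sum(const std::vector<float>& data) {
    std::lock_guard<std::mutex> lock(g_tex_mutex);
    bind_texture(data.data());
    float s = sum_through_texture(data.size());
    unbind_texture();
    return s;
}

// Starts one worker thread per input and collects each worker's result.
std::vector<float> run_workers(const std::vector<std::vector<float>>& inputs) {
    std::vector<float> results(inputs.size());
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        threads.emplace_back([&, i] { results[i] = locked_texture_sum(inputs[i]); });
    for (auto& t : threads) t.join();
    return results;
}
```

The obvious cost is that the texture-using sections of all threads are serialized, so the lock should cover as little work as possible beyond the bind, the kernel and the unbind.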

That’s all … sorry if this is so unspectacular :-)


Interesting… what problems did you have using textures on multiple GPUs?

I use texture references across multiple GPUs now without thread synchronization around texture binds and unbinds. Each thread calls cudaSetDevice, allocates its own cudaArray, and binds the array to the texture reference. Each thread then calls kernels, runs other CPU processing, and finally unbinds the texture and exits.

I haven’t noticed any problems with this approach, but I’m still in the process of testing the numerical output of my CUDA implementation against our existing code.

Actually, I never had any problems because I assumed it clearly wouldn’t work, so I never tried it at all :-)

Maybe I’m wrong … but how is it possible to bind different arrays to the same texture reference?

Moreover, the manual says: “Besides objects such as modules and texture references, each context has its own distinct 32-bit address space.”

Can someone confirm or correct me, please?


I’ve tested this, and binding different arrays to the same texture reference from different host threads, one per GPU, works correctly. The texture reference is host-side data, and I think it is more like an ID (like a texture ID in OpenGL) that is used within a context: although all contexts use the same ID, it can refer to a different array in each context.
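
To make that "just an ID" idea concrete, here is a small toy model in plain C++. It is only a mental model under my assumption above, not how the driver is actually implemented: TexRef, Context, bind, unbind and fetch are all hypothetical names, and ordinary std::vector buffers stand in for cudaArrays. The shared handle is the same everywhere, but each context keeps its own table of what that handle is bound to.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical model: the texture reference is one shared host-side handle.
struct TexRef { int id; };
static const TexRef g_tex = {7};

// Hypothetical model of a context: it owns a private table that maps a
// handle ID to whatever array is currently bound in this context.
struct Context {
    std::map<int, const std::vector<float>*> bindings;

    void bind(const TexRef& t, const std::vector<float>& a) { bindings[t.id] = &a; }
    void unbind(const TexRef& t) { bindings.erase(t.id); }

    // Looks up the array bound to the handle in THIS context only.
    float fetch(const TexRef& t, std::size_t i) const {
        return bindings.at(t.id)->at(i);
    }
};

// Binds two different arrays to the same handle in two contexts and
// returns element 1 as seen from the requested context (0 or 1).
float fetch_after_binding_two_contexts(int which) {
    std::vector<float> a0 = {1.0f, 2.0f};
    std::vector<float> a1 = {10.0f, 20.0f};
    Context c0, c1;
    c0.bind(g_tex, a0);
    c1.bind(g_tex, a1);
    return which == 0 ? c0.fetch(g_tex, 1) : c1.fetch(g_tex, 1);
}
```

In this model the two binds do not overwrite each other because the table lives in the context, not in the handle, which would explain why no mutex is needed when every thread has its own context.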