All right, so I managed to figure out a way to mostly abstract texture fetches, and I know it works because I've tested it pretty extensively.
The method revolves around device function pointers, cudaMemcpyFromSymbol, and a set of texture references and their fetch functions declared in a header file.
The top level of the code where the texture references and the fetch functions are declared looks like this:
int nextTex = 0; // host-side counter: index of the next unclaimed texture reference
texture<float,cudaTextureType2D,cudaReadModeElementType> texref0;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref1;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref2;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref3;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref4;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref5;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref6;
typedef float (*pt2Func)(float,float); // common signature shared by all of the fetch wrappers below
// One wrapper per texture reference: tex2D() needs the reference named at compile time,
// so each reference gets its own trivially different fetch function.
__device__
float fetchtexref0(float x,float y)
{
return tex2D(texref0,x,y);
}
__device__
float fetchtexref1(float x,float y)
{
return tex2D(texref1,x,y);
}
__device__
float fetchtexref2(float x,float y)
{
return tex2D(texref2,x,y);
}
__device__
float fetchtexref3(float x,float y)
{
return tex2D(texref3,x,y);
}
__device__
float fetchtexref4(float x,float y)
{
return tex2D(texref4,x,y);
}
__device__
float fetchtexref5(float x,float y)
{
return tex2D(texref5,x,y);
}
__device__
float fetchtexref6(float x,float y)
{
return tex2D(texref6,x,y);
}
// Device-side pointers to the wrappers; the host grabs the right one by name with cudaMemcpyFromSymbol
__device__ pt2Func fetchPointer0 = &fetchtexref0;
__device__ pt2Func fetchPointer1 = &fetchtexref1;
__device__ pt2Func fetchPointer2 = &fetchtexref2;
__device__ pt2Func fetchPointer3 = &fetchtexref3;
__device__ pt2Func fetchPointer4 = &fetchtexref4;
__device__ pt2Func fetchPointer5 = &fetchtexref5;
__device__ pt2Func fetchPointer6 = &fetchtexref6;
A ‘grid’ object manages accesses to a particular texture reference simply by storing the matching function pointer as one of its members; a texture fetch then just becomes a call through that pointer. The naming convention and the copy to/from symbol make setting this up pretty easy. Every time a new ‘grid’ object is allocated, it uses the ‘nextTex’ global variable to figure out the number of the next available texture reference. Once it binds that texture to its own cudaArray, it increments the ‘nextTex’ counter so that the next ‘grid’ object takes the next reference in line. It's a bit clunky, but a lot easier to toss around than trying to manage the 50+ texture references I'm using in my main code.
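For reference, the part of the ‘grid’ object that matters here boils down to something like this (a stripped-down sketch; the member names are placeholders for what is actually a much bigger class in my code):
class cudaMatrixf; // front end for 3D device arrays, defined elsewhere in my code
class grid
{
public:
pt2Func pt2Function; // copied out of the matching fetchPointerN by fill2D
char* symbol; // name of the texture reference this grid claimed, e.g. "texref3"
cudaArray* cuArray; // array that the texture reference gets bound to
int griddims[2]; // nx, ny
__host__ void fill2D(cudaMatrixf data_in); // shown below
// device-side access: just call through the stored pointer
__device__ float fetch(float x,float y){return pt2Function(x,y);}
};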
This is how a ‘grid’ object is assigned a texture reference and how it binds its data to that reference:
__host__
void fill2D(cudaMatrixf data_in) // cudaMatrixf is just a front end for 3D device arrays
{
int nx = 1;
int ny = 1;
char* texrefstring = (char*)malloc(sizeof(char)*25);
char* texfetchstring = (char*)malloc(sizeof(char)*25);
int itemp = nextTex; // claim the next free texture reference and advance the counter
nextTex++;
sprintf(texrefstring,"texref%i",itemp);
printf("%s \n",texrefstring); // This line was just for debugging
sprintf(texfetchstring,"fetchPointer%i",itemp);
printf("%s \n",texfetchstring); // This line was just for debugging
symbol = texrefstring; // 'symbol' is a char* member of the grid object, used below to look up the texture reference
CUDA_SAFE_CALL(cudaMemcpyFromSymbol(&pt2Function,texfetchstring,sizeof(pt2Func))); // copy the device function pointer into the 'pt2Function' member
nx = griddims[0];
ny = griddims[1];
cudaError status;
cudaExtent extent = make_cudaExtent(nx,ny,0); // depth 0 allocates a 2D cudaArray
cudaMemcpy3DParms params = {0};
params.kind = cudaMemcpyDeviceToDevice;
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
status = cudaMalloc3DArray(&cuArray,&desc,extent);
params.srcPtr = data_in.getptr();
params.dstArray = cuArray;
params.extent = make_cudaExtent(nx,ny,1);
cudaDeviceSynchronize();
CUDA_SAFE_CALL(cudaMemcpy3D(&params));
cudaDeviceSynchronize();
const textureReference* texRefPtr;
cudaGetTextureReference(&texRefPtr, symbol);
printf("%s \n",symbol);
cudaChannelFormatDesc channelDesc;
cudaGetChannelDesc(&channelDesc, cuArray);
CUDA_SAFE_CALL(cudaBindTextureToArray(texRefPtr, cuArray, &channelDesc));
free(texfetchstring); // done with this one; texrefstring stays alive because 'symbol' points to it
}
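On the device side the access then looks something like this (again a simplified sketch, not my actual kernel):
// Toy kernel: sample a single grid at each thread's coordinates.
// 'g' has already been through fill2D, so g.pt2Function points at whichever
// fetchtexrefN that grid claimed.
__global__ void sample_grid(grid g, float* out, int nx, int ny)
{
int i = blockIdx.x*blockDim.x + threadIdx.x;
int j = blockIdx.y*blockDim.y + threadIdx.y;
if(i < nx && j < ny)
{
out[j*nx+i] = g.pt2Function(i + 0.5f, j + 0.5f); // +0.5f to hit texel centers
}
}
Keep in mind that the whole approach relies on device-side function pointers, so it needs compute capability 2.0 or higher.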
I’ve checked this code pretty extensively and it works fine when the declaration, assignment, and access all take place within the same header file chain, but I’m not sure what will happen when I assign in one file, and then access the data in a different linked file.
Now, on to my main question: if I set up my ‘grids’ in one .cu file, setup_fields.cu, store them all in an external variable, and then launch kernels from a separate .cu file, simpleNubeam.cu, that do a bunch of work using the ‘grids’ set up in setup_fields, will the function pointers still point to my original texture fetch functions? Both .cu files include the same header files, and therefore both contain the texture reference declarations. Will the compiler generate two sets of texture references, one for each .cu file, so that my function pointers end up pointing at fetches of the wrong textures, or will it all be okay? Note that both of these .cu files are being combined into a single library.
I'm thinking that I need to declare all of the texture references as ‘extern’, and possibly all of the fetch functions as well. I'm just not entirely sure how the compiler will handle the references being declared in two separate translation units that then get combined into a single library.
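Just to be concrete about what I mean, the header would end up looking roughly like this, with the actual definitions living in exactly one .cu file. This is completely untested; whether nvcc even accepts ‘extern’ on a texture reference is exactly the part I'm unsure about, and the extern __device__ pointers would presumably also need relocatable device code if the toolkit supports it.
// textures.h -- declarations only (UNTESTED)
extern texture<float,cudaTextureType2D,cudaReadModeElementType> texref0;
extern texture<float,cudaTextureType2D,cudaReadModeElementType> texref1;
// ...and so on through texref6...
extern __device__ pt2Func fetchPointer0;
extern __device__ pt2Func fetchPointer1;
// setup_fields.cu -- the one and only set of definitions
texture<float,cudaTextureType2D,cudaReadModeElementType> texref0;
texture<float,cudaTextureType2D,cudaReadModeElementType> texref1;
__device__ pt2Func fetchPointer0 = &fetchtexref0;
__device__ pt2Func fetchPointer1 = &fetchtexref1;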
If anyone can straight up tell me what will happen, that would be great. If not, I'm going to look at the PTX output and try to determine whether each .cu file gets its own copies of the texture fetch functions or whether they share one set. I'll also just check what happens when I access the textures from my second, linked file.
Either way, I'll post my results here. If you'd like a copy of the full source code, just let me know and I'll upload it.