I'm trying to write a templated demosaic resampler for all four RGGB configurations (format) that can use different resampling methods (class O, op()). The front matter is below. On the last line shown (tile[0][shridx]=…; there's a lot more code that isn't included), the compiler tells me "Can't tell what pointer points to, assuming global memory space" (which is wrong). I would have thought that line 22 should make it pretty clear. Later it gives me the same warning when I try to use the other pointers.
I've gotten this warning before from bad syntax, but I don't understand the issue here. I know I can use pointers to shared memory (I've done it before), and I know I can use local fixed-size arrays. Can I not do both at the same time?
template<class O>
__global__ void demosaic( unsigned int *RGBA, const size_t RGBApitch, const float *src, const size_t src_pitch,
float *Y, const size_t Ypitch, ushort2 *UV, const size_t UVpitch,
const uint2 imgsize, const unsigned int format, float normscale, O op )
{
// Shared memory for color planes
extern __shared__ float shrdata[];
// Number of horizontal color samples/tile
unsigned int shrwidth = blockDim.x+4;
const int ix = UMAD(blockIdx.x,blockDim.x,threadIdx.x);
int iy = INT_DOUBLE(UMAD(blockIdx.y,blockDim.y,threadIdx.y));
float2 idata, *f2ptr;
float *tile[4];
// Offset into each shared memory color plane
unsigned int temp = threadIdx.x+2;
unsigned int shridx = UMAD(2+threadIdx.y,shrwidth,temp);
// Stride (in samples) per color plane
unsigned int blocksize = UMUL(shrwidth,blockDim.y+4);
// Pointers to the four color planes (UL, UR, LL, LR)
tile[0] = &shrdata[0];
tile[1] = &tile[0][blocksize];
tile[2] = &tile[1][blocksize];
tile[3] = &tile[2][blocksize];
//////////////////////////////////
// Load pairs of 'even' row pixels
//////////////////////////////////
// Get pointer to input image
f2ptr=(float2*)((char*)src + UMUL(iy,src_pitch));
// Are we in the valid range?
if (ix<imgsize.x) {
// Load middle pixel values in pairs
idata = f2ptr[ix];
// Save UL pixel to shared memory
tile[0][shridx]=idata.x;
I also tried
tile[0] = shrdata;
tile[1] = &shrdata[blocksize];
to be more explicit, but this didn't change anything.
I Googled 'array of pointers shared memory cuda', but kept getting links about partitioning a dynamically sized shared memory allocation into different sections. That is obviously what I'm trying to do here (the sections aren't even different types), but I can't seem to make an array of pointers to these color planes within shared memory.
If I make these four separate local variables (not an array), the compiler seems happy…
float *ULtile, *URtile, *LLtile, *LRtile;
...
ULtile = shrdata;
URtile = &ULtile[blocksize];
...
ULtile[shridx]=idata.x;
…but I don't understand why I should have to do this.
The idea is to use a variable as the first index to select the correct pointer. With the four separate variables just described, I'd instead have to use an if/then block to choose the pointer.
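One possible way to avoid both the pointer array and the if/then block is to keep a plane index rather than a pointer and compute the offset into shrdata directly, so that every access visibly goes through the shared array. The warning seems to appear once the pointers are stored in an array, where the compiler can apparently no longer tell which memory space they refer to. What follows is only a minimal sketch under stated assumptions: plane_offset, planes_demo and run_planes_demo are hypothetical names that are not part of the kernel above, plain arithmetic stands in for the UMUL/UMAD macros, and the plane size is assumed to be (blockDim.x+4)*(blockDim.y+4) samples.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: offset of sample `idx` inside color plane `plane`,
// assuming every plane holds `planesize` floats laid out back to back.
__host__ __device__ inline unsigned int plane_offset(unsigned int plane,
                                                     unsigned int planesize,
                                                     unsigned int idx)
{
    return plane * planesize + idx;
}

// Hypothetical demo kernel: each thread writes one value into each of the
// four planes and reads it back, indexing shrdata directly (no pointer array).
// `out` must hold 4 * blockDim.x * blockDim.y floats.
__global__ void planes_demo(float *out)
{
    extern __shared__ float shrdata[];

    const unsigned int shrwidth  = blockDim.x + 4;               // samples per tile row
    const unsigned int planesize = shrwidth * (blockDim.y + 4);  // assumed plane stride
    const unsigned int shridx    = (threadIdx.y + 2) * shrwidth + threadIdx.x + 2;
    const unsigned int tid       = threadIdx.y * blockDim.x + threadIdx.x;
    const unsigned int nthreads  = blockDim.x * blockDim.y;

    // The plane is chosen by a runtime index instead of an if/then block.
    for (unsigned int plane = 0; plane < 4; ++plane)
        shrdata[plane_offset(plane, planesize, shridx)] = 10.0f * plane + 1.0f;

    // Each thread only reads back what it wrote itself, so no barrier is needed here.
    for (unsigned int plane = 0; plane < 4; ++plane)
        out[plane * nthreads + tid] = shrdata[plane_offset(plane, planesize, shridx)];
}

// Hypothetical host-side check: launches one 8x4 block and verifies the values.
bool run_planes_demo()
{
    const dim3 block(8, 4);
    const unsigned int nthreads = block.x * block.y;
    const size_t shrbytes = 4 * (block.x + 4) * (block.y + 4) * sizeof(float);

    float h_out[4 * 8 * 4];
    float *d_out = nullptr;
    if (cudaMalloc((void **)&d_out, sizeof(h_out)) != cudaSuccess)
        return false;

    planes_demo<<<1, block, shrbytes>>>(d_out);
    const cudaError_t err =
        cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess)
        return false;

    for (unsigned int plane = 0; plane < 4; ++plane)
        for (unsigned int t = 0; t < nthreads; ++t)
            if (h_out[plane * nthreads + t] != 10.0f * plane + 1.0f)
                return false;
    return true;
}
```

In the demosaic kernel this would turn tile[p][shridx] into shrdata[plane_offset(p, blocksize, shridx)], with p being whatever variable selects the plane. Whether this actually silences the warning on a given toolchain is something I have not verified.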