I’m trying to copy the data from an OpenCV IplImage that stores unsigned 8-bit characters. The imageData is a 1D array of characters that I pass into a resize function in CUDA.
I have a setup where I convert the data directly into a 1D float array and copy that float array to the GPU, and this works perfectly. The problem is that I’m working with extremely large pictures, and the CPU runs out of memory if I expand the character array into a float array before copying to the GPU. But if I copy the character array directly to the GPU, the resulting image is skewed, with its right edge running diagonally through the image. I figure I’m missing something small, because I’m so close to having it work correctly and getting rid of this size limitation.
The resize kernel is pretty much the sample image resize code :P The actual code is shown below. The cols and rows passed to the kernel are the dimensions of the output image (r_width and r_height in the launch further down), since they are used to index g_odata, the float array on the GPU where I want to store the resulting small image.
The resulting skewed images look like this. What the hell is wrong? Beats me…
[attachment=5639:attachment]
texture<char, 2, cudaReadModeNormalizedFloat> tex;

__global__ void
resizeKernel(float* g_odata, int cols, int rows)
{
    // Uses CUDA's built-in texture interpolation. tex.normalized must be
    // true, because u and v below are normalized coordinates.
    unsigned int x = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y * blockDim.y + threadIdx.y;

    // The grid is rounded up with iDivUp, so skip threads outside the output.
    if (x >= (unsigned int)cols || y >= (unsigned int)rows) return;

    float u = (float)x / (float)cols;
    float v = (float)y / (float)rows;
    g_odata[y * cols + x] = tex2D(tex, u, v);
}
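A side note on the coordinate math in the kernel: CUDA's linear filtering subtracts 0.5 from the unnormalized coordinate before interpolating, so sampling at x / cols lands half a texel away from sampling at (x + 0.5) / cols. The host-side sketch below only illustrates that arithmetic. `sourcePosition` and its parameters are hypothetical names, and this is not claimed to explain the skew.

```cpp
// Returns the source texel position that linear filtering interpolates around
// for output pixel x, given srcWidth source texels and outWidth output pixels.
// centered == false mirrors u = x / cols from the kernel;
// centered == true uses u = (x + 0.5) / cols instead.
float sourcePosition(int x, int outWidth, int srcWidth, bool centered)
{
    float u = (centered ? x + 0.5f : static_cast<float>(x)) / outWidth;  // normalized coordinate
    float unnormalized = u * srcWidth;                                    // scale to texel units
    return unnormalized - 0.5f;                                           // offset applied by linear filtering
}
```

For an 8-texel row resized to 4 pixels, output pixel 0 maps to position -0.5 with the first convention and 0.5 with the second.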
The channel descriptor is set up in either of these two ways, with no change in the output:
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(8, 0, 0, 0, cudaChannelFormatKindFloat);
or
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<unsigned char>();
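For reference, with cudaReadModeNormalizedFloat an unsigned 8-bit texel comes back as a float in [0.0, 1.0] (a signed 8-bit type maps to [-1.0, 1.0] instead), so the values written to g_odata are not on the 0 to 255 scale of the original data. A small host-side sketch of that conversion, with hypothetical helper names:

```cpp
// What the texture unit returns for an unsigned 8-bit texel in
// normalized-float read mode: 0 maps to 0.0 and 255 maps to 1.0.
float normalizedFromByte(unsigned char v)
{
    return v / 255.0f;
}

// Maps a normalized result back to 0..255, rounding to nearest and clamping.
unsigned char byteFromNormalized(float f)
{
    if (f <= 0.0f) return 0;
    if (f >= 1.0f) return 255;
    return static_cast<unsigned char>(f * 255.0f + 0.5f);
}
```

If the float-array path fed 0 to 255 values to the rest of the pipeline, the kernel output would need this kind of rescaling to match.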
In the .cu file I set the texture up as follows. Changing the address mode between clamp and wrap does not change the resized output image, and neither does the filter mode. Normalized needs to be true because of the kernel setup.
tex.addressMode[0] = cudaAddressModeWrap;
tex.addressMode[1] = cudaAddressModeWrap;
tex.filterMode = cudaFilterModeLinear;
tex.normalized = 1;
hostimage is the 1D char array, and I copy it to the GPU with this code. width and height are the values returned from the OpenCV wrapper.
// Upload the 8-bit image into a CUDA array and bind the texture to it.
size = height*width*sizeof(char);
CUDA_SAFE_CALL( cudaMallocArray( &cu_array, &tex.channelDesc, width, height ));
CUDA_SAFE_CALL( cudaMemcpyToArray( cu_array, 0, 0, hostimage, size, cudaMemcpyHostToDevice));
CUDA_SAFE_CALL( cudaBindTextureToArray( tex, cu_array, channelDesc));

// Launch one thread per output pixel, then copy the result back to the host.
dim3 dimBlock(16, 6, 1);
dim3 dimGrid(iDivUp(r_width,dimBlock.x),iDivUp(r_height,dimBlock.y), 1);
resizeKernel<<< dimGrid, dimBlock, 0 >>>( d_data, r_width, r_height);
CUDA_SAFE_CALL(cudaMemcpy(r_image,d_data, r_size, cudaMemcpyDeviceToHost) );
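iDivUp is not shown in the post. In the CUDA SDK samples it is a round-up integer division, and the sketch below assumes that meaning.

```cpp
// Round-up integer division: the number of blocks of size b needed to cover
// a items. Assumes a >= 0 and b > 0.
int iDivUp(int a, int b)
{
    return (a + b - 1) / b;
}
```

With a 16x6 block and, say, a 100x100 output, this gives a 7x17 grid, i.e. 112x102 threads. Some threads then fall outside the image, so a kernel launched this way generally needs a guard such as `if (x < cols && y < rows)` before it writes.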
And that’s about it. I have no idea what explains the skewing, and I’m sure it’s super simple. If anyone could help me with this it would be really appreciated, since I can’t find much about passing OpenCV char data to a GPU texture and letting it convert nicely, beyond advice that you should do it :| .
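One thing that may be worth checking, offered as an assumption rather than a diagnosis: each IplImage row occupies widthStep bytes, which OpenCV typically pads to a multiple of 4, so widthStep can be larger than width. Copying height*width bytes as one contiguous block would then shift every row by the accumulated padding, which tends to show up as a diagonal shear. The sketch below is plain C++ with a hypothetical `packRows` helper. It removes the padding on the host for a single-channel 8-bit image.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies a row-padded, single-channel 8-bit image into a tightly packed buffer.
// src points at the first row (imageData), and widthStep is the number of
// bytes from the start of one row to the start of the next.
std::vector<unsigned char> packRows(const unsigned char* src,
                                    int width, int height, int widthStep)
{
    std::vector<unsigned char> packed(static_cast<std::size_t>(width) * height);
    if (packed.empty()) return packed;

    for (int y = 0; y < height; ++y) {
        // Copy only the width valid bytes of each row and skip the padding.
        std::memcpy(packed.data() + static_cast<std::size_t>(y) * width,
                    src + static_cast<std::size_t>(y) * widthStep,
                    static_cast<std::size_t>(width));
    }
    return packed;
}
```

Since host memory is the constraint here, cudaMemcpy2DToArray, which takes the source pitch as a parameter, may be a way to pass widthStep directly and skip the extra buffer. That variant is untested here.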
white_out.bmp (85.1 KB)