I started CUDA programming two weeks ago, mainly writing a MEX file to speed up some MATLAB code. I have successfully processed my data using float, and now I want to return the result to MATLAB as uint16. However, I can't seem to get the type conversion right, and the results as seen in MATLAB are all garbage.
My function is as follows:
[codebox]__global__ void scale(unsigned int* odata, float* idata, float factor, unsigned int Size)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if(threadIdx.x < Size)
        odata[i] = __float2uint_rn(idata[i] / factor * 65535 /2);
}[/codebox]
And my host code is as follows:
[codebox] /* declare variables */
unsigned int* d_VTD;
imageSize = LENGTH/2 * COL;
/* Allocate memory */
cudaMalloc( (void**) &d_VTD, imageSize*sizeof(unsigned int));
out_dimensions[0] = Step;
out_dimensions[1] = COL;
plhs[2] = mxCreateNumericArray(2, out_dimensions, mxUINT16_CLASS, mxREAL);
unsigned char* h_VTD = (unsigned char*) mxGetPr(plhs[2]);
/* scale and cast */
numThreads = 512;
numBlocks = COL;
scale<<<numBlocks, numThreads>>>(d_VTD, d_Vtemp, factor, Step);
int bytes_to_copy = Step * COL * mxGetElementSize(plhs[2]);
cudaMemcpy(h_VTD, d_VTD, bytes_to_copy, cudaMemcpyDeviceToHost);[/codebox]
d_Vtemp is an array of floats.
The MATLAB parts of the code are copied from the MEX file programming guide.
I am not sure whether I have used the right type for declaring d_VTD.
Does anyone know what the problem might be? Or is there a better way to do type conversion/casting from float to uint16?
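On the type question, one likely culprit is an element-size mismatch: the kernel writes 32-bit `unsigned int` values, while an array created with `mxUINT16_CLASS` holds 16-bit elements, so a byte-wise `cudaMemcpy` would hand MATLAB the low and high halves of each value as separate pixels. The guard `threadIdx.x < Size` also tests the thread index rather than the global index `i`. The following is a minimal sketch under stated assumptions, not a verified fix for this MEX file: the names `scale_to_u16` and `scale_on_device` are hypothetical, the clamp to the range 0 to 65535 is an added assumption, and the data is assumed to be one contiguous array of `n` floats.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: writes 16-bit outputs so that the element size
// matches a MATLAB uint16 array.
__global__ void scale_to_u16(unsigned short* odata, const float* idata,
                             float factor, unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard on the global index, not on threadIdx.x
        float v = idata[i] / factor * 65535.0f / 2.0f;
        // Assumption: values outside the uint16 range should saturate.
        v = fminf(fmaxf(v, 0.0f), 65535.0f);
        odata[i] = (unsigned short)__float2uint_rn(v);
    }
}

// Hypothetical host helper: copies the input to the device, runs the
// kernel, and copies back n * sizeof(unsigned short) bytes.
// Error checking is omitted for brevity.
std::vector<unsigned short> scale_on_device(const std::vector<float>& in,
                                            float factor)
{
    const unsigned int n = (unsigned int)in.size();
    std::vector<unsigned short> out(n);
    if (n == 0) return out;

    float* d_in = 0;
    unsigned short* d_out = 0;
    cudaMalloc((void**)&d_in, n * sizeof(float));
    cudaMalloc((void**)&d_out, n * sizeof(unsigned short));
    cudaMemcpy(d_in, in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    const unsigned int threads = 256;
    const unsigned int blocks = (n + threads - 1) / threads;  // cover all n
    scale_to_u16<<<blocks, threads>>>(d_out, d_in, factor, n);

    // cudaMemcpy waits for the preceding kernel on the default stream.
    cudaMemcpy(out.data(), d_out, n * sizeof(unsigned short),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}
```

In the MEX file itself, the same idea would presumably mean declaring `d_VTD` as `unsigned short*`, allocating it with `sizeof(unsigned short)`, and taking the destination pointer as `(unsigned short*) mxGetData(plhs[2])`. The launch configuration would also need to cover all `Step * COL` elements.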
Just to give an example of how bad the result looks, I have attached a JPG file containing two images. The top image is the correct image, whereas the bottom is the jumbled one.
Any help would be much appreciated.
Thank you,