I started CUDA programming two weeks ago, mainly writing a mex file to speed up some MATLAB code. I have successfully processed my data using float, and now I want to return the result to MATLAB as uint16. However, I can't seem to get the type conversion right, and the results as seen in MATLAB are all garbage.
My function is as follows:
[codebox]__global__ void scale(unsigned int* odata, float* idata, float factor, unsigned int Size)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (threadIdx.x < Size)
        odata[i] = __float2uint_rn(idata[i] / factor * 65535 / 2);
}[/codebox]
And my host code is as follows:
[codebox]/* declare variables */
unsigned int* d_VTD;
imageSize = LENGTH/2 * COL;

/* Allocate memory */
cudaMalloc((void**) &d_VTD, imageSize*sizeof(unsigned int));
out_dimensions[0] = Step;
out_dimensions[1] = COL;
plhs[0] = mxCreateNumericArray(2, out_dimensions, mxUINT16_CLASS, mxREAL);
unsigned char* h_VTD = (unsigned char*) mxGetPr(plhs[0]);

/* scale and cast */
numThreads = 512;
numBlocks = COL;
scale<<<numBlocks, numThreads>>>(d_VTD, d_Vtemp, factor, Step);
int bytes_to_copy = Step * COL * mxGetElementSize(plhs[0]);
cudaMemcpy(h_VTD, d_VTD, bytes_to_copy, cudaMemcpyDeviceToHost);[/codebox]
d_Vtemp is an array of floats.
The MATLAB parts of the code are copied from the mex file programming guide.
I am not sure if I have used the right type for declaring d_VTD.
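To make the doubt about the type concrete, here is a minimal host-side C++ sketch (not the actual mex code) of what might be happening if a buffer of 32-bit `unsigned int` values is copied using a byte count computed from a 16-bit element size. The function names `copy_raw_bytes` and `narrow_each` are hypothetical, and `std::memcpy` only stands in for the `cudaMemcpy` call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the cudaMemcpy in the post: copies only
// n * sizeof(uint16_t) bytes out of a buffer of n 32-bit values.
std::vector<std::uint16_t> copy_raw_bytes(const std::vector<std::uint32_t>& src) {
    std::vector<std::uint16_t> dst(src.size());
    std::memcpy(dst.data(), src.data(), dst.size() * sizeof(std::uint16_t));
    return dst;
}

// Element-wise narrowing: each 32-bit value becomes exactly one 16-bit value.
std::vector<std::uint16_t> narrow_each(const std::vector<std::uint32_t>& src) {
    std::vector<std::uint16_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        assert(src[i] <= 0xFFFFu);  // value must already fit in 16 bits
        dst[i] = static_cast<std::uint16_t>(src[i]);
    }
    return dst;
}
```

On a little-endian machine, passing `{1, 2, 3, 4}` to `copy_raw_bytes` gives `{1, 0, 2, 0}`: only the first half of the source is read, and every other output element is the upper half of a 32-bit word. If this is the cause, declaring the device buffer with a 16-bit element type would presumably avoid it.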
Does anyone know what the problem might be? Or is there a better way to do type conversion/casting from float to uint16?
Just to give an example of how bad the result looks, I have attached a JPG file containing two images. The top image is the correct one, whereas the bottom is the jumbled one.
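For comparison, this is a host-side C++ sketch of the conversion I would expect for a single element, using the same expression as the kernel. The function name `scale_to_u16` is hypothetical, and the clamping to the 0..65535 range is an assumption that is not in the kernel above. `std::nearbyint` is used here as a rough host-side counterpart of `__float2uint_rn`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical host-side reference for the kernel's expression
// idata[i] / factor * 65535 / 2, with clamping added as an assumption.
std::uint16_t scale_to_u16(float value, float factor) {
    assert(factor != 0.0f);
    float scaled = value / factor * 65535.0f / 2.0f;
    if (!(scaled > 0.0f)) return 0;        // negatives and NaN map to 0
    if (scaled >= 65535.0f) return 65535;  // saturate at the uint16 maximum
    // std::nearbyint rounds to nearest (ties to even in the default rounding mode)
    return static_cast<std::uint16_t>(std::nearbyint(scaled));
}
```

Running the real output through a reference like this on a few elements could help tell a scaling problem apart from a memory layout problem.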
Any help will be much appreciated.