Hey,
I am currently doing a color conversion on images with CUDA and then trying to render them to a texture using surface memory. I have to use the Surface Reference API because my card only supports compute capability 2.0.
My kernel looks something like this:
surface<uchar1, 1> outputTex;

__global__ void NV12toRGB(unsigned char* nv12)
{
    // Nothing interesting here
}

void callDecode(int width, int height, unsigned char* nv12, cudaArray_t texArray)
{
    cudaBindSurfaceToArray(outputTex, texArray);
    NV12toRGB<<<width, height>>>(nv12);
    cudaFreeArray(texArray);
    cudaDeviceSynchronize();
}
It is called from another file, where the resource is mapped in each iteration:
cudaGraphicsMapResources(1, &cudaTex);
{
    cudaArray_t cudaTexData;
    cudaGraphicsSubResourceGetMappedArray(&cudaTexData, cudaTex, 0, 0);
    callDecode(m_width, m_height, (unsigned char*)decodedFrame[active], cudaTexData);
}
cudaGraphicsUnmapResources(1, &cudaTex);
I am getting the following error when trying to compile the code:
1>ptxas : fatal error : Ptx assembly aborted due to errors
I know that it has something to do with the declaration of surface<void,1> outputTex. If I remove it, everything works fine.
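For reference, here is a minimal standalone sketch of writing through a surface reference, independent of the code above. It is not my real kernel: the names outputSurf and fillRed, the uchar4 pixel format, and the 64x32 size are made up for illustration. It assumes an older toolkit that still ships the Surface Reference API and can target compute capability 2.0, and that the build actually targets it (for example nvcc -arch=sm_20); surface functions need at least cc 2.0, so a lower default target would be one possible reason for ptxas to reject a surface declaration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Surface references live at file scope and are declared with void as the type;
// the second argument selects the dimensionality (here a 2D surface).
surface<void, cudaSurfaceType2D> outputSurf;  // hypothetical name

// Hypothetical kernel: writes an opaque red pixel to every texel.
__global__ void fillRed(int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    uchar4 px = make_uchar4(255, 0, 0, 255);
    // The x coordinate of surf2Dwrite is a byte offset, not a texel index.
    surf2Dwrite(px, outputSurf, x * (int)sizeof(uchar4), y);
}

int main()
{
    const int width = 64, height = 32;  // assumed size for the sketch

    // The array has to be created with the surface load/store flag.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<uchar4>();
    cudaArray_t array;
    cudaMallocArray(&array, &desc, width, height, cudaArraySurfaceLoadStore);

    cudaBindSurfaceToArray(outputSurf, array);

    // 2D launch with a bounds check in the kernel instead of <<<width, height>>>.
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    fillRed<<<grid, block>>>(width, height);

    // Wait for the kernel before touching or releasing the array.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy the result back and print the first pixel as a sanity check.
    static uchar4 host[width * height];
    cudaMemcpy2DFromArray(host, width * sizeof(uchar4), array, 0, 0,
                          width * sizeof(uchar4), height, cudaMemcpyDeviceToHost);
    printf("first pixel: %d %d %d %d\n", host[0].x, host[0].y, host[0].z, host[0].w);

    cudaFreeArray(array);
    return 0;
}
```

In the interop case the array comes from the mapped graphics resource rather than cudaMallocArray, so the equivalent of the load/store flag would be cudaGraphicsRegisterFlagsSurfaceLoadStore at registration time, and the mapped array would not be freed by hand.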
Any ideas what is going wrong?