Can you give me any hints on how to improve the performance of my code (posted at the beginning of this thread)?
Does anyone have any ideas? Please help, I have the same problem.
Hi Manjunath Gudisi,
Actually, the problem in my code is the access to Value. If I replace it with a shared array, so that the kernel does nothing except fetch the value from texture memory and discard it, the time drops to 20 ms, which is an improvement. But so far I have not succeeded in using a shared array in this kernel; one of the problems is that value is an unsigned char.
Can anybody tell me how to use shared memory in my kernel in place of value?
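For what it is worth, here is a minimal sketch of how an unsigned char value could be staged in shared memory. It is not the kernel from this thread: the names sumWithNeighbour, runSharedTileDemo, tile and TILE are made up for illustration, the input comes from a plain global-memory pointer instead of a texture fetch, and the "work" (adding each byte to its right-hand neighbour) is only a placeholder. The pattern is: each thread copies one byte into a __shared__ unsigned char array, the block synchronizes, and all later reads go to the shared copy.

```cuda
#include <cuda_runtime.h>

#define TILE 256  // threads per block (hypothetical choice)

// Hypothetical kernel: each block stages its bytes in shared memory, then every
// thread reads its own byte and its right-hand neighbour from the shared tile.
__global__ void sumWithNeighbour(const unsigned char *in, int *out, int n)
{
    __shared__ unsigned char tile[TILE + 1];  // +1 slot for the neighbour of the last thread

    int lid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + lid;

    // One global read per thread (a texture fetch would go here instead).
    tile[lid] = (gid < n) ? in[gid] : 0;

    // Thread 0 also loads the first byte that belongs to the next block.
    if (lid == 0) {
        int h = (blockIdx.x + 1) * blockDim.x;
        tile[blockDim.x] = (h < n) ? in[h] : 0;
    }

    __syncthreads();  // the tile must be complete before any thread reads from it

    if (gid < n)
        out[gid] = (int)tile[lid] + (int)tile[lid + 1];
}

// Host-side check: returns the number of mismatches, or -1 on a CUDA error.
int runSharedTileDemo()
{
    const int n = 1000;  // deliberately not a multiple of TILE
    unsigned char hIn[n];
    int hOut[n];
    for (int i = 0; i < n; ++i)
        hIn[i] = (unsigned char)(i % 251);

    unsigned char *dIn = nullptr;
    int *dOut = nullptr;
    if (cudaMalloc(&dIn, n * sizeof(unsigned char)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOut, n * sizeof(int)) != cudaSuccess) { cudaFree(dIn); return -1; }
    cudaMemcpy(dIn, hIn, n * sizeof(unsigned char), cudaMemcpyHostToDevice);

    int blocks = (n + TILE - 1) / TILE;
    sumWithNeighbour<<<blocks, TILE>>>(dIn, dOut, n);

    // This copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hOut, dOut, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        int expected = hIn[i] + ((i + 1 < n) ? hIn[i + 1] : 0);
        if (hOut[i] != expected) ++mismatches;
    }
    return mismatches;
}
```

Calling runSharedTileDemo() from main should return 0 if the shared copy matches the input. Shared memory only pays off when each staged byte is read more than once by the block; if every thread reads its value exactly once, a register copy is usually enough. Also, on older compute capability 1.x devices byte-wide shared accesses could cause bank conflicts, so it may be worth timing the kernel both ways.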