Hello,
I tried the following launch configuration and got an "invalid configuration" error:
dim3 grid( 5, 1, 1);
dim3 threads( 10, 128, 1);
while this works fine:
dim3 grid( 5, 1, 1);
dim3 threads( 10, 1, 1);
Why does the first configuration not work? Does the answer depend on the size of the kernel I run? The GPU docs say I should launch as many threads as I can to get the best performance, but this does not launch at all!
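In case it helps narrow things down, here is a small host-only sketch of the check I suspect is relevant. It assumes the total threads per block is the product x * y * z of the block dimensions and that the device caps that product. The `Dim3` struct and both helper functions are hypothetical stand-ins so this compiles without the CUDA toolkit, and the limit of 1024 is an assumed value; on a real device it would presumably be read from the `maxThreadsPerBlock` field filled in by `cudaGetDeviceProperties`.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's dim3, so this builds with a plain C++ compiler.
struct Dim3 {
    unsigned x, y, z;
};

// Total threads in one block: the product of the three block dimensions.
std::uint64_t threadsPerBlock(const Dim3& block) {
    return static_cast<std::uint64_t>(block.x) * block.y * block.z;
}

// True if the block stays within the per-block thread limit.
// The limit is passed in rather than hard-coded; the caller supplies
// the value reported for the actual device (assumed to be 1024 below).
bool blockFits(const Dim3& block, std::uint64_t maxThreadsPerBlock) {
    return threadsPerBlock(block) <= maxThreadsPerBlock;
}
```

With an assumed limit of 1024, `blockFits(Dim3{10, 128, 1}, 1024)` is false because 10 * 128 * 1 = 1280, while `blockFits(Dim3{10, 1, 1}, 1024)` is true because that block has only 10 threads. If that is the cause, it would not depend on the size of the kernel.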