I have a basic question about the Jetson TK1: how many blocks and threads can I launch with CUDA 6.5?
I am working with an image of size 1280x1024. I set threadsperblock to (32,32), which is 1024 threads per block, and blocks to (40,32), calculated as image_width / 32 = 1280 / 32 = 40 and image_height / 32 = 1024 / 32 = 32. If I launch the kernel as
Kernel<<<blocks, threadsperblock>>>(arguments to be passed to the kernel)
my program hangs and doesn't give any output.
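To make the launch concrete, here is a minimal sketch of the configuration described above with error checking added, since a failed launch is otherwise silent. The `invert` kernel, the `div_up` helper, and the buffer names are hypothetical stand-ins for illustration, not my actual code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Ceiling division: number of blocks of size `block` needed to cover n elements.
inline int div_up(int n, int block) { return (n + block - 1) / block; }

// Hypothetical stand-in kernel: inverts an 8-bit grayscale image.
__global__ void invert(const unsigned char* in, unsigned char* out,
                       int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {  // guard for sizes not divisible by the block size
        out[y * width + x] = 255 - in[y * width + x];
    }
}

int main()
{
    const int width = 1280, height = 1024;
    const size_t bytes = (size_t)width * height;

    unsigned char *d_in = NULL, *d_out = NULL;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_in, 0, bytes);

    dim3 threadsperblock(32, 32);                              // 1024 threads per block
    dim3 blocks(div_up(width, 32), div_up(height, 32));        // (40, 32)

    invert<<<blocks, threadsperblock>>>(d_in, d_out, width, height);

    // cudaGetLastError reports an invalid launch configuration or a lack of
    // per-block resources; cudaDeviceSynchronize reports errors during execution.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

If the status printed here is anything other than "no error", the kernel did not run as configured, which would help separate a launch failure from a real hang.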
Now if I change the block size to (5,5), it runs successfully but processes only part of the image. So with blocks set to (40,32), am I exceeding the limit on the number of blocks?
Or what would be the proper way to tune the number of blocks and threads?
Any input is highly appreciated.
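One way to check the limits might be to ask the runtime directly instead of guessing. The sketch below assumes the TK1's GPU is device 0; `launch_fits` is a hypothetical helper that compares a launch shape against the limits reported by `cudaGetDeviceProperties`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if the grid and block shapes are within the
// limits the device reports. It does not account for per-kernel register
// or shared memory usage.
bool launch_fits(const cudaDeviceProp& p, dim3 grid, dim3 block)
{
    unsigned int threads = block.x * block.y * block.z;
    return threads <= (unsigned int)p.maxThreadsPerBlock
        && block.x <= (unsigned int)p.maxThreadsDim[0]
        && block.y <= (unsigned int)p.maxThreadsDim[1]
        && block.z <= (unsigned int)p.maxThreadsDim[2]
        && grid.x  <= (unsigned int)p.maxGridSize[0]
        && grid.y  <= (unsigned int)p.maxGridSize[1]
        && grid.z  <= (unsigned int)p.maxGridSize[2];
}

int main()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("%s (compute capability %d.%d)\n", prop.name, prop.major, prop.minor);
    printf("max threads per block: %d\n", prop.maxThreadsPerBlock);
    printf("max block dims: (%d, %d, %d)\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
    printf("max grid dims:  (%d, %d, %d)\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    printf("registers per block: %d\n", prop.regsPerBlock);

    bool ok = launch_fits(prop, dim3(40, 32), dim3(32, 32));
    printf("blocks (40,32) with threads (32,32) within limits: %s\n",
           ok ? "yes" : "no");
    return 0;
}
```

These are only upper bounds. A kernel that uses many registers per thread may still fail to launch with the full thread count, and `cudaFuncGetAttributes` reports the per-kernel `maxThreadsPerBlock` and `numRegs` for a specific kernel.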