Hello all,
I am new to GPU programming and have some questions about basic CUDA concepts and GPU hardware.
When we launch threads, they are divided into warps at runtime. I am confused about how the GPU executes instructions: does each thread execute an instruction on its own, or does a group of threads (a warp) execute one instruction together? And how are the CUDA cores involved when an instruction is executed?
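To make the question concrete, here is how I currently understand the split of a block's threads into warps. This is only a minimal host-side sketch, assuming a warp size of 32 threads; the function names (`warpsPerBlock`, `warpIndex`, `laneIndex`) are my own and not part of the CUDA API, and the sketch says nothing about how the hardware schedules those warps.

```cpp
#include <cassert>

// Assumption: a warp is 32 threads (in device code this is the built-in warpSize).
constexpr int kWarpSize = 32;

// Number of warps needed to cover one block, rounding up so that a
// partially filled last warp still counts as a whole warp.
constexpr int warpsPerBlock(int threadsPerBlock) {
    return (threadsPerBlock + kWarpSize - 1) / kWarpSize;
}

// Which warp of the block a thread belongs to, given its linear index
// within the block (threadIdx.x for a 1D block).
constexpr int warpIndex(int linearThreadIdx) {
    return linearThreadIdx / kWarpSize;
}

// Position of the thread inside its warp (its "lane"), from 0 to 31.
constexpr int laneIndex(int linearThreadIdx) {
    return linearThreadIdx % kWarpSize;
}

// Small self-check of my understanding, using a block of 100 threads.
void checkExamples() {
    assert(warpsPerBlock(100) == 4);  // 3 full warps + 1 warp with 4 active threads
    assert(warpIndex(33) == 1);       // thread 33 is in the second warp
    assert(laneIndex(33) == 1);       // and is lane 1 of that warp
}
```

If that is right, a block of 100 threads becomes 4 warps, and my question is whether each of those warps issues one instruction for all of its threads at once.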
I am also using a Jetson TK1, and I have read that it has only one SM. How many blocks can that SM hold?
Thank you