Apologies if this has been asked before, and for how simple it is.
I have started to read the CUDA Programming Guide and have some very basic questions:
Section 2.1 describes kernels, which seem to be basic C functions. When the code in Section 2.1 is invoked, is it correct that instead of one thread looping over N iterations, N threads each perform a single iteration, so that the result is obtained in approximately 1/N of the time?
Section 2.1 states that a calculation is divided among N threads, which I assume means, as in the question above, N calculations being processed at the same time. The code example uses “threadIdx”, and Section 2.2 says that threads are arranged into blocks. Is it correct that a block is just a collection of threads, and that you do not have to worry about how the system implements these blocks when programming?
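To make the first two questions concrete, here is a minimal sketch of how I read the Section 2.1 example. The kernel follows the guide's vector addition as I remember it; the host helper vec_add_matches_cpu and its test data are my own hypothetical additions, and I am assuming n is small enough to fit in a single block (the limit depends on the GPU).

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel: every thread runs this body once. threadIdx is a built-in
// variable, so each thread sees its own index within the block.
__global__ void VecAdd(const float* A, const float* B, float* C)
{
    int i = threadIdx.x;   // this thread's element
    C[i] = A[i] + B[i];    // one "iteration" per thread, no loop
}

// Hypothetical helper (not from the guide): launches VecAdd with one block
// of n threads (n >= 1) and checks the result against a plain CPU loop.
bool vec_add_matches_cpu(int n)
{
    std::vector<float> hA(n), hB(n), hC(n, 0.0f);
    for (int i = 0; i < n; ++i) { hA[i] = float(i); hB[i] = 2.0f * i; }

    size_t bytes = n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;

    // Allocate device buffers and copy the inputs to the GPU.
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess
           && cudaMalloc(&dB, bytes) == cudaSuccess
           && cudaMalloc(&dC, bytes) == cudaSuccess
           && cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        // <<<1, n>>> means: 1 block containing n threads.
        VecAdd<<<1, n>>>(dA, dB, dC);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);

    // CPU reference: the N-iteration loop the kernel is meant to replace.
    for (int i = 0; ok && i < n; ++i)
        ok = (hC[i] == hA[i] + hB[i]);
    return ok;
}
```

If I have understood correctly, calling vec_add_matches_cpu(256) from main would start 256 threads in one block, each adding a single pair of elements. Whether that actually takes about 1/N of the time is the part I am unsure about.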
Section 2.2 then goes on to state that for 1-, 2- and 3-dimensional blocks the thread ID is calculated from the thread index using the rules given in the text. Is it important to determine the thread ID, and is it necessary when programming? Basically, is it rarely used, or something that will be used often?
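For reference, this is the Section 2.2 rule as I understand it, written as a small sketch. The function name flat_thread_id is hypothetical, and it is plain host code so it can be checked without a GPU. Inside a kernel I assume the same arithmetic would use threadIdx and blockDim instead of the parameters.

```cuda
// Thread ID within a block of size (Dx, Dy, Dz), for the thread whose
// index is (x, y, z):
//   1D block: id = x
//   2D block: id = x + y * Dx
//   3D block: id = x + y * Dx + z * Dx * Dy
// For a 1D block pass y = z = 0; for a 2D block pass z = 0.
// In a kernel this would correspond to
//   threadIdx.x + threadIdx.y * blockDim.x
//               + threadIdx.z * blockDim.x * blockDim.y
int flat_thread_id(int x, int y, int z, int Dx, int Dy)
{
    return x + y * Dx + z * Dx * Dy;
}
```

For example, in a 4 x 4 block the thread with index (1, 2) would get ID 1 + 2 * 4 = 9.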
Apologies for the basic and perhaps unusual questions. I just want to make sure that I remember the important areas before reading the document further. Thanks.