Is memory access in a kernel function parallel or sequential?

Hi people, I want to know whether, in a __global__ function launched with kernel<<<dimBlocks,dimThreads>>>(…);, memory accesses happen in parallel or sequentially (maybe because of the bus?). Example:

__global__ void SumVect(...){





Do the threads work in parallel, or sequentially because they all have to access the same global memory?

Most of the time, access is parallel. It is serialized on a compute capability 1.0 or 1.1 card if the threads do not access a suitably aligned, consecutive block of memory, or on any card if the addresses are too far apart to be serviced in a single transaction.

Check appendices F.3.2 and F.4.2 of the Programming Guide.
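
To make the difference concrete, here is a minimal sketch (not the original SumVect, whose body was not posted). The kernel names sumCoalesced and sumStrided, the array size, and the stride of 33 are all made up for illustration. The first kernel has neighbouring threads read neighbouring elements; the second spreads a warp's addresses across memory, which is the kind of pattern that needs more than one transaction.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: thread i touches element i, so the threads of a warp
// read consecutive floats from one aligned, contiguous span of memory.
__global__ void sumCoalesced(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

// Hypothetical kernel: same arithmetic, but neighbouring threads touch
// elements `stride` apart, so a warp's addresses are scattered.
// Assumes gcd(stride, n) == 1 so that every element is still written once.
__global__ void sumStrided(const float *a, const float *b, float *c, int n, int stride)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t < n) {
        int i = (int)(((long long)t * stride) % n);
        c[i] = a[i] + b[i];
    }
}

// Returns true if c[i] == a[i] + b[i] for every element.
static bool check(const std::vector<float> &a, const std::vector<float> &b,
                  const std::vector<float> &c)
{
    for (size_t i = 0; i < c.size(); ++i)
        if (c[i] != a[i] + b[i])
            return false;
    return true;
}

int main()
{
    const int n = 1 << 20;      // illustrative size (power of two)
    const int stride = 33;      // illustrative stride, odd so gcd(stride, n) == 1
    const size_t bytes = n * sizeof(float);

    std::vector<float> ha(n), hb(n), hc(n);
    for (int i = 0; i < n; ++i) {
        ha[i] = (float)i;       // small integers, exactly representable
        hb[i] = (float)(2 * i);
    }

    float *da, *db, *dc;
    cudaMalloc(&da, bytes);
    cudaMalloc(&db, bytes);
    cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    float msCoalesced = 0.0f, msStrided = 0.0f;

    // Time the coalesced version.
    cudaMemset(dc, 0, bytes);
    cudaEventRecord(start);
    sumCoalesced<<<blocks, threads>>>(da, db, dc, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&msCoalesced, start, stop);
    cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost);
    bool okCoalesced = check(ha, hb, hc);

    // Time the strided version.
    cudaMemset(dc, 0, bytes);
    cudaEventRecord(start);
    sumStrided<<<blocks, threads>>>(da, db, dc, n, stride);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&msStrided, start, stop);
    cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost);
    bool okStrided = check(ha, hb, hc);

    printf("coalesced: %s, %.3f ms\n", okCoalesced ? "ok" : "WRONG", msCoalesced);
    printf("strided:   %s, %.3f ms\n", okStrided ? "ok" : "WRONG", msStrided);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    return (okCoalesced && okStrided) ? 0 : 1;
}
```

Both kernels produce the same result; only the access pattern differs. How large the timing gap is depends on the card and its compute capability, so treat the numbers you get as an indication rather than a benchmark. The first launch also includes some one-time startup cost, so run each kernel a few times if you want a fairer comparison.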