How do CUDA cores work?

My project is to research GPU hardware, so I want to understand how CUDA cores work. (I have read many papers, but I am still confused about this.)
My GPU is a GTX 960M; the chip is a GM107, which has 5 SMs with 128 CUDA cores each. It is a Maxwell architecture: within one SM, the 128 CUDA cores are divided into 4 groups, and each group has its own warp scheduler.
My questions are:

  1. Does one CUDA core execute one thread, so that 32 CUDA cores execute one warp? If one warp scheduler can execute one warp at a time, can one SM in my GPU execute 4 warps at a time? Is that how the CUDA cores work in the Maxwell architecture?

I am very confused. Thank you very much.

NVIDIA GPUs are 32-wide SIMD processors. On Maxwell, every SM includes 4 cores that share some resources. Each core includes 32 ALUs, so a core can perform one 32-wide SIMD operation every cycle. NVIDIA and AMD call the ALUs "cores" in order to fool customers. Hope that once you know that, the books will start to make sense :)
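If it helps to see the arithmetic, here is a small sketch in plain Python (not GPU code). It only models the counting: how a thread index within a block maps to a warp and a lane, and how the numbers quoted in this thread for the GM107 divide up. The function and constant names are made up for illustration, and the model ignores real scheduling details such as latency hiding.

```python
# Figures quoted in this thread for the GTX 960M (GM107).
# They are inputs to the sketch, not values queried from a device.
WARP_SIZE = 32           # threads per warp on NVIDIA GPUs
SM_COUNT = 5             # SMs on the chip
CORES_PER_SM = 128       # "CUDA cores" (ALUs) per SM
SCHEDULERS_PER_SM = 4    # warp schedulers per SM, one per group of ALUs


def warp_and_lane(thread_index):
    """Map a linear thread index within a block to (warp index, lane index)."""
    return divmod(thread_index, WARP_SIZE)


def warps_per_block(threads_per_block):
    """Warps needed for a block; a partly filled warp still counts as a whole warp."""
    return -(-threads_per_block // WARP_SIZE)  # ceiling division


# ALUs available to each scheduler, and ALUs on the whole chip.
lanes_per_scheduler = CORES_PER_SM // SCHEDULERS_PER_SM
total_cores = SM_COUNT * CORES_PER_SM

if __name__ == "__main__":
    print("ALUs per scheduler:", lanes_per_scheduler)
    print("ALUs on the chip:", total_cores)
    print("thread 70 is (warp, lane):", warp_and_lane(70))
    print("warps in a 100-thread block:", warps_per_block(100))
```

With these inputs each scheduler gets 128 / 4 = 32 ALUs, which is exactly one warp wide, so the picture of "one warp instruction per group of 32 ALUs" matches the numbers in the question.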

So 32 ALUs = 32 SPs? Is that right?




Thank you for your help.