How many threads can be running on a CUDA core?

I’ve run a simple CUDA test program with a <<<256,256>>> execution configuration on a GTX 1080 (20 SMs, 128 cores per SM),

and Nsight shows that 160 of the 256 blocks are “running”, which means 160 × 256 = 40960 threads are running on 128 × 20 = 2560 CUDA cores.

The spec sheet shows that the maximum number of threads per SM is 2048, so the result is consistent: 2048 × 20 = 160 × 256 = 40960.
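
The arithmetic can be checked with a small host-side sketch. It is only an illustration under stated assumptions: the SM count, cores per SM, and 2048-thread limit come from the post, the 32-blocks-per-SM limit is the compute capability 6.1 figure and is not stated in the post, and register and shared memory limits are ignored entirely. The names (`kNumSMs`, `residentBlocksPerSM`, and so on) are made up for this example.

```cpp
// Host-side arithmetic only; nothing here touches the GPU.
// Values from the post (GTX 1080):
constexpr int kNumSMs = 20;            // streaming multiprocessors
constexpr int kCoresPerSM = 128;       // CUDA cores per SM
constexpr int kMaxThreadsPerSM = 2048; // resident-thread limit per SM
// Assumption (not in the post): compute capability 6.1 block limit per SM.
constexpr int kMaxBlocksPerSM = 32;

// Blocks that can be resident on one SM, limited by threads and by block count.
// Register and shared memory usage are ignored in this sketch.
constexpr int residentBlocksPerSM(int threadsPerBlock) {
    return (kMaxThreadsPerSM / threadsPerBlock) < kMaxBlocksPerSM
               ? (kMaxThreadsPerSM / threadsPerBlock)
               : kMaxBlocksPerSM;
}

// Blocks resident across the whole device, capped by the grid size.
constexpr int residentBlocks(int threadsPerBlock, int gridBlocks) {
    return residentBlocksPerSM(threadsPerBlock) * kNumSMs < gridBlocks
               ? residentBlocksPerSM(threadsPerBlock) * kNumSMs
               : gridBlocks;
}

// Resident threads per CUDA core: a ratio, not threads "running on" a core.
constexpr int residentThreadsPerCore() {
    return kMaxThreadsPerSM / kCoresPerSM;
}

// <<<256,256>>>: 8 blocks per SM, 160 blocks and 40960 threads device-wide.
static_assert(residentBlocksPerSM(256) == 8, "8 blocks of 256 threads fit in 2048");
static_assert(residentBlocks(256, 256) == 160, "matches the 160/256 Nsight reports");
static_assert(residentBlocks(256, 256) * 256 == 40960, "resident threads");
static_assert(kNumSMs * kCoresPerSM == 2560, "total CUDA cores");
```

The checks run at compile time, so compiling the file is enough to confirm the numbers. On real hardware the resident block count can be lower than this estimate when a kernel uses many registers or a lot of shared memory.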

So my guess is that 2048/128 = 16 threads are “running” on each CUDA core, where I take “running” to mean “active, or occupying a CUDA core”.

Am I correct?

This question is answered in many places. There really is no sensible concept of threads running on a CUDA core.