Question about Jetson X1 architecture

Hello everybody!
I am trying to fully understand the architecture of the Jetson X1, and I would like your help!
The Jetson X1 has a Maxwell-architecture GPU. I saw that my GPU includes 2 SMs, and each SM includes 128 CUDA cores, so I have 256 cores in total. I would like to know what that means in practice. In general, I know that the SMs execute thread blocks and that each SM has a warp scheduler that executes warps, but I don't know how they do it.
For example:
I launch a kernel with dimGrid=1024 and dimBlock=1024. Suppose there are no dependencies between the threads, so the cores never have to switch between threads. How many threads run at the same time on each core? Can a core execute more than one thread at the same time, or is a core mapped to only one thread?
Thank you in advance!


You can find Maxwell architecture information on this page:
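
To put rough numbers on the launch from the question, here is a small back-of-the-envelope sketch in Python. It only does arithmetic and does not touch the GPU. The SM count and cores per SM come from the question. The per-SM limits (2048 resident threads, 32 resident blocks) are my assumption based on the published limits for compute capability 5.x, and the function name `launch_summary` is made up for this post. It also ignores register and shared memory usage, which can lower the number of resident blocks for a real kernel, so treat the result as an upper bound rather than a measurement.

```python
import math

# Figures taken from the question.
NUM_SMS = 2
CORES_PER_SM = 128

# ASSUMED limits for compute capability 5.x; on a real board, query them
# (for example with cudaGetDeviceProperties) instead of hard-coding them.
WARP_SIZE = 32
MAX_RESIDENT_THREADS_PER_SM = 2048
MAX_RESIDENT_BLOCKS_PER_SM = 32


def launch_summary(grid_blocks, block_threads):
    """Estimate residency for a 1D launch, ignoring register/shared-memory limits."""
    # Threads are scheduled in whole warps, so round the block up to full warps.
    warps_per_block = math.ceil(block_threads / WARP_SIZE)
    threads_per_block_padded = warps_per_block * WARP_SIZE

    # How many blocks fit on one SM at once under the assumed limits.
    blocks_per_sm = min(
        MAX_RESIDENT_BLOCKS_PER_SM,
        MAX_RESIDENT_THREADS_PER_SM // threads_per_block_padded,
    )

    # Blocks resident on the whole GPU at once, capped by the grid size.
    resident_blocks = min(grid_blocks, blocks_per_sm * NUM_SMS)
    resident_threads = resident_blocks * block_threads

    # Number of "waves" of blocks needed to drain the grid.
    waves = math.ceil(grid_blocks / (blocks_per_sm * NUM_SMS))

    return {
        "warps_per_block": warps_per_block,
        "blocks_per_sm": blocks_per_sm,
        "resident_threads": resident_threads,
        "waves": waves,
        # Each core handles one thread's instruction at a time, so at most
        # one thread per core is executing in any given cycle.
        "lanes_per_cycle": NUM_SMS * CORES_PER_SM,
        "warps_issued_per_sm_per_cycle": CORES_PER_SM // WARP_SIZE,
    }


if __name__ == "__main__":
    for key, value in launch_summary(1024, 1024).items():
        print(f"{key}: {value}")
```

Under those assumptions, the 1024 x 1024 launch gives 2 resident blocks per SM, so 4096 threads are resident on the GPU at once and the grid drains in 256 waves. "Resident" is not the same as "executing", though: in any one cycle, at most 256 threads (one per core, i.e. 4 warps per SM) are actually having an instruction executed. As far as I understand it, a core is not tied to a single thread. It works on one thread's instruction at a time, and the warp schedulers keep rotating among the resident warps.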