Question about GPU architecture (threads, blocks, SMs, cores)

I am trying to fully understand the GPU architecture and would appreciate your help.
I am using a Jetson X1 with the Maxwell architecture. My GPU has 2 SMs, and each SM has 128 CUDA cores, so I have 256 cores in total. I would like to know what that means in practice. In general, I know that the SMs execute the thread blocks and that each SM has a warp scheduler that executes a warp, but I don't know how they do it.
For example:
Suppose I launch a kernel with dimGrid=1024 and dimBlock=1024, and suppose there are no dependencies between the threads, so there is no switching between threads on the cores. How many threads are running at the same time on any one core? Can a core execute more than one thread at the same time, or is a core mapped to only one thread?
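
To make the example concrete, here is a minimal sketch of the kind of launch I have in mind. The kernel name `independentKernel`, its body, and the helpers `totalThreads` and `warpsPerBlock` are placeholders I made up for illustration; they are not from any real project. The sketch also prints a few fields of `cudaDeviceProp`, so the numbers reported by the device can be compared with the 2 SMs I mentioned above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel (hypothetical): each thread writes only its own element,
// so no thread depends on the result of any other thread.
__global__ void independentKernel(int *out)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = idx;
}

// Hypothetical host-side helpers for the arithmetic of the launch.
constexpr long long totalThreads(int gridSize, int blockSize)
{
    return static_cast<long long>(gridSize) * blockSize;
}

constexpr int warpsPerBlock(int blockSize, int warpSize)
{
    return (blockSize + warpSize - 1) / warpSize;  // round up to whole warps
}

int main()
{
    const int gridSize  = 1024;  // dimGrid from the question
    const int blockSize = 1024;  // dimBlock from the question

    // Ask the runtime what device 0 reports about itself.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "could not query device 0\n");
        return 1;
    }
    printf("SMs:                         %d\n", prop.multiProcessorCount);
    printf("warp size:                   %d\n", prop.warpSize);
    printf("max threads per block:       %d\n", prop.maxThreadsPerBlock);
    printf("max resident threads per SM: %d\n", prop.maxThreadsPerMultiProcessor);
    printf("threads launched:            %lld\n", totalThreads(gridSize, blockSize));
    printf("warps per block:             %d\n", warpsPerBlock(blockSize, prop.warpSize));

    // One int per launched thread.
    int *d_out = nullptr;
    size_t bytes = static_cast<size_t>(totalThreads(gridSize, blockSize)) * sizeof(int);
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    independentKernel<<<gridSize, blockSize>>>(d_out);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    }

    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

So this launch creates 1024 × 1024 threads in total, far more than the 256 cores, which is why I am asking how the threads are mapped onto the cores.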
