I’m not sure “core” is the right term, but I’ll call it that, by analogy with a CPU core. I believe a GPU has hundreds, if not thousands, of them, and each one is able to execute a thread.
My question is:
Are GPU cores multithreaded? Obviously each core executes only one thread at a time, but I was wondering whether it is possible to assign more than one thread to a core; for example, whether different streams can share the same cores.
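To make the question concrete, here is a minimal CUDA sketch (the kernel name and sizes are just illustrative) that launches far more threads than any GPU has cores. My question is about how the hardware handles this oversubscription:

```cuda
#include <cstdio>

// Illustrative kernel: each thread writes its own global index.
__global__ void fill(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

int main() {
    const int n = 1 << 20;  // request ~1M threads, far more than there are cores
    int *d;
    cudaMalloc(&d, n * sizeof(int));
    fill<<<(n + 255) / 256, 256>>>(d, n);  // 4096 blocks of 256 threads
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}
```

Clearly the threads cannot all run at once, so the cores must be shared among them somehow.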
My question arose when I read this passage:
“A few characteristics of CUDA programming model are very different from CPU based parallel programming model. One difference is that there is very little overhead creating GPU threads. In addition to fast thread creation, context switches, where threads change from active to inactive and vice versa are very fast compared to CPU threads. The reason context switching is essentially instantaneous on GPU, is that the GPU does not have to store state”
Why does the GPU not have to store state? How does it remember a thread’s variables when it switches back to that thread?