Shader clock, core clock, memory clock

Hello everyone,

First of all I would like to say that I just started reading about GPUs, so I am far from an expert! :)

I keep reading about GPUs that are clocked with 3 different clock domains: shader, core, and memory clock. OK, the memory clock governs data transfers between the GPU and the off-chip memory, right? But what about the other two clock domains? I know that a GPU consists of a few multiprocessors, each one containing a few processors. So, if I understand correctly, the shader clock is the one that goes to all multiprocessors, and the core clock is the one that goes to each processor inside the multiprocessors?

Can anyone elaborate on this topic?

Kind regards,

Actually, you have the core and shader clocks backwards. The core clock runs some functions at the multiprocessor level, like the instruction decoder, and the shader clock runs the individual processors. The shader clock is the faster of the two, and it sets the speed of arithmetic operations on the processors.

When estimating the arithmetic speed of a GPU, the core clock is not the important number; the shader clock is.

Thank you very much for your reply, Seibert! This was very helpful! :thumbup:

What I don’t understand is why the shader clock is 2.5x the core clock. I thought all the stuff with half-warps meant the ALUs ran at exactly twice the clock of the dispatcher/SRAM/etc.

I am also a novice at CUDA and GPUs. I'm a little confused about the memory clock. The new NVIDIA GPU cards use PCIe 2.0, which doubles the data transfer speed. Does that speed refer to transfers between the host and the device (GPU), while the memory bandwidth refers to the speed between global memory (device memory) and the shared memory (on-chip memory)?

And why is the shader clock twice the core clock for certain GPUs, and more than twice for others? :whistling:

The PCIe version only affects host-to-device and device-to-host bandwidth. Nothing you do inside a kernel is affected by PCIe bandwidth at all; kernels read and write device memory at the card's memory bandwidth, which is set by the memory clock and bus width.