CUDA vs CPU

Can anyone explain what this sentence means:
As a computation device, the GT200 is a multi-core chip organized in a two level hierarchy that focuses primarily on achieving high compute throughput on data parallel workloads by sacrificing single-thread performance and execution latency. This is in contrast to general purpose processors which focus primarily on single-thread performance and execution and communication latency, with a secondary focus on high compute throughput.

Also what exactly does the term “high compute” mean?

Thanks in advance

It basically means that GPUs like the GT200 forgo features present in modern CPUs, like large many-way caches, out-of-order execution, branch prediction, etc., in favor of many more cores and high-performance on-chip thread management.

Nothing. There is no term “high compute” in that text. The term is “compute throughput”; “high” is just an adjective modifying it.

High “compute throughput” means that the GT200 architecture is optimized to deliver the maximum number of results per GPU cycle on well-optimized software: up to 240 floating-point or integer results per cycle (some of those results may be MADDs, a multiplication followed by an addition), whereas current 4-core CPUs are limited to 32 results per cycle in the best case.
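To make that concrete, here is a minimal sketch (not from the original post, names and sizes are just for illustration) of the kind of data-parallel work a GPU is built for: every thread produces one multiply-add result, so with enough threads the hardware can retire one result per streaming processor per cycle.

```
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes one multiply-add result: out[i] = a[i] * b[i] + c[i].
__global__ void madd(const float *a, const float *b, const float *c,
                     float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a[i] * b[i] + c[i];
}

int main()
{
    const int n = 1 << 20;                 // 1M elements
    size_t bytes = n * sizeof(float);
    float *a, *b, *c, *out;
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMalloc(&c, bytes);
    cudaMalloc(&out, bytes);

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    madd<<<blocks, threads>>>(a, b, c, out, n);
    cudaDeviceSynchronize();

    printf("launched %d blocks of %d threads\n", blocks, threads);
    cudaFree(a); cudaFree(b); cudaFree(c); cudaFree(out);
    return 0;
}
```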

And the term “throughput” means that the important metric is how many instructions per second are completed, not how long it takes a single instruction to finish. People are often surprised to hear that CUDA works best if you have 100 times more threads than processors. The idea is to hide the latency of memory and the instruction pipeline by keeping a large number of instructions in flight at any given time.
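A rough sketch of that idea (again, kernel name and sizes are only illustrative): launch far more threads than the GPU has processors, so that while some warps are stalled waiting on memory, the scheduler can switch to others that are ready to execute.

```
#include <cstdio>
#include <cuda_runtime.h>

// Simple memory-bound kernel: each thread loads one element, scales it,
// and stores it back. With tens of thousands of threads in flight, ready
// warps can execute while other warps wait on memory.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 22;                 // ~4M elements
    float *data;
    cudaMalloc(&data, n * sizeof(float));

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;   // many more threads than processors
    scale<<<blocks, threads>>>(data, 2.0f, n);
    cudaDeviceSynchronize();

    printf("%d threads launched to hide memory latency\n", blocks * threads);
    cudaFree(data);
    return 0;
}
```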

Thanks everyone for your answers!