I’m a 17-year-old high school student with an interest in mathematics, software, and electronics.
I’m attempting to write a GPU equivalent of the “P7Viterbi” code from “HMMER2”, a computational genomics package, for a school project (OK, so I am geekier than most students my age).
I read an article in the “Microprocessor Report” that very clearly described CUDA and the architecture of the 8800 series GPUs. However, I’m a bit confused about the usefulness of having more threads than thread processors.
If the GPU has only 128 thread processors, why have more than 128 threads?
If my understanding is correct, wouldn’t the additional threads just be waiting for an available thread processor?
I know why this is useful for traditional OS threads blocked on I/O. However, even if the time to switch threads is zero, I still don’t understand why having more threads than thread processors makes any sense.
Does having a pool of ready but waiting threads mask some memory latencies?
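To make that last guess concrete, here is a toy model I put together. It is only a sketch of my own reasoning, not anything from the article: I assume each thread alternates between a burst of arithmetic and a wait on memory, that switching threads costs nothing, and that one thread processor is shared by several threads. The numbers (10 cycles of arithmetic, 390 cycles of waiting) and the function names are made up for illustration and are not real 8800 figures.

```python
def utilization(threads: int, compute_cycles: int, memory_latency: int) -> float:
    """Fraction of time one thread processor is busy in the toy model.

    Each thread computes for `compute_cycles`, then waits `memory_latency`
    cycles for memory. While one thread waits, another can run for free
    (zero-cost switching is an assumption of this model).
    """
    if threads < 1 or compute_cycles < 1 or memory_latency < 0:
        raise ValueError("need threads >= 1, compute_cycles >= 1, memory_latency >= 0")
    busy = threads * compute_cycles              # work available per round
    round_length = compute_cycles + memory_latency  # one thread's full cycle
    return min(1.0, busy / round_length)


def threads_to_hide_latency(compute_cycles: int, memory_latency: int) -> int:
    """Smallest thread count that keeps the processor always busy."""
    total = compute_cycles + memory_latency
    return -(-total // compute_cycles)  # ceiling division


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the idea.
    for n in (1, 10, 20, 40):
        print(n, "threads ->", utilization(n, 10, 390))
    print("threads needed:", threads_to_hide_latency(10, 390))
```

If this model is anywhere near right, a single thread per processor would leave it idle almost all the time, and the extra threads are what fill the gaps. Is that the actual reason?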
Hopefully, someone will clear up my confusion about threading. Be gentle, I’m not an aged and experienced boffin, after all!