Use CUDA cores & Tensor Cores at the same time

Can CUDA cores and Tensor Cores operate at the same time? For example, suppose my thread block has 8 warps and I write two strategies for them: warp 0 uses CUDA cores for some calculation, while the remaining 7 warps all use Tensor Cores. Can these warps execute in parallel?

For example, if I need to compute a 34 × 32 × 32 matrix multiplication, I could use wmma for the 32 × 32 × 32 part and CUDA cores for the remaining 2 × 32 × 32. Would that be faster than just padding the matrix and using wmma for everything?

If you spend an hour of your time setting up a quick experiment, you will generate an authoritative answer to this question.

Typically yes, you can run them at the same time: each SM partition has its own Tensor Core unit as well as its regular CUDA cores. What matters is how the warps are assigned to one unit or the other, so that the workload is distributed well across them.
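To make the warp-splitting idea concrete, here is a minimal sketch (not benchmarked, names and tiling are illustrative) of a single-block kernel for the 34 × 32 × 32 case from the question: warp 0 handles the 2-row remainder on CUDA cores, while warps 1–4 each compute one 16 × 16 wmma tile of the top 32 × 32 block (the remaining warps idle in this simple layout). Whether the two paths actually overlap, and whether this beats padding, is exactly what the suggested experiment would measure.

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// Sketch: C = A * B with M = 34, N = 32, K = 32, half inputs, float output.
// Launch with one block of 8 warps, e.g. split_gemm_34x32x32<<<1, 256>>>(...).
__global__ void split_gemm_34x32x32(const half *A, const half *B, float *C) {
    const int N = 32, K = 32;
    int warp_id = threadIdx.x / 32;
    int lane    = threadIdx.x % 32;

    if (warp_id == 0) {
        // CUDA-core path: rows 32..33 (2 rows x 32 cols = 64 elements,
        // two output elements per lane).
        for (int idx = lane; idx < 2 * N; idx += 32) {
            int row = 32 + idx / N;
            int col = idx % N;
            float acc = 0.f;
            for (int k = 0; k < K; ++k)
                acc += __half2float(A[row * K + k]) *
                       __half2float(B[k * N + col]);
            C[row * N + col] = acc;
        }
    } else if (warp_id <= 4) {
        // Tensor-core path: warps 1..4 each own one 16x16 tile of the
        // top 32x32 block, accumulated over K in 16-wide steps.
        int tile   = warp_id - 1;       // 0..3
        int tile_m = (tile / 2) * 16;   // 0 or 16
        int tile_n = (tile % 2) * 16;   // 0 or 16
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;
        wmma::fill_fragment(acc, 0.f);
        for (int k = 0; k < K; k += 16) {
            wmma::load_matrix_sync(a, A + tile_m * K + k, K);
            wmma::load_matrix_sync(b, B + k * N + tile_n, N);
            wmma::mma_sync(acc, a, b, acc);
        }
        wmma::store_matrix_sync(C + tile_m * N + tile_n, acc, N,
                                wmma::mem_row_major);
    }
}
```

The warp scheduler is free to issue the wmma warps and the scalar warp concurrently, since they target different execution units within the SM partitions; timing both this and the padded all-wmma variant would settle the question.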

Here are some related threads: 1 2 3 4

Thanks!
