How can I use CUDA cores and Tensor Cores simultaneously? If it is convenient, could you give an example in CUDA code? Thank you!
There's no reason to attempt this. They share some resources, which makes it highly unlikely that they would run in parallel.
Thank you for your reply! You are right.
By the way, do you know of any current applications of cuTENSOR? I feel it is seldom mentioned. Thank you.
I originally raised this idea because I thought the CUDA cores would sit idle while the same block was executing on the Tensor Cores.
However, could CUDA core work and Tensor Core work be overlapped by rearranging the pipeline?
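For anyone who wants to experiment with this, here is a minimal sketch of one possible arrangement, assuming a GPU with compute capability 7.0 or newer. The names `mixed_warps` and `run_mixed_warps` are made up for illustration. The kernel only places Tensor Core instructions (WMMA) and ordinary FP32 instructions in different warps of the same block; whether the hardware actually overlaps them is decided by the warp scheduler and is not guaranteed, so treat it as something to profile rather than a proven technique.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical kernel: warp 0 issues a Tensor Core MMA, warp 1 issues FP32 FMAs.
// The branch is uniform within each warp, as the warp-wide WMMA calls require.
__global__ void mixed_warps(const half *a, const half *b, float *c,
                            float *v, int n) {
    int warp = threadIdx.x / 32;
    int lane = threadIdx.x % 32;
    if (warp == 0) {
        // One 16x16x16 tile: c = a * b, half inputs, float accumulator.
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> fa;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> fb;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> fc;
        wmma::fill_fragment(fc, 0.0f);
        wmma::load_matrix_sync(fa, a, 16);
        wmma::load_matrix_sync(fb, b, 16);
        wmma::mma_sync(fc, fa, fb, fc);
        wmma::store_matrix_sync(c, fc, 16, wmma::mem_row_major);
    } else {
        // Independent element-wise work on the regular FP32 units.
        for (int i = lane; i < n; i += 32)
            v[i] = fmaf(v[i], 2.0f, 1.0f);
    }
}

// Hypothetical host helper: runs the kernel once and returns c[0] and v[0].
// Returns 0 on success, 1 on any CUDA error (cleanup omitted on error paths).
int run_mixed_warps(float *c0, float *v0) {
    const int n = 1024;
    half *a, *b;
    float *c, *v;
    if (cudaMallocManaged(&a, 256 * sizeof(half)) != cudaSuccess) return 1;
    if (cudaMallocManaged(&b, 256 * sizeof(half)) != cudaSuccess) return 1;
    if (cudaMallocManaged(&c, 256 * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMallocManaged(&v, n * sizeof(float)) != cudaSuccess) return 1;

    for (int i = 0; i < 256; ++i) {
        a[i] = __float2half(1.0f);
        b[i] = __float2half(1.0f);
    }
    for (int i = 0; i < n; ++i) v[i] = 1.0f;

    mixed_warps<<<1, 64>>>(a, b, c, v, n);  // two warps in one block
    if (cudaDeviceSynchronize() != cudaSuccess) return 1;

    *c0 = c[0];
    *v0 = v[0];
    cudaFree(a); cudaFree(b); cudaFree(c); cudaFree(v);
    return 0;
}

int main() {
    float c0 = 0.0f, v0 = 0.0f;
    int rc = run_mixed_warps(&c0, &v0);
    assert(rc == 0);
    printf("c[0] = %f, v[0] = %f\n", c0, v0);
    return 0;
}
```

This should build with something like `nvcc -arch=sm_70 example.cu` (the file name is arbitrary). To see whether the two warps really overlap, you would need to compare timings or inspect the pipe utilization in a profiler such as Nsight Compute; the sketch by itself does not demonstrate any speedup.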