I have read the cuDNN documentation, but I don’t understand which set of functions has to be called, and how, for cudnnMultiHeadAttnForward() to work.

I took the code from the topic “MultiHeadAttnForward Result” and I get similar results. Does anyone have a working code example of this “layer” that returns an array of non-zero values?


CUDA version: 11.7
Graphics card: NVIDIA GeForce GTX 1050
cuDNN version: 8.4.1