Incomplete documentation of cudnnMultiHeadAttnForward

anon37147145 · June 14, 2019, 11:30am

Hi,

I’m interested in using cudnnMultiHeadAttnForward for inference but I find the documentation lacking.

Unless I missed it somewhere else, the documentation does not describe the layout of the weight buffer w. Could you describe how to build this argument?

Also, does the function support biases, which are commonly used in implementations of this layer?

On a related point, the documentation says that currIdx should be >= 0 during inference, while the release notes for 7.5.1 say that it can be negative.

Thanks,

Guillaume

wtambellini · June 26, 2019, 7:15pm

I have the same request:

The documentation for cudnnSetAttnDescriptor() does not explain the expected parameters.
There is no multi-head attention (MHA) example in the cuDNN samples package.

anon37147145 · August 24, 2019, 8:17am

Looks like this has been addressed in the latest version: Release Notes :: NVIDIA Deep Learning cuDNN Documentation