I’m trying to use cudnnAddTensor’s broadcasting feature as described in the API reference (https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#cudnnAddTensor):
[i]This function adds the scaled values of a bias tensor to another tensor. Each dimension of the bias tensor A must match the corresponding dimension of the destination tensor C or must be equal to 1. In the latter case, the same value from the bias tensor for those dimensions will be used to blend into the C tensor.
Note: Up to dimension 5, all tensor formats are supported. Beyond those dimensions, this routine is not supported.[/i]
I’ve got 32-bit float tensors A = dims(32,32,1,1,768) and C = dims(32,32,1,128,768). Every dimension of A either matches the corresponding dimension of C or equals 1, so as far as I can tell this should be covered by the rule above, yet cudnnAddTensor fails with CUDNN_STATUS_NOT_SUPPORTED. Does anyone have any clue how to resolve this? Thank you in advance.