cudnnOpTensor missing ops vs TensorRT

It would be very useful if cudnnOpTensor could be extended to support common element-wise/unary operations that it is currently missing but that TensorRT already provides.

Currently cudnnOpTensor supports the following ops:

element-wise: add, mul, min/max

unary: sqrt, not (which cuDNN defines as 1 − A, i.e. logical negation rather than arithmetic neg)

In contrast, TensorRT supports:

element-wise: add/sub, mul/div, max, pow

unary: exp/log, sqrt, abs, neg, recip

It would be great if cudnnOpTensor could be updated to add the currently TensorRT-only ops:

element-wise: sub, div, pow

unary: exp/log, abs, recip

I’ve been wishing these were available for a long time, but it would especially make sense to add them now that they are available via TensorRT!

Is there any possibility of these being added?


I second this. I built some Go wrappers for cuDNN and am using them to build my own deep learning toolkit. If I want to use a training algorithm like Adam, AdaGrad, or AdaDelta, I have to run my own kernels in another context. That would be fine if more than one context could run on a GPU at a time, but when I am training several networks concurrently it means halting everything just to do a weight update for each network. For a library that touts “cuDNN accelerates widely used deep learning frameworks, including Caffe2, MATLAB, Microsoft Cognitive Toolkit, TensorFlow, Theano, and PyTorch,” it surprisingly lacks some fundamental functions used in deep learning. Please do something about this.


ps. Ben, you can do subtraction already: just set one of the scalars to −1 when using add. But I agree with you that everything else should be included.
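Concretely, since cudnnOpTensor with CUDNN_OP_TENSOR_ADD computes C = (alpha1*A + alpha2*B) + beta*C per element, passing alpha2 = −1 yields A − B. A CPU sketch of the identity (the real call would go through a cudnnOpTensorDescriptor_t configured with CUDNN_OP_TENSOR_ADD; the function names here are illustrative):

```c
/* cudnnOpTensor(ADD) semantics per element:
 *   C = (alpha1*A + alpha2*B) + beta*C */
static double op_add(double alpha1, double A, double alpha2, double B,
                     double beta, double C) {
    return (alpha1 * A + alpha2 * B) + beta * C;
}

/* With alpha1 = 1, alpha2 = -1, beta = 0 the add op computes A - B. */
static double sub_via_add(double A, double B) {
    return op_add(1.0, A, -1.0, B, 0.0, 0.0);
}
```

The same scalar trick does not rescue div, pow, exp, log, abs, or recip, which is why those genuinely need new op enum values.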