Hi,

I have a batch of matrices stored in a strided layout, to be processed by cuBLAS' `gemmStridedBatched`. Now I would like to add a constant to the diagonal elements of all matrices, i.e. perform the operation `M[i] = M[i] + c*I`, where `I` is the identity matrix and `c` is a constant that is the same for all matrices in the batch. I looked for a strided batched AXPY, but cuBLAS doesn't seem to have implemented that (for now?). Do you have a hint for how to do this efficiently? In the end, I would like to calculate `M[i] = M[i] + c*I + A[i]*B[i]` efficiently for a batch of small matrices.
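
To pin down the intended result, here is a minimal CPU reference sketch of the operation. It is not a GPU implementation. It assumes square `n x n` matrices in column-major order with leading dimension `ld`, and, purely for illustration, the same element stride for `M`, `A` and `B`. The function names `add_to_diagonal_strided` and `fused_reference` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: M[i] += c * I for every matrix in a strided batch.
// Assumed layout: column-major, leading dimension ld, matrix i starts at M + i * stride.
void add_to_diagonal_strided(float* M, int n, int ld, std::size_t stride,
                             int batch, float c) {
    assert(ld >= n);
    for (int i = 0; i < batch; ++i) {
        float* Mi = M + static_cast<std::size_t>(i) * stride;
        for (int j = 0; j < n; ++j) {
            // Element (j, j) sits at offset j * ld + j = j * (ld + 1).
            Mi[static_cast<std::size_t>(j) * (ld + 1)] += c;
        }
    }
}

// Hypothetical reference for M[i] = M[i] + c*I + A[i]*B[i] (no transposes).
void fused_reference(float* M, const float* A, const float* B, int n, int ld,
                     std::size_t stride, int batch, float c) {
    add_to_diagonal_strided(M, n, ld, stride, batch, c);
    for (int i = 0; i < batch; ++i) {
        const std::size_t base = static_cast<std::size_t>(i) * stride;
        for (int col = 0; col < n; ++col) {
            for (int row = 0; row < n; ++row) {
                float acc = 0.0f;
                for (int k = 0; k < n; ++k) {
                    // A(row, k) * B(k, col) in column-major indexing
                    acc += A[base + k * ld + row] * B[base + col * ld + k];
                }
                M[base + col * ld + row] += acc;  // accumulate into M
            }
        }
    }
}
```

The loop nest in `add_to_diagonal_strided` touches only `batch * n` elements, so on the GPU it could plausibly be a small custom kernel with one thread per (matrix, diagonal index) pair. The product term matches what `gemmStridedBatched` computes, `C[i] = alpha*A[i]*B[i] + beta*C[i]`, so calling it with `C = M`, `alpha = 1` and `beta = 1` after the diagonal update would accumulate `A[i]*B[i]` into `M[i]`.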

Peter