Callback functions in cuBLAS


I am wondering whether cuBLAS supports callback functions, similar to those of the cuFFT library as described here.

I would like to use the cuBLAS library for a general matrix-matrix multiplication (GEMM), but I need to execute some additional operations before and after the GEMM. The full sequence of steps is listed below.

  1. transposition
  2. complex matrix-matrix multiplication (GEMM)
  3. calculate absolute values
  4. integration over time
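
To make the four steps concrete, here is a minimal CPU reference sketch in C++ of the pipeline I have in mind. It is not cuBLAS code and not my actual implementation. The names `Matrix`, `transpose`, `gemm`, `accumulate_abs`, and `process` are hypothetical, and I am assuming here that "integration over time" means summing the magnitudes element-wise over successive input frames.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical row-major complex matrix, for illustration only.
struct Matrix {
    std::size_t rows, cols;
    std::vector<cfloat> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
    cfloat& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    const cfloat& at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Step 1: transposition (plain transpose, no conjugation).
Matrix transpose(const Matrix& a) {
    Matrix t(a.cols, a.rows);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < a.cols; ++j)
            t.at(j, i) = a.at(i, j);
    return t;
}

// Step 2: complex matrix-matrix multiplication, C = A * B.
Matrix gemm(const Matrix& a, const Matrix& b) {
    assert(a.cols == b.rows);
    Matrix c(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < b.cols; ++j) {
            cfloat sum(0.0f, 0.0f);
            for (std::size_t k = 0; k < a.cols; ++k)
                sum += a.at(i, k) * b.at(k, j);
            c.at(i, j) = sum;
        }
    return c;
}

// Steps 3 and 4: take |C| element-wise and add it to a running sum.
// Assumption: "integration over time" is a plain sum over frames.
void accumulate_abs(const Matrix& c, std::vector<float>& acc) {
    assert(acc.size() == c.data.size());
    for (std::size_t n = 0; n < acc.size(); ++n)
        acc[n] += std::abs(c.data[n]);
}

// Whole pipeline: one transpose + GEMM + abs per frame, summed over frames.
std::vector<float> process(const std::vector<Matrix>& frames, const Matrix& b) {
    std::vector<float> acc;
    for (const Matrix& x : frames) {
        Matrix c = gemm(transpose(x), b);
        if (acc.empty()) acc.assign(c.data.size(), 0.0f);
        accumulate_abs(c, acc);
    }
    return acc;
}
```

In this unfused form, the transposed input and the intermediate complex result `c` are each written out and read back once per frame. On the GPU, those are the global memory round trips I would like to avoid.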

It would be desirable to run processing steps 1, 3, and 4 as device callback functions during the execution of the cuBLAS GEMM kernel, to avoid or at least reduce loading input data from and storing output data to global memory. So far I have not found any information about this in the cuBLAS documentation or in other sources. I would like to know whether cuBLAS supports callback functions or will support them in a future release. If not, is there an alternative that does not require writing a complete GEMM kernel?

Thanks in advance!