cuBLAS: cublasGemmBatchedEx / cublasGemmStridedBatchedEx support for DP4A

Do the CUDA 10 versions of "cublasGemmBatchedEx" and "cublasGemmStridedBatchedEx" support DP4A instructions?

In the CUDA 10 documentation, CUDA_R_32I is not listed as a supported compute type for the batched/strided versions. This is in contrast to "cublasGemmEx" (i.e. the non-batched, non-strided version), which explicitly lists CUDA_R_32I as a supported compute type (i.e. 8-bit INT multiply with 32-bit INT accumulate, which allows use of DP4A instructions).
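
For concreteness, here is a minimal sketch of the non-batched int8 call that the docs do list as supported (not my real code; the sizes, handle setup and algo choice are just placeholders): A and B are CUDA_R_8I, while C and the compute type are CUDA_R_32I, so the DP4A path can be used.

```c
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Placeholder sizes; multiples of 4 to stay clear of the int8 alignment restrictions. */
    const int m = 128, n = 128, k = 128;

    int8_t  *dA, *dB;
    int32_t *dC;
    cudaMalloc((void **)&dA, sizeof(int8_t)  * m * k);
    cudaMalloc((void **)&dB, sizeof(int8_t)  * k * n);
    cudaMalloc((void **)&dC, sizeof(int32_t) * m * n);

    cublasHandle_t handle;
    cublasCreate(&handle);

    /* With a CUDA_R_32I compute type, alpha and beta are 32-bit integers. */
    const int32_t alpha = 1, beta = 0;

    cublasStatus_t st = cublasGemmEx(
        handle, CUBLAS_OP_N, CUBLAS_OP_N,
        m, n, k,
        &alpha,
        dA, CUDA_R_8I,  m,      /* lda */
        dB, CUDA_R_8I,  k,      /* ldb */
        &beta,
        dC, CUDA_R_32I, m,      /* ldc */
        CUDA_R_32I,             /* compute type (a cudaDataType_t in CUDA 10) */
        CUBLAS_GEMM_DEFAULT);

    printf("cublasGemmEx status: %d\n", (int)st);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```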

So is DP4A not supported for batched/strided GemmEx?
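
For reference, the call I would like to make looks roughly like the following (again only a sketch: the handle and the int8/int32 device buffers are assumed to be set up as in the snippet above, sized for batchCount matrices laid out back to back, and the algo choice is a placeholder). If the batched/strided path really does not accept CUDA_R_32I, I would expect this to come back as CUBLAS_STATUS_NOT_SUPPORTED rather than run on DP4A.

```c
#include <cublas_v2.h>
#include <stdint.h>

/* Wrapper around the strided-batched call in question. The handle and the
 * device buffers dA/dB/dC are assumed to hold batchCount matrices stored
 * contiguously, one after another. */
static cublasStatus_t try_int8_strided_batched(cublasHandle_t handle,
                                               const int8_t *dA, const int8_t *dB,
                                               int32_t *dC,
                                               int m, int n, int k, int batchCount)
{
    const int32_t alpha = 1, beta = 0;
    const long long strideA = (long long)m * k;
    const long long strideB = (long long)k * n;
    const long long strideC = (long long)m * n;

    return cublasGemmStridedBatchedEx(
        handle, CUBLAS_OP_N, CUBLAS_OP_N,
        m, n, k,
        &alpha,
        dA, CUDA_R_8I,  m, strideA,
        dB, CUDA_R_8I,  k, strideB,
        &beta,
        dC, CUDA_R_32I, m, strideC,
        batchCount,
        CUDA_R_32I,             /* the compute type the batched/strided docs don't list */
        CUBLAS_GEMM_DEFAULT);
}
```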