Are there any examples that use CUBLASLT_MATMUL_DESC_BIAS_POINTER?

Hi,

I am trying to implement D = A * B + C, where C is a bias vector. Because cublasLtMatmul currently only supports the case where C == D and Cdesc == Ddesc, I turned to setting the bias directly once I found that cublasLtMatmulDescSetAttribute with CUBLASLT_MATMUL_DESC_EPILOGUE and CUBLASLT_MATMUL_DESC_BIAS_POINTER should do what I need.
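To make the intent concrete, here is a minimal CPU sketch of what I understand CUBLASLT_EPILOGUE_BIAS to compute. It is only a reference for checking results, not cuBLASLt code: ref_matmul_bias is a made-up helper name, and I am assuming column-major float data and a bias vector with one entry per row of D, broadcast across all columns.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for D = A * B + bias in column-major layout.
 * A is m x k (lda >= m), B is k x n (ldb >= k), D is m x n (ldd >= m).
 * bias has m entries; bias[row] is added to every column of that row.
 * ref_matmul_bias is a hypothetical helper, not part of cuBLASLt. */
static void ref_matmul_bias(size_t m, size_t n, size_t k,
                            const float *A, size_t lda,
                            const float *B, size_t ldb,
                            const float *bias,
                            float *D, size_t ldd)
{
    /* Leading dimensions must cover a full column. */
    assert(lda >= m);
    assert(ldb >= k);
    assert(ldd >= m);

    for (size_t col = 0; col < n; ++col) {
        for (size_t row = 0; row < m; ++row) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += A[row + p * lda] * B[p + col * ldb];
            /* Epilogue step: add the per-row bias after the product. */
            D[row + col * ldd] = acc + bias[row];
        }
    }
}
```

Running the same inputs through this function and through the GPU path (once it works) should give matching D up to rounding, which is how I plan to validate the epilogue.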

I first set CUBLASLT_MATMUL_DESC_BIAS_POINTER to a fake pointer, planning to load the bias later and set the attribute again.

But cublasLtMatmulAlgoGetHeuristic returns CUBLAS_STATUS_NOT_SUPPORTED whenever CUBLASLT_MATMUL_DESC_EPILOGUE and CUBLASLT_MATMUL_DESC_BIAS_POINTER are set.

I searched for examples but found none. I could set CUBLASLT_MATMUL_DESC_BIAS_POINTER to a real pointer and try again, but I'm not sure that would work, and I expect to run into other problems later. So I would like to confirm that this mechanism is really ready to use, and I hope to find an example that shows how to use it properly.

Any help or pointers would be appreciated!

Update: I have now tried it with a real pointer:

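static const uint32_t epi_bias[] = { CUBLASLT_EPILOGUE_BIAS };
uint16_t *host, *device;

// allocate mapped pinned host memory for the bias (out elements, 16 bits each)
ret = cudaHostAlloc((void **)&host, (size_t)(out * sizeof(uint16_t)), cudaHostAllocMapped);
assert(!ret);
// get the device-side address of the same buffer
ret = cudaHostGetDevicePointer((void **)&device, host, 0);
assert(!ret);
// enable the bias epilogue on the matmul descriptor m
ret = cublasLtMatmulDescSetAttribute(m, CUBLASLT_MATMUL_DESC_EPILOGUE, epi_bias, sizeof(uint32_t));
assert(!ret);
// pass the address of the bias pointer, a real one now
ret = cublasLtMatmulDescSetAttribute(m, CUBLASLT_MATMUL_DESC_BIAS_POINTER, (const void *[]) { device }, sizeof(const void *));
assert(!ret);
// note: the buffer must outlive any cublasLtMatmul call that uses it
ret = cudaFreeHost(host);
assert(!ret);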

It fails as well.

ret = cublasLtMatmulAlgoGetHeuristic(hwd, m, w, i, o, o, pref, 1, result, &count);

It still returns CUBLAS_STATUS_NOT_SUPPORTED.

Is this mechanism not working yet?