How to set a fixed tile size in cuBLAS?

Hi there,

I found the following in this link:
https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#tile-quant. Figure 7 shows the tile quantization effect on an NVIDIA A100-SXM4-80GB with CUDA 11.2 and cuBLAS 11.4, and states that it was measured with a function that forces the use of 256x128 tiles over the MxN output matrix. However, as far as I know, cuBLAS does not let you set a fixed tile size. Please correct me if I am wrong. How can a fixed tile size be set, as was done in this experiment (Figure 7)?

Hi! It should be possible to set the tile size through the cuBLASLt API, though in practice it might be a bit tricky. See https://github.com/NVIDIA/CUDALibrarySamples/~/cuBLASLt/LtDgemmPresetAlgo/sample_cublasLt_LtDgemmPresetAlgo.cu#L89 for an example that uses cublasLtMatmulAlgoConfigSetAttribute() to set the tile size. You should also check that the desired tile is supported by the selected algorithm; cublasLtMatmulAlgoCapGetAttribute() returns the list of tiles an algorithm supports.