NVBLAS to accelerate Cholesky decomposition (potrf)


I’m trying to run a short piece of code in Python (Anaconda).
Anaconda uses libmkl_rt.so for BLAS calls, and I have set NVBLAS_CONFIG_FILE properly.

import numpy as np
from scipy import linalg

A = np.random.rand(10000, 100000)
AtA = A.dot(A.T)  # symmetric 10000 x 10000 matrix

linalg.cholesky(AtA)  # or linalg.lapack.dpotrf(AtA)

  • np.dot is passed to the GPU (Tesla K80).
  • However, cholesky runs only on the CPU.

Doesn’t NVBLAS target potrf? Is there any package in Python or C that can achieve this, or is a direct call to cuBLAS the only option?

Thanks in advance.

The routines that NVBLAS targets are listed in the documentation:


potrf is not listed there.
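
NVBLAS intercepts Level-3 BLAS routines (gemm, syrk, trsm and similar), not LAPACK routines such as potrf. One possible workaround is a blocked Cholesky written so that most of the work lands in matrix-matrix products. The following is only a minimal sketch: the function name blocked_cholesky and the block size are made up for illustration, and whether the products actually reach the GPU depends on how NumPy's BLAS calls are resolved on your system.

```python
import numpy as np
from scipy import linalg


def blocked_cholesky(a, block=256):
    """Return the lower Cholesky factor L (a = L @ L.T) of an SPD matrix.

    Hypothetical helper for illustration; not part of NumPy, SciPy or NVBLAS.
    """
    a = np.array(a, dtype=np.float64)  # work on a copy
    n = a.shape[0]
    for k in range(0, n, block):
        e = min(k + block, n)
        # Small diagonal block: factor it with LAPACK potrf (stays on the CPU).
        a[k:e, k:e] = linalg.cholesky(a[k:e, k:e], lower=True)
        if e < n:
            # Panel below the diagonal block: L21 = A21 * L11^{-T},
            # computed as a triangular solve on the transposed panel.
            a[e:, k:e] = linalg.solve_triangular(
                a[k:e, k:e], a[e:, k:e].T, lower=True
            ).T
            # Trailing update A22 -= L21 @ L21.T: a Level-3 matrix product,
            # which is the kind of call NVBLAS is designed to offload.
            a[e:, e:] -= a[e:, k:e] @ a[e:, k:e].T
    return np.tril(a)  # discard the untouched upper triangle


# Small usage example (sizes chosen only to keep the run short).
rng = np.random.default_rng(0)
m = rng.random((300, 400))
spd = m @ m.T + 300 * np.eye(300)
L = blocked_cholesky(spd, block=64)
print(np.allclose(L @ L.T, spd))
```

The diagonal-block factorizations still run on the CPU, but for large matrices the trailing updates dominate the operation count.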

potrf is not part of cuBLAS, but is part of cuSOLVER:
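
From Python, one way to reach a GPU Cholesky without writing cuSOLVER calls yourself is a library that wraps it. The sketch below assumes CuPy is installed and a CUDA device is visible; as far as I know, cupy.linalg.cholesky is backed by cuSOLVER's potrf. The helper names _gpu_available and cholesky_lower are made up for illustration, and the code falls back to SciPy on the CPU when no GPU can be used.

```python
import numpy as np
from scipy import linalg


def _gpu_available():
    """Return True if CuPy imports and reports at least one CUDA device."""
    try:
        import cupy as cp  # assumption: CuPy is installed
        return cp.cuda.runtime.getDeviceCount() > 0
    except Exception:
        return False


def cholesky_lower(a):
    """Lower Cholesky factor of an SPD matrix, on the GPU when possible.

    Hypothetical helper for illustration only.
    """
    if _gpu_available():
        import cupy as cp
        a_gpu = cp.asarray(a)               # host -> device copy
        l_gpu = cp.linalg.cholesky(a_gpu)   # lower factor, computed on the GPU
        return cp.asnumpy(l_gpu)            # device -> host copy
    # CPU fallback: LAPACK potrf through SciPy.
    return linalg.cholesky(a, lower=True)


# Small usage example.
rng = np.random.default_rng(0)
m = rng.random((200, 300))
spd = m @ m.T + 200 * np.eye(200)
L = cholesky_lower(spd)
print(np.allclose(L @ L.T, spd))
```

Note that the input and the factor both have to fit in GPU memory for this approach, and the host/device copies add overhead for small matrices.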