Computing the Mahalanobis distance with CUDA

_domi · March 31, 2014, 7:24pm

Hi,

I’m trying to modify the code that computes the Euclidian distance from: http://home.hiroshima-u.ac.jp/tamaki/study/cuda_softassign_emicp/ to compute instead the Mahalanobis distance.
Is there an example in CUDA ? even with cuBLAS implementation ?

Up to now i found the implementation of the covariance matrix calculation:

__global__ static void
updateA(int rowsA, int colsA, int pitchA,
	const float* d_Xx, const float* d_Xy, const float* d_Xz, 
	const float* d_Yx, const float* d_Yy, const float* d_Yz,
	const float* d_R, const float* d_t,
	float* d_A,
	float sigma_p2){

...

// compute covariance matrix using cublas

      // cublasSgemm (char transa, char transb, int m, int n, int k, float alpha, 
      //              const float *A, int lda, const float *B, int ldb, float beta, 
      //              float *C, int ldc)
      //   C = alpha * op(A) * op(B) + beta * C,
      //
      // m      number of rows of matrix op(A) and rows of matrix C
      // n      number of columns of matrix op(B) and number of columns of C
      // k      number of columns of matrix op(A) and number of rows of op(B) 
	  cublasSgemm('n', 't',
	              rowsA, rowsA, colsA, // int m, n (rows of A), k (cols of A) ; not op(A)
		      1.0f,         // float alpha
		      d_A, pitchA,  // const float *A, int lda
		      d_A, pitchA,  // const float *A, int lda
		      0.0f,         // float beta
                      d_Covariance, rowsA);

...

// #define Euclid(a,b,c) ((a)*(a)+(b)*(b)+(c)*(c))
//     float tmp =
//       Euclid(Xx - (R(0)*Yx + R(1)*Yy + R(2)*Yz + t(0)),
//              Xy - (R(3)*Yx + R(4)*Yy + R(5)*Yz + t(1)),
//              Xz - (R(6)*Yx + R(7)*Yy + R(8)*Yz + t(2)) );
    
//     tmp = expf(-tmp/sigma_p^2)


     float tmpX = Xx - (R(0)*Yx + R(1)*Yy + R(2)*Yz + t(0));
     float tmpY = Xy - (R(3)*Yx + R(4)*Yy + R(5)*Yz + t(1));
     float tmpZ = Xz - (R(6)*Yx + R(7)*Yy + R(8)*Yz + t(2));

    __syncthreads();

     tmpX *= tmpX;
     tmpY *= tmpY;
     tmpZ *= tmpZ;

     tmpX += tmpY;
     tmpX += tmpZ;

     tmpX /= sigma_p2;
     tmpX = expf(-tmpX);

Not sure if the covariance using cublasSgemm is correct.
How do i compute the inverse of the covariance matrix ?

Maybe there is a faster method like in OpenCV that has function call to mahalanobis distance.
Right now i can’t switch to an OpenCV implementation because of the time constraint.

Thank You,
Domi

CudaaduC · April 1, 2014, 3:18am

In most cases you can use cuBLAS Strsm():

http://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-trsm

Also,if you are going to make a cuBLAS call from a device kernel (which I have never done before, usually call from host) , you are going to need a GPU which supports Dynamic Parallelism.

It makes more sense to call from host, does the above work correctly?

_domi · April 1, 2014, 1:48pm

Thank you for the info on inverse calculation.

The code works (that’s how it is in the reference code so i did not change it), not sure if the values passed would yield the right result. I will need to cross reference with Matlab.

I’m just surprised nobody implemented the Mahalanobis distance computation using GPU, everybody is just saying how suitable this is for GPU.

Thanks,
Domi

jakub.mitura14 · September 9, 2021, 9:15am

@_domi would you mind to share working example? and can it be used to comparing two 3 dimensional Arrays?

Topic		Replies	Views
Incorrect inversion for matrix with CUBLAS CUDA Programming and Performance	4	1293	December 22, 2016
CUBLAS VS CBLAS sgemv Benchmarking matrix-vector operations on GPU and CPU CUDA Programming and Performance	5	10029	March 24, 2014
Question about cublas demo matrixMulCUBLAS CUDA Programming and Performance	6	2191	May 11, 2015
I am calculating distance between two points without using threads but i need to calculate distance CUDA Programming and Performance	4	1541	January 21, 2015
Help with CUBLAS performance and timing issues, please help... CUDA Programming and Performance	1	3441	December 26, 2008
Incorrect results when using cublas matrix multiplication GPU-Accelerated Libraries	1	1512	April 28, 2016
Matrix inverse usng linear system solver through cublas , cublasCreate exception or something else CUDA Programming and Performance	1	4615	June 16, 2013
GTX 660 and Nano performance drop-off after sustained matrix multiplies CUDA Programming and Performance	16	795	July 15, 2022
cuBLAS convolution does not use Tensor Cores GPU-Accelerated Libraries cublas	6	2196	June 8, 2021
Calling cublasSgbmv Incorrect results CUDA Programming and Performance	1	1500	September 29, 2009

Computing the Mahalanobis distance with CUDA

Related topics