Hi everybody.
I am looking for a way to calculate pairwise Euclidean distances between two datasets, or within a single dataset.
By this I mean: given two sets of feature vectors (x1,…,xn) and (y1,…,ym), compute the n×m matrix D where Dij=dist(xi,yj).
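To make the definition concrete, here is a minimal CPU-side sketch in plain Python. It is only meant to pin down what D is, not to be fast; the function name `pairwise_dist` and the sample points are made up for illustration.

```python
import math

def pairwise_dist(X, Y):
    """Return D as a list of rows, with D[i][j] = Euclidean distance between X[i] and Y[j]."""
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))) for y in Y]
        for x in X
    ]

# Hypothetical sample data: n = 2 points and m = 3 points in 2 dimensions.
X = [[0.0, 0.0], [3.0, 4.0]]
Y = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]

D = pairwise_dist(X, Y)
print(D)
```

Passing the same set twice, `pairwise_dist(X, X)`, gives the distances inside one dataset.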
The applications are computing RBF kernels, nearest neighbours and k-means clustering.
I implemented a CUDA kernel modelled on the matrix multiplication example in the SDK, and it is reasonably fast.
But as Alex Smola posted in his blog, an easy alternative
is to use the second binomial formula, |x-y|^2 = |x|^2 - 2<x,y> + |y|^2, so that all the cross terms <xi,yj> come from a single matrix product.
Using the matrix product in CUBLAS, I get an algorithm that is 3 times faster than my kernel.
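For anyone who wants to see the idea spelled out, below is a rough sketch of the expansion in plain Python. The helper `gram` is a hypothetical stand-in for the matrix product X·Y^T that I get from CUBLAS; the names and sample data are invented for illustration, and the clamp at zero is my own precaution against rounding, not part of the formula.

```python
import math

def gram(X, Y):
    """G[i][j] = <X[i], Y[j]>, i.e. X times Y transposed (stand-in for the GEMM call)."""
    return [[sum(a * b for a, b in zip(x, y)) for y in Y] for x in X]

def pairwise_dist_via_product(X, Y):
    """Distances from |x-y|^2 = |x|^2 - 2<x,y> + |y|^2."""
    x_sq = [sum(a * a for a in x) for x in X]   # squared norms of the rows of X
    y_sq = [sum(b * b for b in y) for y in Y]   # squared norms of the rows of Y
    G = gram(X, Y)                              # all cross terms in one product
    return [
        # max(..., 0.0) guards against tiny negative values caused by rounding
        [math.sqrt(max(x_sq[i] + y_sq[j] - 2.0 * G[i][j], 0.0)) for j in range(len(Y))]
        for i in range(len(X))
    ]

# Hypothetical sample data: 2 points and 3 points in 2 dimensions.
X = [[0.0, 0.0], [3.0, 4.0]]
Y = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]

D = pairwise_dist_via_product(X, Y)
print(D)
```

The only O(n·m·d) step is the product itself; the squared norms and the final combination are cheap by comparison, which is presumably why a tuned matrix product pays off here.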
I was wondering whether there is a way to compute the distances directly,
without going through the matrix multiplication first, that is at least as fast as the version based on the CUBLAS
matrix product.
Any comments on this would be helpful.
Cheers,
Andy