Code release: Random Ball Cover (RBC) for fast nearest neighbor

Hi All,

I’m happy to announce the release of the Random Ball Cover (RBC)
package for performing nearest neighbor search on the GPU. The RBC
is new randomized metric data structure designed explicitly for
massively parallel systems. The basics of the data structure, along
with some GPU implementation details, are described in a paper that
will apear at the ADMS workshop (at the Very Large Databases
conference):

L. Cayton, A nearest neighbor data structure for graphics hardware.

The performance of the RBC is quite promising. One experiment that is
in the paper: I ran the code on a database from robotics that contains
1 million points, each of which is 21-dimensional. On the database, I
was able to find the nearest neigbor for 1 million queries in under 4
seconds.

I tried to keep the code as simple and clean as possible. At present,
the CUDA kernels are not highly optimized, but I plan to release a
more tuned version sometime soon. I am also actively expanding the
functionality of the code. If anyone is interesting in getting
involved, please contact me!

The code and paper are available from my website:
http://people.kyb.tuebingen.mpg.de/lcayton/

I’m happy to hear any suggestions/ideas/comments.

Well congratulations, a google search for “random ball cover” returns your google code project page in the top position.

However I’d have preferred if Google had showed me a Wikipedia explaining this algorithm - or your paper even ;)

Christian

Well congratulations, a google search for “random ball cover” returns your google code project page in the top position.

However I’d have preferred if Google had showed me a Wikipedia explaining this algorithm - or your paper even ;)

Christian

Hi All,

I just released a new version of the RBC library. This update adds support
for k-nearest neighbor (NN) search (as opposed to just 1-NN search).

This addition required a substantial rewrite of a couple of basic
kernels. The code now contains a manually-implemented
(Batcher) sorting network which operates in shared memory.
Additionally, the reduction phase of NN search (both in the basic
brute-force version, and in the data structure-based search) is now
implemented as a sorting network merge.

This method is quite fast, and is also memory-efficient; in
particular, it does not require a full set of distances to be stored
in memory, which can be problematic on large data sets.

http://people.kyb.tuebingen.mpg.de/lcayton/

enjoy!
lawrence

Hi All,

I just released a new version of the RBC library. This update adds support
for k-nearest neighbor (NN) search (as opposed to just 1-NN search).

This addition required a substantial rewrite of a couple of basic
kernels. The code now contains a manually-implemented
(Batcher) sorting network which operates in shared memory.
Additionally, the reduction phase of NN search (both in the basic
brute-force version, and in the data structure-based search) is now
implemented as a sorting network merge.

This method is quite fast, and is also memory-efficient; in
particular, it does not require a full set of distances to be stored
in memory, which can be problematic on large data sets.

http://people.kyb.tuebingen.mpg.de/lcayton/

enjoy!
lawrence

How much speed-up does this new version achieve in your one million point matching benchmark over the previous release?

By the way, here’s a direct link to the research paper (for the lazy…)

http://people.kyb.tuebingen.mpg.de/lcayton/rbc-adms.pdf

Christian

How much speed-up does this new version achieve in your one million point matching benchmark over the previous release?

By the way, here’s a direct link to the research paper (for the lazy…)

http://people.kyb.tuebingen.mpg.de/lcayton/rbc-adms.pdf

Christian

That benchmark remains the same; there is new functionality in this release, but this functionality doesn’t affect that benchmark. With that said, I’ll probably release more tuned versions of the kernels at some point. The current performance is pretty good, but can be improved somewhat using the techniques that Volkov & Demmel applied to matrix multiply. On the downside, these optimizations complicate the code.

That benchmark remains the same; there is new functionality in this release, but this functionality doesn’t affect that benchmark. With that said, I’ll probably release more tuned versions of the kernels at some point. The current performance is pretty good, but can be improved somewhat using the techniques that Volkov & Demmel applied to matrix multiply. On the downside, these optimizations complicate the code.

Hi folks,

I’m releasing version 0.2.4 of the RBC code for fast nearest neighbor retrieval on a GPU. This is a fairly minor update: since version 0.2, a few bugs have been fixed, memory leaks caught, and, most importantly, the driver has been simplified considerably. This should make it a bit easier to use.

I recently submitted a paper on the RBC:
L. Cayton, Accelerating nearest neighbor search on manycore systems.

This paper contains the mathematical details underlying the method, and some detailed experimental evaluations, both on a multicore CPU and a GPU. Take a look if you want to understand the method a bit better.

The code and paper are both available from my website:
http://people.kyb.tuebingen.mpg.de/lcayton/

I should have a couple of major new versions in the coming months. enjoy!
lawrence

Hello again,

I’m releasing version 0.2.6 of the Random Ball Cover package. This is another fairly minor release: I found and corrected a few bugs, simplified the driver, cleaned up some code, and added some example input files.

I’ve moved the project over to github at http://github.com/lcayton/RandomBallCover-GPU/. You can also find the latest version for download from my webpage, which has moved to lcayton.com.

best,
lawrence