Benchmark vs. a cluster: how fast are my Teslas?

I recently acquired a Silicon Mechanics machine with three Teslas, and I want to benchmark it against my cluster.
I was thinking of running the HPL and FFT benchmarks, and I have tried (unsuccessfully so far) to compile HPL against CUBLAS.
I did some searching and found some beta software and third-party apps that seem to focus on this, but I want to stick with Nvidia software for the moment.

Also, what benchmark did Nvidia use to get their figure of 83 GFLOPS per Tesla?
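
For comparing numbers between the Tesla box and the cluster, this is roughly how I would turn a timed dense matrix multiply into a GFLOPS figure. It is only a sketch: the function name is made up, and the 2*n^3 operation count for an n x n multiply is my assumption about what to count, not necessarily what Nvidia used for their figure.

```python
def gemm_gflops(n, seconds):
    """Estimate GFLOPS for one n x n by n x n matrix multiply.

    Assumes the usual count of 2 * n**3 floating-point operations
    (n**3 multiplies plus n**3 adds); 'seconds' is the measured wall time.
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    flops = 2.0 * n ** 3
    return flops / seconds / 1e9


# Example: a 4096 x 4096 multiply that took 1.7 s (made-up timing).
print(gemm_gflops(4096, 1.7))
```

The same timing and the same n would have to be used on both the GPU and the cluster for the comparison to mean anything.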