Scan code from UIUC ECE 498 AL

Dear community members,

I am wondering if anyone has the source code for the Hall of Fame scan implementation from UIUC ECE 498 AL. The scan optimization by David Lichterman performs very well: it scans 16,000,000 floating-point elements in about 6.4 ms (roughly 2.5 billion elements per second). I found it in a 2007 post on this forum:
[url]https://devtalk.nvidia.com/default/topic/378629/?comment=2701757[/url]. It seems that none of the users in that thread are still active. My own scan implementation is not as fast as the one by David Lichterman listed in the Hall of Fame section, and when I tried to look at his code, it appeared to have already been taken down.

Does anybody know of any high-performance scan source code? The scan in the NVIDIA samples
[url]http://developer.download.nvidia.com/compute/cuda/1.1-Beta/x86_website/samples.html[/url] is not as fast as the speed mentioned in that post.

Thank you for your comments.

If you’re looking for a fast scan, try CUB. It is open source.

[url]http://nvlabs.github.io/cub/[/url]
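
For reference, here is a minimal sketch of a device-wide exclusive prefix sum using cub::DeviceScan::ExclusiveSum. The helper name exclusive_scan_cub and the test size are my own choices for illustration, and error checking is left out, so treat it as a starting point rather than tuned code. It is not the Hall of Fame implementation, and I have not timed it against the 6.4 ms figure.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cub/cub.cuh>

// Hypothetical helper: exclusive prefix sum of h_in computed on the GPU.
// CUDA error checking is omitted for brevity.
std::vector<float> exclusive_scan_cub(const std::vector<float>& h_in) {
    const int n = static_cast<int>(h_in.size());
    std::vector<float> h_out(n);
    if (n == 0) return h_out;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // First call with a null workspace only reports the scratch size needed.
    void* d_temp = nullptr;
    size_t temp_bytes = 0;
    cub::DeviceScan::ExclusiveSum(d_temp, temp_bytes, d_in, d_out, n);
    cudaMalloc(&d_temp, temp_bytes);

    // Second call with the allocated workspace performs the scan.
    cub::DeviceScan::ExclusiveSum(d_temp, temp_bytes, d_in, d_out, n);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_temp);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main() {
    const int n = 1 << 20;  // small enough that float sums of ones stay exact
    std::vector<float> in(n, 1.0f);
    std::vector<float> out = exclusive_scan_cub(in);

    // With all-ones input, the exclusive sum at index i is exactly i.
    for (int i = 0; i < n; ++i) {
        if (out[i] != static_cast<float>(i)) {
            std::printf("mismatch at %d\n", i);
            return 1;
        }
    }
    std::printf("scan OK\n");
    return 0;
}
```

This should build with nvcc on a toolkit that ships CUB (or with the CUB headers on the include path). For a fair timing comparison you would want to time only the second ExclusiveSum call with CUDA events, excluding allocation and host-device copies.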

Thank you. Besides CUB, other libraries are available, e.g., Thrust and CUDPP. I’ve used Thrust, and I did not find it very efficient; its high-level API doesn’t let users fine-tune the kernel parameters.