Engineyard is a company that deals with Ruby web apps, often with distributed cloud computing.
They announced a contest yesterday, starting on the morning of July 20, and it lasts one day.
The basic idea is they give a 160 bit SHA1 hash value, and you have a limited dictionary and mixing rules to try to make a phrase which hashes as close to that target as possible. An exact match is likely impossible, but you just want to match more bits than anyone else.
The prizes are several new iPhones, and some $2000 of cloud compute credit.
The only practical attack to win is raw brute force… compute trillions of hash candidates and keep the best one.
Gee, I wonder if anyone on this forum knows of an accessible technology which has a lot of horsepower? CUDA to the rescue!
So I decided to spend this Saturday afternoon putting together a CUDA program to attack this exact contest. I could keep it private and try to win the contest myself, but it will be a lot more fun to let everyone use it! And we still have over a day to tweak the code if you want to start hacking.
I’m attaching source and compiled code here for a ready-to-run contest cracker! If you win the contest, the prize is yours to keep! (But please, tell us on the forum!)\
This code is crude but it does power through over 50M hashes per second on a Quadro… I bet a 285GTX would do 70M.
It’s run from command line, so if you have multiple GPUs, you can run one instance on each GPU. (No time for threading niceties!). No time for a pretty GUI.
The technical strategy
If you read the SHA1 algorithm, you’ll see that there’s an 80 word lookup table. Ideally we’d like to have every thread do a compute, but there’s not enough shared memory to make such a table. But luckily, if we look at the algorithm, it’s possible to reorder the computation of the table to use just 16 values in a rolling array per thread. At 192 threads per block (for register stall efficiency), that’s 192164=12K… which fits perfectly into our 16K of shared memory.
The strategy is to have the CPU pick some initial words like “tokyo memcached” and then the GPU takes over and powers through 93^5 variations of that phrase by appending 5 characters to the end. (There are 93 printable ASCII characters, and we can append 5 of them.) Each block assigns the first three chars, and inside the block, the threads compute the 93*93 variations.
The host app doesn’t do any work after setting up the base string (which takes a millisecond.). The GPU compute takes about 2 minutes to try the 7 billion hashes. After the GPU finishes, the CPU picks a new base string and the GPU is launched again.
To keep the GPU from locking up, the GPU’s speed is timed as it does blocks. Each kernel launch does perhaps 400 blocks. If that takes a short time like 10 ms, the number of blocks is increased for the next launch. The goal is to get kernels of about 50ms… enough to keep your machine responsive, but not so short that you get efficiency losses due to idle SMs.
This code here is working right now… it’s ready to use in the contest! But hopefully we can tweak it before launch and I invite all hackers to speed it up and find problems before launch.
Even if you’re not interested in hacking, you can still enter the contest! And if you win, the prize is yours. This is your chance to show off your 4-board 295GTX monster! I’d like to show how much power our GPUs can harness. Please also let other people know that there’s a tool for people to use… hopefully we can get it easy enough to use for even non-hackers.
I am NOT involved with the contest or company, I just found it via Twitter and thought it would be great to brute force on a GPU.
Easy! This is windows, but it should work on Linux and Mac. I’m on my laptop now so I haven’t had a chance to try Linux or multi-gpu.
To compile, just:
nvcc -I "E:\CUDA\common\inc" gpusha1.cu e:\cuda\common\lib\cutil32.lib -o gpusha1search.exe
To run, just open a shell and use a command like:
gpusha1search.exe -device 0 6cac827bae250971a8b1fb6e2a96676f7a077b60 Cloud Ruby DHH one eight six active record controller data rspec mongrel MySQL postgresSQL tokyo MRI jruby rubinius memcached exception metaprogramming reflection
Attachments in the next post, let me zip them up with a reference to this thread.