Hi everyone. I have some CUDA code whose performance I urgently need to improve. Does anyone know of a service I can use to run my CUDA program (emulation isn’t sufficient), or is anyone willing to set one up for me briefly (e.g. 2 weeks)? Preferably, I want to use a compute capability 1.2 device. I see there’s the Hoopoe cloud service, but it’s not ready yet.
I have access to a Tesla 1060 at work, but they don’t have any laptops with NVIDIA GPUs to loan. I’ve ordered parts to build a desktop, but I ordered the just-released Gigabyte H57m-USB3 (I definitely want USB3), which won’t arrive until after 2/5.
Update. For $25 (basic service), you only get 900s of GPU time and 7200s of CPU time. They use a crude method of measuring GPU time: you have to call ze_lock and ze_unlock from the shell around launching your program, so the measured time includes CPU time as well. I’ll have to reduce my run times to < 1s so I can get about 900 executions. Still, a pretty good deal.
900s / month for $25, and $0.02/s for overtime. At first, I thought this was expensive, but I only need it for testing purposes, and GPU time is only counted while your program is running (wall clock time for the entire program). Look here to see how they
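To make the pricing above concrete, here is a small back-of-the-envelope sketch in Python. It only uses the figures quoted in this thread ($25 base fee, 900s included, $0.02/s overtime); the function names are made up for illustration, and it assumes overtime is billed per second of wall clock time beyond the included quota, which the provider may handle differently.

```python
def monthly_cost(gpu_seconds, base_fee=25.0, included_seconds=900, overtime_rate=0.02):
    """Estimated monthly bill in dollars for a given amount of billed GPU time.

    Assumption: anything beyond the included quota is charged linearly
    at the overtime rate.
    """
    overtime = max(0, gpu_seconds - included_seconds)
    return base_fee + overtime * overtime_rate


def runs_within_quota(seconds_per_run, included_seconds=900):
    """How many complete runs fit into the included GPU time."""
    return int(included_seconds // seconds_per_run)


if __name__ == "__main__":
    for seconds in (600, 900, 1800):
        print(f"{seconds}s billed -> ${monthly_cost(seconds):.2f}")
    for per_run in (1.0, 2.5):
        print(f"{per_run}s per run -> {runs_within_quota(per_run)} runs within quota")
```

Since the whole program's wall clock time is billed, shortening each test run is what stretches the quota; a run that goes over costs extra at the overtime rate.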
This billing scheme strikes me as kind of ridiculous. Are you competing with multiple users on the same box for GPU time? In that case, I could see how these API calls might be a sort of GPU mutual exclusion hack combined with a logging mechanism.
OK, after using the service for slightly more than a week, here’s my evaluation:
I got my account created about a day after I subscribed. It seems the machines and support staff are located in Singapore. I was assigned to bronze.zetaexpress.com and was told the machine’s only lightly loaded, so they will count neither my GPU nor my CPU usage. I still needed to use ze lock and ze release to get a GPU. I wrapped my executable in a shell script that catches ctrl-C so it always calls ze release.
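For anyone who wants the same "always release" behaviour, here is a rough sketch of the idea in Python rather than the shell script I actually used. The lock and release commands are passed in as arguments because the exact command names and syntax are specific to the service; `run_with_gpu_lock` is a made-up name for illustration.

```python
import subprocess


def run_with_gpu_lock(program, lock_cmd, release_cmd):
    """Run `program` between a lock and a release command.

    Each argument is a command given as a list of strings. The release
    command runs even if the program fails or the user presses ctrl-C.
    """
    # Acquire the GPU first; give up if the lock command itself fails.
    subprocess.run(lock_cmd, check=True)
    try:
        return subprocess.run(program).returncode
    except KeyboardInterrupt:
        # ctrl-C: report the conventional "terminated by SIGINT" exit status.
        return 130
    finally:
        # Always hand the GPU back, whatever happened above.
        subprocess.run(release_cmd)
```

A call would look something like `run_with_gpu_lock(["./my_cuda_app"], ["ze", "lock"], ["ze", "release"])`, where `./my_cuda_app` is a placeholder and the two command lists are my guess at how the service's commands would be spelled as argument lists.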
I believe the OS is 64-bit Red Hat Enterprise. The libraries are a bit old. It’s still using libc5 and barely has any development libraries/includes installed. I spent about an hour copying all the image libs I need and recompiling some due to missing functions in libc5. Also, scp & rsync were broken (rsync uses scp), which I notified them about, but rather than waiting for a fix, I came up with a clever solution:
After I got everything to compile, my program ran exactly as expected. The wall clock time matches what I got on Windows Server to 2 significant figures. Infinite GPU loops can be aborted with ctrl-C, which I can’t figure out how to do on Vista. Overall, the Teslas worked solidly for me and I don’t know what else to say. I didn’t run anything lengthy, so I can’t say anything about stability, but Tesla is designed for high reliability.
If there’s anything else you want to know, feel free to ask.