Recently, Penguin Computing announced an offer for renting time on their clusters optimized for HPC work, specifically on machines with Tesla units attached. I’m very interested in that kind of service - I work as an external consultant, and I prefer to do all of my development on a laptop, having arranged with my employers to give me remote access to a Tesla-equipped machine or two for occasional testing and profiling work. This has worked very well for me so far, as I have had no need to spend money, or space in my apartment, on bulky and noisy towers; however, this kind of arrangement often does not work very well for my employers: usually they do not have people on board with much expertise in administering this kind of machine, and the machine does not get utilized very much anyway, so the setup is not that cost-effective for them. So I think having this kind of resource on demand would be great, and I have been crossing my fingers for more of these offerings to appear. I was thus very excited to learn about the Penguin Computing offering, but I’d also like to know whether anyone else on the forum is aware of, or has any experience with, other similar offerings?
What is great about Penguin Computing entering this market is that these guys have great expertise in HPC clusters. And from what I have been able to learn so far, they did it right: the user is provided with remote access over SSH as expected, the hardware choice is very good (high-end Intel processors, InfiniBand connections if needed, GPU access if needed), and the cluster is managed by Scyld (logically, as Scyld is their product; but Scyld is also, in my experience, a very nice choice, from the user's perspective, for that kind of setup). However, there is a hefty “setup” price attached ($5000, and it seems no hours are included in this), and core hours are not that cheap either (I was told pricing starts from $0.70 per hour). On the other hand, it is rather hard to get more precise information about the various peculiarities of using that kind of system (for example: how exactly the hours used are calculated, what the policies are on driver and software stack updates, etc.): not many details about the offer are available, there is no public mailing list or forum to ask on, and the sales guys are very kind but somewhat slow in providing details.
Of the other similar offerings, I have long been aware of hoopoe, but it seems there has been no progress beyond the alpha testing phase from these guys for a long time; on the other hand, I recently came across a similar offer from ZetaExpress, but I have not inquired about details (their pricing info, however, seems to be publicly available). So I would be interested in any experience forum members have with any system of this type, as well as in general opinions on these offerings. I still think, even with pricing as high as it seems for the Penguin Computing offering, that this kind of service could be cost-effective, at least for development purposes in the kind of setup I described above: a typical high-end multi-GPU machine would cost around $10000, and for that money one should be able to buy enough hours for a couple of years of periodic code testing and profiling, and still save oneself the trouble of maintaining the needed hardware, the accompanying power bills, etc.
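To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python using only the figures quoted above ($10000 for a machine, $5000 setup fee, $0.70 per core hour). It assumes a flat rate with no volume discounts or minimums, and it ignores power and maintenance costs on the owned machine; since it is unclear how the hours used are actually calculated, treat the result as an estimate only. The function name `affordable_core_hours` is made up for illustration.

```python
def affordable_core_hours(budget, setup_fee, rate_per_core_hour):
    """Core hours that can be bought once the one-time setup fee is paid.

    Assumes a flat per-core-hour rate (an assumption, not a confirmed
    billing policy).
    """
    remaining = budget - setup_fee
    if remaining <= 0:
        return 0.0
    return remaining / rate_per_core_hour


# Figures quoted in the post.
MACHINE_COST = 10_000.0  # typical high-end multi-GPU machine, USD
SETUP_FEE = 5_000.0      # one-time setup price, USD
RATE = 0.70              # starting price per core hour, USD

# Assumed time span for "a couple of years" of occasional use.
YEARS = 2

core_hours = affordable_core_hours(MACHINE_COST, SETUP_FEE, RATE)
per_week = core_hours / (YEARS * 52)

print(f"{core_hours:.0f} core hours in total, "
      f"about {per_week:.1f} per week over {YEARS} years")
```

Under these assumptions the machine budget buys a bit over 7000 core hours, or somewhat under 70 core hours per week for two years. A job that occupies several cores at once would use up that allowance proportionally faster in wall-clock terms.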