CUDA 7.5 failing on Amazon EC2

Recently my CUDA 7.5 Windows Server images have been failing. I can no longer RDP into them.

I have tried a fresh Server 2012 install with the CUDA drivers, and I cannot connect to the instance once the CUDA toolkit is installed, even after multiple restarts. I have tried this in multiple regions on g2.2xlarge and g2.8xlarge instances.

I think the servers have recently been changed somehow and are no longer compatible with CUDA 7.5.

Anyone else having problems?

This is a definite problem. CUDA 7.5 no longer installs properly on EC2 Windows Server 2012 R2 instances. It was working a few weeks ago. Something at Amazon changed.

My CUDA 6 images are working fine. As soon as I upgrade, I cannot reconnect to the servers, even after a reboot.

Even on new 2012 R2 images, as soon as you install 7.5 the instance stops responding and I have to terminate it.

Amazon or NVIDIA, please fix this if you're reading!

Luke

I doubt Amazon reads these forums. On the NVIDIA side, we’re aware of an issue with CUDA 7.5 on Linux AMIs:

https://devtalk.nvidia.com/default/topic/880246/cuda-setup-and-installation/cuda-7-5-unstable-on-ec2-/

A fix for that issue is in the works and is expected in a future GPU driver.

One possibility would be to wait for a fixed Linux driver to appear, and then try a Windows driver released after it.

In the meantime, the workaround (for the above Linux issue) is to use a previous CUDA version.

You can also file a bug at developer.nvidia.com (recommended).

There may also be ways to log issues with Amazon, but this forum is not for that.

I tried to submit a bug and their bug server is down… hahaha. Classy.

Everything is broken with NVIDIA right now.