I am setting up a new home workstation to develop physics models that will benefit from using CUDA.
I have a dual-core 3.0 GHz Pentium CPU and 2 GB of RAM.
I installed a GeForce 9600 GT video card.
Question:
If I want to perform honest tests comparing code that is optimized with CUDA to exploit the GPU against
code which only uses the CPU, do I need to install another video card to use with my monitor (so that the GeForce 9600 GT
is solely used for running my model, not also used for displaying console activity)?
Additional question:
What is the best way to set up a VNC connection that will allow me to display the console on my MacBook Pro?
Microsoft had a free ‘Remote Desktop’ application that was trivial to install and allowed me to view a Windows PC,
but I’m having difficulty finding a similar type of application to view the Linux box (it somehow doesn’t sense the
NVIDIA CUDA driver when I use some of the third-party VNC applications I found on the internet).
Dreaming of getting this new box properly configured to get back to doing physics,
–Mike
The answer is “it depends”. There is no problem running CUDA kernels while sharing the GPU with an active display, and there are normally negligible performance differences (less than 5% in my experience) between a dedicated CUDA GPU and one shared with a display manager. The biggest difference is the existence of the display watchdog timer. On a GPU driving a display, any running kernel must take less than 5 seconds to complete, otherwise it will be killed. Whether this is an issue depends a lot on your code.
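You can check from code whether the watchdog applies to a given device by querying its properties with the CUDA runtime API; a minimal sketch (compile with nvcc):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // kernelExecTimeoutEnabled is non-zero when the display
        // watchdog timer applies to kernels launched on this device,
        // i.e. long-running kernels will be killed.
        printf("Device %d (%s): watchdog %s\n",
               dev, prop.name,
               prop.kernelExecTimeoutEnabled ? "enabled" : "disabled");
    }
    return 0;
}
```

On a box with a second, non-display GPU, that device will typically report the watchdog as disabled.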
I find nxserver (or FreeNX) and nxclient to be the most useful remote X11 desktop solution, and I have no problems working with CUDA inside nxclient on Mac OS X. But none of the Linux remote desktop suites can forward the GPU’s 3D-accelerated view pane (for what it is worth, Microsoft’s Remote Desktop has this limitation as well, and it cannot work with CUDA at all).
One of my codes is very similar to one of the SDK examples: the generation of realizations of random 2D fields via a fast Fourier transform method. I can’t imagine it takes less than 5 seconds to complete, though.
If my codes are not generating video output, do I still need to worry about the 3D-accelerated view pane? Basically, I just have a few terminal windows open:
one for editing code using emacs or vi, one for compiling, and one for running and monitoring some ASCII text output. After an execution is complete, I run
MATLAB to examine the output that has been written to a file.
The 5 second limit is per kernel launch, so it isn’t necessarily as large a restriction as you might imagine. I do a lot of work with explicit finite difference and finite volume methods. In my codes, the computations are split up into a series of sequential kernel launches, each of which only requires a few tens of milliseconds to run per time integration stage. The total GPU wallclock time to finish a simulation can be in the hundreds or thousands of seconds, but it happily runs on a GPU shared with a display because each individual kernel run is relatively short. On the other hand, I have some implicit sparse solvers which can hit the 5 second limit for large problem sizes. As I said, it depends on the nature of your problems and the structure of the code.
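To illustrate the structure, here is a hedged sketch of an explicit time-stepping code split into many short launches; the kernel and the three-point stencil are hypothetical stand-ins for a real finite difference update, not code from my simulations:

```cuda
#include <cuda_runtime.h>

// Hypothetical explicit update: one short kernel launch per time step.
__global__ void step(float *u_new, const float *u_old, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1) {
        // Simple 1D three-point stencil as a stand-in for a real
        // finite difference/finite volume stage.
        u_new[i] = 0.25f * u_old[i - 1] + 0.5f * u_old[i]
                 + 0.25f * u_old[i + 1];
    }
}

int main(void)
{
    const int n = 1 << 20;       // grid points (illustrative size)
    const int nsteps = 10000;    // time steps
    float *u0, *u1;
    cudaMalloc(&u0, n * sizeof(float));
    cudaMalloc(&u1, n * sizeof(float));
    cudaMemset(u0, 0, n * sizeof(float));

    dim3 block(256), grid((n + block.x - 1) / block.x);
    for (int t = 0; t < nsteps; ++t) {
        // Each launch runs for milliseconds, so the watchdog never
        // fires even though the whole simulation takes far longer
        // than 5 seconds of total GPU wallclock time.
        step<<<grid, block>>>(u1, u0, n);
        float *tmp = u0; u0 = u1; u1 = tmp;  // swap buffers
    }
    cudaDeviceSynchronize();

    cudaFree(u0);
    cudaFree(u1);
    return 0;
}
```

The key point is that the watchdog only sees one launch at a time; the host loop between launches resets the clock.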
You will have no problems using NX for that type of remote access.