Debugging/profiling cluster from a node without a GPU...

joemarlin · September 7, 2017, 6:53pm

I have used Nsight on Windows, but am new to the Linux implementation. (Microsoft’s evil empire may be many things, but at least they assimilated a very user friendly IDE…)

I am running into some very basic problems, and was hoping someone has some insight on how to fix them.

Is it possible to run the profiler and debugger from a head node that does not have a GPU?

We have a cluster with 16 GPU nodes on remote machines, but they are called as needed by the main processing task.
When I try to profile or debug test code from the head node, I get an error that says no GPU found.

The main code is a rather convoluted mix of C++/C and XML that has been patched together over the past 20 years, all glued together with CMake.
In order to run it, a script must be called and an mpirun command executed.

How do you change the ‘Run’ button in the IDE to:
a) Execute a series of ‘shell’ commands? ( e.g. mpirun, bash scripts)
b) Make the execution directory a specific location?
The mpi_hostfile and other configuration files use paths relative to the starting directory, and each user will have his own configuration. (Hard-coding paths might work, but would not be a preferable solution.)

Thank you for any help or direction to relevant threads.

veraj · October 11, 2017, 5:03am

Hi, joemarlin

Answers to your questions are below.

1) Nsight does support remote run/profile/debug; you can refer to the CUDA Toolkit Documentation.

2) Run -> Run/Debug/Profile Configurations lets you specify the target application you want to execute. For Run and Profile, it is OK to use a script here as long as it is executable. For Debug, a script will make cuda-gdb report “File format not recognized”, so a real application binary is needed there.

3) For mpirun, you can refer to Profiler :: CUDA Toolkit Documentation.
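
To illustrate points 2) and 3), an executable wrapper along these lines could be used as the target of a Run or Profile configuration. It is only a rough sketch, not something taken from the documentation: the script layout, the `mpi_hostfile` default, and the output file name are hypothetical, and the `-np` / `--hostfile` flags and the `OMPI_COMM_WORLD_RANK` variable assume Open MPI. The optional `nvprof` prefix is meant for runs started from a terminal; `%q{...}` is how nvprof substitutes an environment variable into the output name, so each rank writes its own file.

```python
#!/usr/bin/env python3
"""Hypothetical launcher: run an MPI job from a chosen working directory."""
import subprocess
import sys


def build_command(app, ranks, hostfile="mpi_hostfile", profile=False):
    """Return the mpirun argument list (Open MPI style flags are assumed)."""
    cmd = ["mpirun", "-np", str(ranks), "--hostfile", hostfile]
    if profile:
        # Assumption: nvprof is on PATH on the GPU nodes. %q{VAR} is replaced
        # by nvprof with the value of that environment variable.
        cmd += ["nvprof", "-o", "profile.%q{OMPI_COMM_WORLD_RANK}.nvprof"]
    cmd.append(app)
    return cmd


def launch(workdir, app, ranks=2, profile=False):
    """Start the job with workdir as the current directory.

    Relative paths such as the hostfile are then resolved against workdir,
    so each user can pass a different directory without hard-coding paths.
    """
    return subprocess.call(build_command(app, ranks, profile=profile), cwd=workdir)


# Only launch when a working directory and an application path are given.
if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(launch(sys.argv[1], sys.argv[2]))
```

Saved under some name of your choosing and made executable with `chmod +x`, it would take the working directory and the application path as its two arguments. As noted in 2), this kind of wrapper only applies to Run and Profile; Debug still needs the real binary.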

Hope this helps.

Best Regards
Vera J