Confirming Multi-GPU Use

As I slowly explore using the Tesla S and MPI on a 4-core box, I began to wonder: is there a way to confirm that multiple GPUs are actually being used by my accelerated region(s)? Say, a more verbose ACC_NOTIFY message that includes a device number? Or maybe a top/htop for GPUs?

I’ve heard about using GPU temperatures, but I’m not sure that works with a Tesla S box, especially since the Tesla and host are in a rack that I’m accessing remotely.

Hi Matt,

You can use ACC_NOTIFY, but you need to make sure it’s set in each process’s environment. So either set ACC_NOTIFY=1 in your .cshrc or .bashrc file, or wrap your application in a script that first sets the environment and then runs the app. mpirun would then launch the script, not the app.
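
The wrapper approach might look roughly like the sketch below. It assumes a POSIX shell on every node; the file name run_acc.sh is hypothetical, and the script simply passes through whatever command line it is given.

```shell
#!/bin/sh
# run_acc.sh (hypothetical name): a wrapper that mpirun launches
# in place of the application itself.

# Put ACC_NOTIFY into this process's environment so the program
# started below inherits it.
ACC_NOTIFY=1
export ACC_NOTIFY

# Replace the shell with the real application and its arguments,
# which are passed to this script on its command line.
exec "$@"
```

After making the script executable, the launch line would be something like: mpirun -np 4 ./run_acc.sh ./myapp (the process count and the application name are placeholders). Each rank then starts with ACC_NOTIFY=1 in its own environment, regardless of what the login shell on that node sets.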

- Mat