Torque and NVIDIA GPUs: help needed to configure Torque to work with an NVIDIA GPU


Here is the problem:

I must configure a computer equipped with a CUDA-capable graphics card (an 8800 GTX at the moment) to work as a cluster node with Torque.

Torque 2.1.8 is currently working well, using the 4 cores of the Core 2 Quad and the 2 GB of RAM, but it does not use the graphics card.

I think I need to configure the server to use a special resource, with something like this:

[codebox]File torque/server_priv/nodes:

hostname np=4 gpu=1 thread=128[/codebox]

And configure the MOM (the Torque component installed on each node) to use the GPU.

Any ideas ?


Then your job configuring Torque is basically done. Presuming your cluster is heterogeneous (i.e. not all nodes have graphics cards), you will need to create a resource for the node or nodes with GPUs. Jobs that need the GPUs must then specify that resource, so that the scheduler knows to allocate nodes with the correct hardware to your job.
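A minimal sketch of how that might look (the node names and the `gpu` property name here are hypothetical; a Torque of this vintage treats `gpu` as an arbitrary node property, not a managed resource):

```shell
# server_priv/nodes: tag the GPU-equipped node with a custom property.
# "gpu" is just an arbitrary label the scheduler matches against.
#   node01 np=4 gpu
#   node02 np=4

# A job that needs the GPU then requests a node with that property:
qsub -l nodes=1:gpu job_script.sh
```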

But that is incidental, because nothing you do to the cluster scheduling software is going to automagically make your application use the GPU. You need an application with CUDA support and the appropriate NVIDIA runtime libraries installed on the node. You then need to configure your job scripts to set the appropriate runtime environment variables, so that your CUDA application can find everything it needs when the MOM (or whatever launch daemon you use) forks your job on the node.
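As a sketch of the second part (the install prefix and application name are assumptions; adjust them for your setup), a job script might set the environment like this:

```shell
#!/bin/sh
#PBS -l nodes=1:gpu

# Point the dynamic linker and PATH at the CUDA runtime
# (/usr/local/cuda is the common default prefix; yours may differ).
export LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda/bin:$PATH

# Run from the directory the job was submitted from.
cd $PBS_O_WORKDIR
./my_cuda_app   # hypothetical CUDA-enabled binary
```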

I suspect the latter requirement is a far bigger problem for you than the former.

Yes, the problem is that I need to make any application use the GPU, even non-CUDA ones >.<

If needed, I can write a program with CUDA to do it, but I would like to avoid that :whistling:

It would make the GPU execute the program instead of the CPU, allocate the memory on the graphics card, and so on. I have not thought it through yet.

It would take some time to do… :unsure:

Is there another solution?

The GPU is a different architecture from the CPU. You can’t just run “regular” code on the GPU — you have to write for it specifically.

I think you have a fundamental misunderstanding of what CUDA is and what CUDA-capable video cards can do. What you are proposing is impossible. The GPU is a completely different architecture from the host CPU. It cannot run host code, and it cannot be scheduled by the host process scheduler.

Back to the drawing board, sorry…

Ok, thanks :thumbup:
With such a mistake, even the drawing board would laugh at me >.<

So, if I run CUDA programs with qsub, I suppose that CUDA will manage the resources of the card (threads and memory allocation) used by those programs.
If my supposition is right, where can I get some documentation on it?

I think this goes to the heart of your misunderstanding. You don’t run CUDA programs as such; you run one host program containing CUDA kernels per GPU at a given time. This means your cluster scheduling software must only fork one process per node per free GPU. An outline of how to do that was contained in my original reply to you.
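One way to enforce that (a sketch only; the `gpu` property and the processor count of 4 are assumptions carried over from the earlier posts) is to have GPU jobs request every processor slot on the node, so the scheduler cannot place a second job there:

```shell
# Hypothetical: request one GPU-tagged node and all 4 of its slots
# (matching np=4 in the nodes file), giving the job exclusive use
# of that node and therefore of its single GPU.
qsub -l nodes=1:gpu:ppn=4 job_script.sh
```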

CUDA documentation can be found here: