I have a single Tesla K80 (not sure what else you need to know) running on a VM provided by AWS. The planned application will include both Gromacs and NAMD, and individual invocations of either program could be launched at any time. The big question is whether the Tesla K80 can be shared between applications (concurrent execution by separate processes). Can it? My impression is that there will be limits, possibly even that only one invocation would be permitted at a time. But I need verification.
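For context, my understanding so far (please correct me if I'm wrong) is that the GPU's compute mode controls whether multiple processes may share the device at all. Something like the following, run on the GPU host, is what I had in mind; the device index 0 is an assumption for a single-GPU VM:

```shell
# Query the current compute mode. DEFAULT allows multiple processes
# to share the GPU; EXCLUSIVE_PROCESS limits it to one process.
nvidia-smi --query-gpu=compute_mode --format=csv -i 0

# Restrict the GPU to a single process at a time (requires root):
sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS

# Revert to the default shared mode:
sudo nvidia-smi -i 0 -c DEFAULT
```

Even in DEFAULT mode, though, I assume kernels from different processes would be time-sliced rather than truly concurrent, which is part of what I'm hoping someone can confirm.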
Also, how would you go about restricting usage so that, shareable or not, users could not overload the GPU by running too many jobs at one time? My initial thought is that jobs should be submitted through a queuing system or resource manager (SGE, UGE, Torque, Slurm, MOAB, etc.) to prevent users from stepping on each other's applications. But, again, I would like verification.
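To make the queuing idea concrete, here is the kind of Slurm submission I was picturing. This is only a sketch: it assumes Slurm has been set up with GPU generic resources (a gres.conf entry for the K80), and the job name and partition name are hypothetical.

```shell
#!/bin/bash
#SBATCH --job-name=gromacs-run   # hypothetical job name
#SBATCH --partition=gpu          # hypothetical GPU partition name
#SBATCH --gres=gpu:1             # request one GPU; Slurm queues any
                                 # further GPU jobs until it is free
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

# Slurm exports CUDA_VISIBLE_DEVICES for the allocated GPU, so the
# application only sees the device it was actually granted.
gmx mdrun -deffnm md -nb gpu
```

My hope is that --gres=gpu:1 alone is enough to serialize access, so users never need to coordinate manually, but I'd welcome confirmation that this is the standard approach.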
It seems that no matter what, some facility would have to manage the GPU resources to keep applications from stepping on each other. But this is territory I haven't explored with currently available technology, so I'd like to know what you all know - especially if you have a reference that says categorically whether it can be done, and if so, how.
Thank you in advance.