Can I reserve CUDA cores for my application?

Hi,
For a new project I need to write an application that uses the GPU. It will need to do calculations on the CUDA cores, and I would like to reserve almost all of the CUDA cores for this, say 90%.
Is that possible? I do not care if the GUI gets a bit slower because of that.

I know there is a way to reserve CPU cores for your application. Can such a thing be done with the GPU too?

Hi,

No, the GPU resources on Jetson are shared in a time-slicing manner.
Thanks.

And how can we influence this?
Can we request more or bigger slices? Or can we disable the other contenders?

Hi,

Currently, we don’t provide a way to control the GPU scheduler.

Is it possible for your use case to use just one process?
Since all threads in the same process share the identical CUDA context, you can check whether Green Contexts can help:

https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__GREEN__CONTEXTS.html
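
For reference, here is a minimal sketch of carving out a fixed group of SMs with the Green Context driver API (assuming CUDA 12.4 or newer; the SM count of 16 is just an example value, and error handling is abbreviated):

```c
#include <cuda.h>
#include <stdio.h>

#define CHECK(call) do { CUresult r = (call); if (r != CUDA_SUCCESS) { \
    fprintf(stderr, "CUDA error %d at line %d\n", r, __LINE__); return 1; } } while (0)

int main(void) {
    CHECK(cuInit(0));
    CUdevice dev;
    CHECK(cuDeviceGet(&dev, 0));

    /* Query the device's full SM resource. */
    CUdevResource all_sms;
    CHECK(cuDeviceGetDevResource(dev, &all_sms, CU_DEV_RESOURCE_TYPE_SM));

    /* Split off one group with at least 16 SMs (example value);
       the leftover SMs land in `remaining`. */
    CUdevResource group, remaining;
    unsigned int nb_groups = 1;
    CHECK(cuDevSmResourceSplitByCount(&group, &nb_groups, &all_sms,
                                      &remaining, 0, 16));

    /* Turn the group into a descriptor and create the green context. */
    CUdevResourceDesc desc;
    CHECK(cuDevResourceGenerateDesc(&desc, &group, 1));
    CUgreenCtx gctx;
    CHECK(cuGreenCtxCreate(&gctx, desc, dev, CU_GREEN_CTX_DEFAULT_STREAM));

    /* Work launched on a stream created from this context only
       runs on the reserved SM group. */
    CUstream stream;
    CHECK(cuGreenCtxStreamCreate(&stream, gctx, CU_STREAM_NON_BLOCKING, 0));

    printf("green context created (%u SM group)\n", nb_groups);
    CHECK(cuStreamDestroy(stream));
    CHECK(cuGreenCtxDestroy(gctx));
    return 0;
}
```

Note this only partitions SMs within one CUDA context; it does not stop other processes from time-slicing the GPU.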

Thanks.

Thanks AastaLLL.

Green Contexts look very interesting, but I'm not sure I understand them correctly.

Do I understand correctly that when I create a Green Context, I can specify how many SMs to use, and that these SMs will then only be available to threads within the process in which the context was created?

That would be great. If not, which part did I not understand correctly?

Hi,

To use a Green Context, all the tasks need to share the same CUDA context.

On Jetson, this requires all the tasks to run in the same process, since each process on Jetson has its own CUDA context.

Thanks.

I’m not a Linux expert.
How can I run all GPU tasks in the same process? I guess there is at least some graphics task. Can I run one of my tasks in the same process as the graphics?

Hi,

You will need to implement a single process that creates different threads for the tasks.
It should be okay if some default rendering tasks cannot use the same context.
These jobs are usually lightweight and won't occupy too many GPU resources.

Process 1: Rendering
Process 2: CUDA pre-processing, inference, CUDA post-processing.

Processes 1 and 2 will share the GPU in a time-slicing manner.
But since process 1 is relatively lightweight, it won't occupy the GPU for long.

In process 2, you can decide the resources for pre-processing, inference, and post-processing via Green Contexts.
With careful design, you should be able to reserve the resources for a particular task.
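
As an illustration of that design, here is a hedged sketch (CUDA 12.4+ driver API; the group size of 8 SMs and the three-stage split are made-up example values) that partitions the SMs into one group per pipeline stage and gives each stage its own stream:

```c
#include <cuda.h>
#include <stdio.h>

/* Error checking omitted for brevity in this sketch. */
int main(void) {
    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);

    /* Query all SMs, then split them into one group per stage:
       pre-processing, inference, post-processing. */
    CUdevResource all_sms, groups[3], remaining;
    unsigned int nb_groups = 3;
    cuDeviceGetDevResource(dev, &all_sms, CU_DEV_RESOURCE_TYPE_SM);
    /* Each group gets at least 8 SMs (arbitrary example value);
       the driver may return fewer groups if the device is small. */
    cuDevSmResourceSplitByCount(groups, &nb_groups, &all_sms,
                                &remaining, 0, 8);

    CUstream streams[3];
    for (unsigned int i = 0; i < nb_groups; ++i) {
        CUdevResourceDesc desc;
        CUgreenCtx gctx;
        cuDevResourceGenerateDesc(&desc, &groups[i], 1);
        cuGreenCtxCreate(&gctx, desc, dev, CU_GREEN_CTX_DEFAULT_STREAM);
        cuGreenCtxStreamCreate(&streams[i], gctx, CU_STREAM_NON_BLOCKING, 0);
        /* Each worker thread in the process would launch its stage's
           kernels on its own stream here, confined to that group's SMs. */
    }
    printf("created %u green-context streams\n", nb_groups);
    return 0;
}
```

Each thread in the process then submits work only to its stage's stream, so a heavy inference job cannot starve pre- or post-processing of SMs.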

Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.