I am using MPS (Multi-Process Service) and was able to create a context for each process and assign a specific number of SMs (Streaming Multiprocessors) to each context using this code:
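CUexecAffinityParam affinity;
CUcontext context;
affinity.type = CU_EXEC_AFFINITY_TYPE_SM_COUNT;
affinity.param.smCount.val = 50;  // request 50 SMs for this context
cuCtxCreate_v3(&context, &affinity, 1, 0, 0);  // args: params array, numParams = 1, flags = 0, dev = 0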
But with this code I can set the number of SMs only once, when the context is created. How can I update this parameter after the context is created? I’m looking for the opposite of what cuCtxGetExecAffinity does; in other words, I need a cuCtxSetExecAffinity function (which does not exist).
At the current time, it’s not possible to change the active thread percentage of a context after it is created. See here.
This link mentions only CUDA_MPS_ACTIVE_THREAD_PERCENTAGE. Does the same apply to CU_EXEC_AFFINITY_TYPE_SM_COUNT? Are these two settings the same?
If a context with execution affinity is created at kernel launch time, the user will observe a sudden increase in latency and memory footprint as a result of the context creation. To avoid paying the latency of context creation and the abrupt increase in memory usage at kernel launch time, it is recommended that users create a pool of contexts with different SM partitions upfront and select context with the suitable SM partition on kernel launch:
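For illustration, the "select a context from the pool" step might look like the following minimal sketch. It covers only the selection logic and does not call the CUDA driver. The `ctx_slot` type and the `pick_slot` function are hypothetical names, not part of any CUDA API. The sketch assumes each pool entry was created once at startup (for example with cuCtxCreate_v3, as in the question) and that the SM count actually granted was read back with cuCtxGetExecAffinity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool entry (not a CUDA type). In real code `ctx` would hold
 * a CUcontext created once at startup with an SM-count execution affinity. */
typedef struct {
    void *ctx;             /* placeholder for the pre-created context handle */
    unsigned int sm_count; /* SM count granted to that context */
} ctx_slot;

/* Return the index of the smallest partition that has at least `wanted_sms`
 * SMs. If no partition is large enough, fall back to the largest one.
 * Returns -1 when the pool is empty. */
static int pick_slot(const ctx_slot *pool, size_t n, unsigned int wanted_sms)
{
    int best = -1;    /* smallest partition that satisfies the request */
    int largest = -1; /* fallback: largest partition seen so far */

    assert(pool != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        if (largest < 0 || pool[i].sm_count > pool[largest].sm_count)
            largest = (int)i;
        if (pool[i].sm_count >= wanted_sms &&
            (best < 0 || pool[i].sm_count < pool[best].sm_count))
            best = (int)i;
    }
    return best >= 0 ? best : largest;
}
```

Before a kernel launch, the caller would make the chosen context current (for example with cuCtxSetCurrent) instead of creating a new context. The "smallest partition that fits" policy is only one possible choice.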
I think it should be fairly evident from that, that once a context is created, the SM count cannot be updated on that context.
This particular paragraph suggests using a context pool to avoid context creation overhead. But it does not claim that the SM count cannot be changed after a context is created.
It is clear to me now that CUDA_MPS_ACTIVE_THREAD_PERCENTAGE cannot be changed after a context is created. The question is whether the same holds for CU_EXEC_AFFINITY_TYPE_SM_COUNT. Do the two use different mechanisms, apart from one taking a percentage and the other a count?