Hi,
Memory usage may differ across platforms due to architecture differences.
You can check this API for setting resource limits on the CUDA context:
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g0651954dfb9788173e60a9af7201e65a
[i]--------------------------------------------------------------------
CUresult cuCtxSetLimit ( CUlimit limit, size_t value )
Set resource limits.
Parameters
limit - Limit to set
value - Size of limit
Returns
CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_UNSUPPORTED_LIMIT, CUDA_ERROR_OUT_OF_MEMORY, CUDA_ERROR_INVALID_CONTEXT
Description
Setting limit to value is a request by the application to update the current limit maintained by the context. The driver is free to modify the requested value to meet h/w requirements (this could be clamping to minimum or maximum values, rounding up to nearest element size, etc). Note that the CUDA driver will set the limit to the maximum of value and what the kernel function requires. The application can use cuCtxGetLimit() to find out exactly what the limit has been set to.[/i]
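As a minimal sketch of how this is typically used (the 64 MB heap size here is just an illustrative value, and error handling is abbreviated), you request a limit with cuCtxSetLimit() and then read back the value the driver actually applied with cuCtxGetLimit(), since the driver may clamp or round your request:

```c
#include <cuda.h>
#include <stdio.h>

int main(void) {
    CUdevice dev;
    CUcontext ctx;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* Request a 64 MB device-side malloc heap (illustrative value);
       the driver is free to clamp or round this. */
    size_t requested = 64 * 1024 * 1024;
    CUresult rc = cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, requested);
    if (rc != CUDA_SUCCESS) {
        fprintf(stderr, "cuCtxSetLimit failed: %d\n", (int)rc);
    }

    /* Read back what the driver actually set the limit to. */
    size_t actual = 0;
    cuCtxGetLimit(&actual, CU_LIMIT_MALLOC_HEAP_SIZE);
    printf("malloc heap limit: requested %zu, actual %zu\n",
           requested, actual);

    cuCtxDestroy(ctx);
    return 0;
}
```

Note that the applied value can legitimately differ from the requested one, which is why comparing the cuCtxGetLimit() readback across platforms is the reliable way to explain memory-usage differences.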
Thanks.