I am trying to optimize my code using
After profiling with Nsight Systems, it appears that these operations are using the local memory pool.
cudaMemPoolTrimTo is able to release the localMemoryPoolSize, but the localMemoryPoolUtilizedSize continues to increase.
My question is, “What does
As an experiment, I checked GPU memory in the resource monitor of the Task Manager. There I did not observe a continuous increase in memory usage like the one localMemoryPoolUtilizedSize shows in the profiling results.
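
For reference, here is a minimal sketch of how the pool could be queried directly from the runtime API, to compare against the values Nsight Systems reports. It assumes the allocations go through the stream-ordered allocator (cudaMallocAsync / cudaFreeAsync) on the device's default memory pool, and CUDA 11.3 or newer for the pool attributes. The 64 MiB size, the loop count, and the helper name printPool are made up for illustration. Whether cudaMemPoolAttrReservedMemCurrent and cudaMemPoolAttrUsedMemCurrent correspond to localMemoryPoolSize and localMemoryPoolUtilizedSize is my guess, not something I have confirmed.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical helper: print the pool's reserved and used byte counts.
static void printPool(cudaMemPool_t pool, const char* label) {
    cuuint64_t reserved = 0, used = 0, usedHigh = 0;
    CHECK(cudaMemPoolGetAttribute(pool, cudaMemPoolAttrReservedMemCurrent, &reserved));
    CHECK(cudaMemPoolGetAttribute(pool, cudaMemPoolAttrUsedMemCurrent, &used));
    CHECK(cudaMemPoolGetAttribute(pool, cudaMemPoolAttrUsedMemHigh, &usedHigh));
    printf("%-16s reserved=%llu used=%llu usedHigh=%llu\n", label,
           (unsigned long long)reserved, (unsigned long long)used,
           (unsigned long long)usedHigh);
}

int main() {
    const int device = 0;                 // assumption: first GPU
    const size_t bytes = 64u << 20;       // assumption: 64 MiB per allocation
    CHECK(cudaSetDevice(device));

    cudaMemPool_t pool;
    CHECK(cudaDeviceGetDefaultMemPool(&pool, device));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    for (int i = 0; i < 4; ++i) {
        void* p = nullptr;
        CHECK(cudaMallocAsync(&p, bytes, stream));
        printPool(pool, "after alloc");

        CHECK(cudaFreeAsync(p, stream));
        CHECK(cudaStreamSynchronize(stream));  // make sure the free has completed
        printPool(pool, "after free+sync");

        // Ask the pool to release unused memory, keeping at least 0 bytes reserved.
        CHECK(cudaMemPoolTrimTo(pool, 0));
        printPool(pool, "after trim");
    }

    CHECK(cudaStreamDestroy(stream));
    return 0;
}
```

If the "used" value returns to zero after each free while the profiler counter keeps climbing, that would suggest the profiler value is cumulative or a high-water mark rather than the current usage, but that is only a hypothesis to test.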