I am trying to allocate pinned host memory and de-allocate it in a two-GPU setup, running 5 iterations. One GPU shows its ‘Shared Memory’ usage going up and down as expected, but the other GPU's usage only grows. The simple reproduction code is shown below: in Task Manager, the “shared memory” of GPU 0 goes up and down, but the “shared memory” of GPU 1 keeps increasing until the program crashes. I am using Windows 10, Visual Studio 2019, and CUDA v10.1.
The graphic cards are the same, properties are shown below.
Properties of GPU 0 : name= GeForce GTX 1080 Ti, uuid= 1484675040, major= 6, minor= 1, integrated= 0, canMapHostMemory= 1, managedMemory= 1, memoryClockRate= 5505000, memoryBusWidth= 352, sharedMemPerBlockOptin= 49152, sharedMemPerBlock= 49152, sharedMemPerMultiprocessor= 98304, computeMode= 0, ECCEnabled= 0, tccDriver= 0, deviceOverlap= 1, asyncEngineCount= 2, unifiedAddressing= 1, globalL1CacheSupported= 1, localL1CacheSupported= 1, isMultiGpuBoard= 0, multiGpuBoardGroupID = 0, hostNativeAtomicSupported= 0, pageableMemoryAccess= 0, pageableMemoryAccessUsesHostPageTables= 0, concurrentManagedAccess= 0, computePreemptionSupported= 1, canUseHostPointerForRegisteredMem= 0, directManagedMemAccessFromHost= 0
Properties of GPU 1 : name= GeForce GTX 1080 Ti, uuid= 1484675040, major= 6, minor= 1, integrated= 0, canMapHostMemory= 1, managedMemory= 1, memoryClockRate= 5505000, memoryBusWidth= 352, sharedMemPerBlockOptin= 49152, sharedMemPerBlock= 49152, sharedMemPerMultiprocessor= 98304, computeMode= 0, ECCEnabled= 0, tccDriver= 0, deviceOverlap= 1, asyncEngineCount= 2, unifiedAddressing= 1, globalL1CacheSupported= 1, localL1CacheSupported= 1, isMultiGpuBoard= 0, multiGpuBoardGroupID = 1, hostNativeAtomicSupported= 0, pageableMemoryAccess= 0, pageableMemoryAccessUsesHostPageTables= 0, concurrentManagedAccess= 0, computePreemptionSupported= 1, canUseHostPointerForRegisteredMem= 0, directManagedMemAccessFromHost= 0
The project is built with Visual Studio 2019 targeting the x64 platform. CUDA compilation is invoked as follows:
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc.exe" -gencode=arch=compute_30,code="sm_30,compute_30" -gencode=arch=compute_35,code="sm_35,compute_35" -gencode=arch=compute_37,code="sm_37,compute_37" -gencode=arch=compute_50,code="sm_50,compute_50" -gencode=arch=compute_52,code="sm_52,compute_52" -gencode=arch=compute_60,code="sm_60,compute_60" -gencode=arch=compute_61,code="sm_61,compute_61" -gencode=arch=compute_70,code="sm_70,compute_70" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.21.27702\bin\HostX86\x64" -x cu -I..\E57ToPly -I..\mapmap -I..\rayint -I..\mve -I"mvs-texturing" -I..\eigen -I..\libpng -I..\tbb\include -IC:\dev\boost_1_71_0 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -I"C:\Program Files\NVIDIA Corporation\NvToolsExt\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -DCPU_REPLACE_GPU_WHEN_RAM_LACK -DNOMINMAX -DNDEBUG -DBOOST_FILESYSTEM_NO_DEPRECATED -D_WINDLL -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W3 /nologo /Ox /Fdx64\Release\vc142.pdb /Zi /MD " -o x64\Release\x.cu.obj "x.cu"
main()
{
for (int ii = 0; ii < 5; ++ii)
{
int gpuIndex = 0;
gpuErrchk(cudaSetDevice(gpuIndex));
std::vector<uint32_t*> textures;
textures.resize(50);
for (int i = 0; i < textures.size(); ++i)
{
gpuErrchk(cudaHostAlloc((void**)& textures[i], 4096 * 4096 * sizeof(uint32_t), cudaHostAllocDefault));
}
// set GPU 1 and get memory info
gpuIndex = 1;
gpuErrchk(cudaSetDevice(gpuIndex));
size_t freebyte;
size_t totalbyte;
cudaError_t cuda_status;
cuda_status = cudaMemGetInfo(&freebyte, &totalbyte); // This causes "shared memory" GPU 1 up and up
// back to GPU 0
gpuIndex = 0;
gpuErrchk(cudaSetDevice(gpuIndex));
for (int i = 0; i < textures.size(); ++i)
{
gpuErrchk(cudaFreeHost(textures[i]));
}
batchHighTextures.clear();
}
auto cudaStatus = cudaDeviceReset();
return;
}