cuMemCreate produces "NVMAP_IOC_GET_FD failed: Bad address" error

woosungkang · September 4, 2024, 8:31am

Hello everyone,

I’m encountering an unexpected error when using cuMemCreate on my system. Below are my system details:

System Environment:

Hardware: Nvidia Jetson Orin Nano
Software:
- Output of /etc/nv_tegra_release:

# R36 (release), REVISION: 3.0, GCID: 36923193, BOARD: generic, EABI: aarch64, DATE: Fri Jul 19 23:24:25 UTC 2024
# KERNEL_VARIANT: oot
TARGET_USERSPACE_LIB_DIR=nvidia  
TARGET_USERSPACE_LIB_DIR_PATH=usr/lib/aarch64-linux-gnu/nvidia

CUDA version: CUDA 12.2

Problem:

When I create multiple memory handles with cuMemCreate, I get the following error once the number of handles exceeds 900. (I used a chunk size of 2 MB and tried to allocate 2 GB, which means 1024 handles.)

NVMAP_IOC_GET_FD failed: Bad address

Code Snippet:

Here’s the partial code where the error occurs:

std::vector<int> prepareGpuMemory(CUdeviceptr& d_ptr, size_t& total_chunks) {
    size_t total_size = TOTAL_ALLOC_SIZE;  // Predefined allocation size (1GB)
    total_chunks = (total_size + CHUNK_SIZE - 1) / CHUNK_SIZE;

    CUDA_CHECK(cuMemAddressReserve(&d_ptr, total_size, 0, 0, 0));

    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = 0;
    prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR;

    handles.resize(total_chunks);  // Initialize the global 'handles' vector
    std::vector<int> shareableHandles(total_chunks);

    std::cout << "[INFO] Allocating and exporting " << total_chunks 
              << " memory chunks (" << TOTAL_ALLOC_SIZE / (1024 * 1024) 
              << " MB in total)." << std::endl;

    for (size_t i = 0; i < total_chunks; ++i) {
        CUDA_CHECK(cuMemCreate(&handles[i], CHUNK_SIZE, &prop, 0));  // Assign to global 'handles'
        CUDA_CHECK(cuMemExportToShareableHandle(&shareableHandles[i], handles[i], CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR, 0));

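        // Progress update after every 10% of chunks are allocated.
        // Guard the step so fewer than 10 chunks does not cause a modulo by zero.
        const size_t progress_step = total_chunks >= 10 ? total_chunks / 10 : 1;
        if ((i + 1) % progress_step == 0 || i + 1 == total_chunks) {
            std::cout << "[INFO] " << (i + 1) << "/" << total_chunks 
                      << " chunks allocated and exported." << std::endl;
        }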
    }

    return shareableHandles;
}
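
One thing that may be worth checking, as an assumption that nothing in this thread confirms: every chunk exported with CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR yields one open file descriptor, so 1024 chunks need at least 1024 descriptors on top of whatever the process already has open. Many Linux setups default to a soft limit of 1024 open files per process, which would be consistent with a failure somewhere past 900 handles. The sketch below compares the chunk count against that limit. The helper names chunkCount, softFdLimit and fdBudgetAllows are hypothetical and not part of CUDA or the code above.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_NOFILE (POSIX)
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of chunks needed to cover total_size bytes.
inline std::size_t chunkCount(std::size_t total_size, std::size_t chunk_size) {
    assert(chunk_size > 0);
    return (total_size + chunk_size - 1) / chunk_size;
}

// Hypothetical helper: soft limit on open descriptors for this process.
// Returns RLIM_INFINITY if the query fails, i.e. the limit is treated as unknown.
inline rlim_t softFdLimit() {
    struct rlimit rl{};
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return RLIM_INFINITY;
    return rl.rlim_cur;
}

// Hypothetical helper: true if `needed` new descriptors plus `reserve`
// descriptors (stdio, device nodes, and so on) fit under `soft_limit`.
// Assumption: one descriptor per exported chunk; the driver may use more.
inline bool fdBudgetAllows(std::size_t needed, std::size_t reserve, rlim_t soft_limit) {
    if (soft_limit == RLIM_INFINITY) return true;
    return static_cast<rlim_t>(needed) + static_cast<rlim_t>(reserve) <= soft_limit;
}
```

Calling fdBudgetAllows(total_chunks, 64, softFdLimit()) before the allocation loop would flag the case where the descriptors cannot fit (the reserve of 64 is an arbitrary guess). If the check fails, raising the limit in the launching shell with ulimit -n and rerunning would show whether the descriptor limit is actually involved.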

Question:

Has anyone experienced a similar issue with cuMemCreate on Jetson platforms? Is there a limitation on the number of memory handles I can create? Could this be a driver or kernel-related issue?

Any suggestions or insights would be greatly appreciated!

Thanks in advance.

Robert_Crovella · September 4, 2024, 4:07pm

You’re going to find a lot more Jetson platform people on the Jetson forums.
