Issue with cudaMemPrefetchAsync on DRIVE Orin device

rlu1 · August 27, 2025, 12:07pm

Issue Description
I ran a simple CUDA sample on my DRIVE Orin device and tried to use cudaMemPrefetchAsync. Even though my CUDA device is detected, the cudaMemPrefetchAsync call still fails with an error. Is there something wrong in my code?

$ ./vectorAdd
Device count: 1
Using device 0: Orin (CC 8.7)
[UM VectorAdd N=50000000 (190.73 MiB per array), reps=5]
vectorAdd.cu:305 invalid device ordinal

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Print file, line, and the CUDA error string, then exit on any failure.
#define CHECK(x) do{ cudaError_t e=(x); if(e!=cudaSuccess){fprintf(stderr,"%s:%d %s\n",__FILE__,__LINE__,cudaGetErrorString(e)); exit(1);} }while(0)

int main(int argc, char** argv){
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    printf("Device count: %d\n", count);

    int dev = 0;  // Or parse from argv, then clamp to [0, count)
    CHECK(cudaSetDevice(dev));

    cudaDeviceProp prop{};
    CHECK(cudaGetDeviceProperties(&prop, dev));
    printf("Using device %d: %s (CC %d.%d)\n", dev, prop.name, prop.major, prop.minor);

    // --- problem size ---
    int N    = (argc>1)? atoi(argv[1]) : 50000000; // default 50M elements
    int reps = (argc>2)? atoi(argv[2]) : 5;
    size_t bytes = (size_t)N * sizeof(float);
    printf("[UM VectorAdd N=%d (%.2f MiB per array), reps=%d]\n",
           N, bytes/1024.0/1024.0, reps);

    // --- Unified Memory allocation ---
    float *A, *B, *C;
    CHECK(cudaMallocManaged(&A, bytes));
    CHECK(cudaMallocManaged(&B, bytes));
    CHECK(cudaMallocManaged(&C, bytes));

    // --- initialize on host ---
    for (int i=0;i<N;i++){ A[i]=rand()/(float)RAND_MAX; B[i]=rand()/(float)RAND_MAX; }

    // --- prefetch to the selected GPU (dev) ---
    CHECK(cudaMemPrefetchAsync(A, bytes, dev));
    CHECK(cudaMemPrefetchAsync(B, bytes, dev));
    CHECK(cudaMemPrefetchAsync(C, bytes, dev));
    CHECK(cudaDeviceSynchronize());

    // --- release the managed allocations ---
    CHECK(cudaFree(A));
    CHECK(cudaFree(B));
    CHECK(cudaFree(C));
    return 0;
}

SivaRamaKrishnaNV · August 28, 2025, 10:48am

Dear @rlu1 ,
Could you check whether the /usr/local/cuda/samples/1_Utilities/UnifiedMemoryPerf sample works on your device, and refer to its usage of the API?
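
One thing that may be worth checking alongside the sample: the CUDA runtime documentation for cudaMemPrefetchAsync states that when the destination is a GPU, that device's cudaDevAttrConcurrentManagedAccess attribute must be non-zero. Whether that is the cause of the "invalid device ordinal" error reported above is an assumption and is not confirmed in this thread. The following is a minimal sketch, not taken from the sample, that queries the attribute and only prefetches when it is set. The helper name prefetchIfSupported and the scale kernel are hypothetical, and the code assumes the four-argument cudaMemPrefetchAsync signature (pointer, size, device, optional stream) used in the original post.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CHECK(x) do { cudaError_t e = (x); if (e != cudaSuccess) { \
    fprintf(stderr, "%s:%d %s\n", __FILE__, __LINE__, cudaGetErrorString(e)); \
    exit(1); } } while (0)

// Hypothetical helper: prefetch managed memory to `dev` only if the device
// reports support for concurrent managed access; otherwise skip the hint.
static bool prefetchIfSupported(const void* ptr, size_t bytes, int dev) {
    int concurrent = 0;
    CHECK(cudaDeviceGetAttribute(&concurrent,
                                 cudaDevAttrConcurrentManagedAccess, dev));
    if (!concurrent) {
        printf("Device %d: concurrentManagedAccess=0, skipping prefetch\n", dev);
        return false;
    }
    CHECK(cudaMemPrefetchAsync(ptr, bytes, dev));
    return true;
}

// Hypothetical kernel used only to touch the managed buffer on the GPU.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int dev = 0;
    CHECK(cudaSetDevice(dev));

    const int n = 1 << 20;
    const size_t bytes = (size_t)n * sizeof(float);
    float* data = nullptr;
    CHECK(cudaMallocManaged(&data, bytes));
    for (int i = 0; i < n; i++) data[i] = 1.0f;  // initialize on the host

    // Prefetching is only a performance hint; the kernel runs either way.
    prefetchIfSupported(data, bytes, dev);

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // required before the host reads data

    printf("data[0] = %.1f\n", data[0]);
    CHECK(cudaFree(data));
    return 0;
}
```

Because the prefetch is a hint rather than a requirement for correctness, skipping it leaves the result unchanged; the managed buffer is still accessible from the kernel, and the host reads it only after cudaDeviceSynchronize.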

rlu1 · September 3, 2025, 7:49am

okay, thanks
