cudaMallocManaged not implemented?

DO_Ray · September 1, 2023, 12:40pm

Is cudaMallocManaged not implemented?

This may look like cross-posting, but I categorised my earlier post incorrectly. I consider the issue to be of grave importance, which is why I am trying again here.

AastaLLL · September 4, 2023, 3:52am

Hi,

Could you share more about your issue? Do you encounter any errors when using it?
cudaMallocManaged is available on Jetson.
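
To confirm this on a particular board, a minimal sketch like the one below could query the relevant device attributes. It assumes the GPU is device 0 and that the CUDA toolkit is installed; `yesNo` is a hypothetical helper used only for printing.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: turns an attribute flag into readable text.
static const char *yesNo(int flag) { return flag ? "yes" : "no"; }

int main()
{
  const int device = 0;  // assumption: the integrated GPU is device 0

  // Non-zero if the device can allocate managed (unified) memory.
  int managed = 0;
  cudaError_t err = cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, device);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // Non-zero if CPU and GPU may access managed memory at the same time.
  int concurrent = 0;
  err = cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess, device);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  std::printf("managed memory supported:  %s\n", yesNo(managed));
  std::printf("concurrent managed access: %s\n", yesNo(concurrent));
  return 0;
}
```

If the first call already fails, the problem is more likely in the driver or toolkit installation than in the unified memory API itself.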

Below is the Jetson CUDA memory detail for your reference:

Thanks.

DO_Ray · September 4, 2023, 9:01am

What I do know is that unified memory works correctly on the Jetson Nano 4GB. On the Jetson Orin Nano it does not. I tried “Unified Memory for CUDA Beginners”, because I certainly am a beginner. The result I get is a segmentation fault. Everything is fine until the square brackets, that is, the array indexing in the initialization loop.

    #include <cstdio>   // printf
    #include <iostream>
    #include <math.h>
     
    // CUDA kernel to add elements of two arrays
    __global__
    void add(int n, float *x, float *y)
    {
      int index = blockIdx.x * blockDim.x + threadIdx.x;
      int stride = blockDim.x * gridDim.x;
      for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
    }
     
    int main(void)
    {
      int N = 1<<20;
      float *x, *y;
      printf("st art");
      // Allocate Unified Memory -- accessible from CPU or GPU
      cudaMallocManaged(&x, N*sizeof(float));
      cudaMallocManaged(&y, N*sizeof(float));
     
      // initialize x and y arrays on the host
      for (int i = 0; i < N; i++) {
        // >> the segmentation fault is reported here, at the first array access
        x[i] = 1.0f;
        y[i] = 2.0f;
      }
     
      // Launch kernel on 1M elements on the GPU
      int blockSize = 256;
      int numBlocks = (N + blockSize - 1) / blockSize;
      add<<<numBlocks, blockSize>>>(N, x, y);
     
      // Wait for GPU to finish before accessing on host
      cudaDeviceSynchronize();
     
      // Check for errors (all values should be 3.0f)
      float maxError = 0.0f;
      for (int i = 0; i < N; i++)
        maxError = fmax(maxError, fabs(y[i]-3.0f));
      std::cout << "Max error: " << maxError << std::endl;
     
      // Free memory
      cudaFree(x);
      cudaFree(y);
     
      return 0;
    }
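
The tutorial code above never checks the return value of cudaMallocManaged. If an allocation fails, x and y stay uninitialized, and the first x[i] write can then crash with a segmentation fault. That would match the symptom described, although the thread does not confirm that this is what happened here. A minimal sketch with error checks follows; `cudaOk` is a hypothetical helper, not part of the CUDA API.

```cuda
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): reports a failed call and returns false.
static bool cudaOk(cudaError_t err, const char *what)
{
  if (err == cudaSuccess) return true;
  std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
  return false;
}

// Same grid-stride kernel as in the tutorial code.
__global__ void add(int n, const float *x, float *y)
{
  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int stride = blockDim.x * gridDim.x;
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}

int main()
{
  const int N = 1 << 20;
  float *x = nullptr, *y = nullptr;

  // Stop with a readable message instead of dereferencing a bad pointer.
  if (!cudaOk(cudaMallocManaged(&x, N * sizeof(float)), "cudaMallocManaged(x)"))
    return 1;
  if (!cudaOk(cudaMallocManaged(&y, N * sizeof(float)), "cudaMallocManaged(y)")) {
    cudaFree(x);
    return 1;
  }

  // Initialize on the host; only reached when both allocations succeeded.
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  const int blockSize = 256;
  const int numBlocks = (N + blockSize - 1) / blockSize;
  add<<<numBlocks, blockSize>>>(N, x, y);

  // Launch errors and execution errors are reported separately.
  bool ok = cudaOk(cudaGetLastError(), "kernel launch") &&
            cudaOk(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

  if (ok) {
    float maxError = 0.0f;
    for (int i = 0; i < N; i++)
      maxError = std::fmax(maxError, std::fabs(y[i] - 3.0f));
    std::printf("Max error: %g\n", maxError);
  }

  cudaFree(x);
  cudaFree(y);
  return ok ? 0 : 1;
}
```

With these checks, a broken installation should show up as a named CUDA error on stderr rather than as a bare segmentation fault.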

If you want to be baffled, you could compile “m4emtest.cu” from the above post: there you would get printed messages, while in the code above you would not.

I really don’t know why unified memory does not work. It seems that maybe CUDA technology on the Orin is too difficult for me; I am at the point of asking for a refund and buying the previous Nano.

AastaLLL · September 5, 2023, 5:44am

Hi,

The CUDA library versions differ: the Orin Nano uses 11.4 and the Nano uses 10.2.
But the unified memory allocation API hasn’t changed, so your app should work in both environments.
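
To see which versions a given board actually reports, a sketch like the following could print the runtime and driver versions. `majorOf` and `minorOf` are hypothetical helper names; they assume the documented encoding of 1000 * major + 10 * minor.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helpers: decode a version encoded as 1000 * major + 10 * minor.
static int majorOf(int version) { return version / 1000; }
static int minorOf(int version) { return (version % 1000) / 10; }

int main()
{
  int runtimeVersion = 0;
  cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // The driver version is reported as 0 when no driver is installed.
  int driverVersion = 0;
  err = cudaDriverGetVersion(&driverVersion);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaDriverGetVersion failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  std::printf("CUDA runtime: %d.%d\n", majorOf(runtimeVersion), minorOf(runtimeVersion));
  std::printf("CUDA driver:  %d.%d\n", majorOf(driverVersion), minorOf(driverVersion));
  return 0;
}
```

A runtime version that is newer than the driver version is one situation in which CUDA calls can fail even though the code itself is fine.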

We are going to give it a try and will update you with more information.

Thanks.

AastaLLL · September 5, 2023, 6:16am

Hi,

We tried the sample and it worked correctly.

$ nvcc m4emtest.cu -o test
$ ./test
st artMax error: 0

Thanks.

DO_Ray · September 7, 2023, 10:59am

Command ‘nvcc’ not found, did you mean:

command ‘nvlc’ from deb vlc-bin (3.0.9.2-1)
I use cmake and gcc to compile. I was missing the CUDA toolkit, but even after downloading it the result is the same.

DO_Ray · September 7, 2023, 11:16am

and setting:

export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64/${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

AastaLLL · September 13, 2023, 7:55am

Hi,

Please export the compat folder in LD_LIBRARY_PATH to run CUDA 12 on Jetson.
For example:

export PATH=/usr/local/cuda-12/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/compat

Thanks.

DO_Ray · September 14, 2023, 7:35am

Ha, that does not work either. I use cmake version 3.16.3 with
cmake_minimum_required(VERSION 2.8)
find_package(CUDA QUIET REQUIRED)
That does not work, so I tried nvcc as you suggested... the same result.

Now let's make it easy on ourselves: is there an image to download?

(https://developer.nvidia.com/embedded/downloads)
does not show it, but

(https://www.youtube.com/watch?v=VWdJ4BCtam8)

says there is.
Please supply a link to the image; then I hope all this mess is over.
Cheers, DO_Ray

DO_Ray · September 16, 2023, 4:35pm

So you can’t burn an SD card with a JetPack image for the Orin Nano developer kit. You have to use sdkmanager, and then you are not there yet: on a completely different path you have to install JetPack (again?) through apt install steps described somewhere in your data dungeon. After that, I can tell you I don’t get a segmentation fault anymore, but I lost some data about Fourier series and epicycles that I can no longer find on GitHub.

If you want to help someone, please ask the appropriate questions and organise the information around the question “How do I start up the Orin Nano?”. Give me a choice of what to install; I now have a whole lot of libraries for deep learning and OpenCV that I am not ready for yet. I just want to get the hang of CUDA.

Thanks for reading this

AastaLLL · September 18, 2023, 5:50am

Hi,

The SD card image should provide a similar environment.
Could you share which image you are using?

Thanks.

DO_Ray · September 19, 2023, 9:41am

I don’t have a separate SD card image; see my earlier text. Everything has to be coordinated through sdkmanager:
sdkmanager version 1.9.3.10904
electron 13.6.9
chrome 91.0.4472.164
node.js 14.16.0
ubuntu 20.04.6 LTS
arch x86_64
jetpack 5.1.2
although uname says:
Linux ubuntu 5.10.120-tegra #1 SMP PREEMPT Tue Aug 1 12:32:50 PDT 2023 aarch64 aarch64 aarch64 GNU/Linux

After the initial sdkmanager run, I read that I should install JetPack (again?):

apt install nvidia-jetpack
apt update
apt dist-upgrade

AastaLLL · September 20, 2023, 6:00am

Hi,

Sorry for the unclear statement.
Hopefully the following explains the setup more clearly:

1. Setup
There are two steps: flashing the OS and installing the components.
Flash: you can use SDK Manager or initrd flash (write the image to the microSD).
Components: you can use SDK Manager or OTA installation (apt-get install nvidia-jetpack).

So to get an environment with all the components set up, there are several ways:
- Flash with SDK Manager and install components with SDK Manager
- Flash via SD card and install components with SDK Manager
- Flash via SD card and install components through OTA

2. CUDA
The default CUDA in JetPack 5.1.2 is 11.4.
If you manually upgrade CUDA to a newer version with this link, please export the compat folder to ensure compatibility.
Please note that you need to use the package for aarch64-jetson.
Other Debian packages will not work on the Jetson platform.

3. NVCC
nvcc is under /usr/local/cuda/bin/.
If it cannot be found by default, please export the following two environment variables:

export PATH=/usr/local/cuda-11.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH

Thanks.
