cudaMallocManaged not implemented?


This may look like cross-posting, but I filed my earlier post in the wrong category. I consider the issue serious, which is why I am posting it again here.

Hi,

Could you share more about your issue? Do you encounter any errors when using it?
cudaMallocManaged is available on Jetson.

Below is the Jetson CUDA memory detail for your reference:

Thanks.

What I do know is that unified memory works correctly on the Jetson Nano 4GB. On the Jetson Orin Nano it does not. I tried “Unified Memory for CUDA Beginners” because I certainly am a beginner. The result I get is “segmentation fault”. Everything is fine up to the square brackets, that is, the first array access.

    #include <cstdio>    // printf
    #include <iostream>
    #include <math.h>
     
    // CUDA kernel to add elements of two arrays
    __global__
    void add(int n, float *x, float *y)
    {
      int index = blockIdx.x * blockDim.x + threadIdx.x;
      int stride = blockDim.x * gridDim.x;
      for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
    }
     
    int main(void)
    {
      int N = 1<<20;
      float *x, *y;
      printf("st art");
      // Allocate Unified Memory -- accessible from CPU or GPU
      cudaMallocManaged(&x, N*sizeof(float));
      cudaMallocManaged(&y, N*sizeof(float));
     
      // initialize x and y arrays on the host
      for (int i = 0; i < N; i++) {
        // >> the segmentation fault occurs here, at the first array access
        x[i] = 1.0f;
        y[i] = 2.0f;
      }
     
      // Launch kernel on 1M elements on the GPU
      int blockSize = 256;
      int numBlocks = (N + blockSize - 1) / blockSize;
      add<<<numBlocks, blockSize>>>(N, x, y);
     
      // Wait for GPU to finish before accessing on host
      cudaDeviceSynchronize();
     
      // Check for errors (all values should be 3.0f)
      float maxError = 0.0f;
      for (int i = 0; i < N; i++)
        maxError = fmax(maxError, fabs(y[i]-3.0f));
      std::cout << "Max error: " << maxError << std::endl;
     
      // Free memory
      cudaFree(x);
      cudaFree(y);
     
      return 0;
    }

If you want to be baffled, compile “m4emtest.cu” from the earlier post: that one prints its messages, while the code above prints nothing.

I really don’t know why unified memory does not work. It seems that CUDA on the Orin may be too difficult for me; I am at the point of returning it for a refund and buying the previous Nano.

Hi,

The CUDA library versions differ: Orin Nano ships CUDA 11.4 and Nano ships CUDA 10.2.
But the unified memory allocation API has not changed, so your app should work in both environments.
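
One way to narrow this down is to check the return value of every CUDA call, since the tutorial code ignores them. If cudaMallocManaged fails, the pointers are never set and the first host write through them can crash. Below is a minimal sketch of that idea; the CHECK macro is a hypothetical helper written for this example, not part of the CUDA API.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): stop with a readable message
// when a runtime call does not return cudaSuccess.
#define CHECK(call)                                                  \
  do {                                                               \
    cudaError_t err_ = (call);                                       \
    if (err_ != cudaSuccess) {                                       \
      fprintf(stderr, "%s failed at %s:%d: %s\n", #call, __FILE__,   \
              __LINE__, cudaGetErrorString(err_));                   \
      exit(EXIT_FAILURE);                                            \
    }                                                                \
  } while (0)

// Same grid-stride kernel as in the tutorial code.
__global__ void add(int n, const float *x, float *y) {
  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int stride = blockDim.x * gridDim.x;
  for (int i = index; i < n; i += stride) y[i] = x[i] + y[i];
}

int main() {
  const int N = 1 << 20;
  float *x = nullptr, *y = nullptr;

  // If either allocation fails, the program reports why and stops
  // before any x[i] access can dereference an invalid pointer.
  CHECK(cudaMallocManaged(&x, N * sizeof(float)));
  CHECK(cudaMallocManaged(&y, N * sizeof(float)));

  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  int blockSize = 256;
  int numBlocks = (N + blockSize - 1) / blockSize;
  add<<<numBlocks, blockSize>>>(N, x, y);
  CHECK(cudaGetLastError());       // errors from the launch itself
  CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

  printf("y[0] = %f\n", y[0]);

  CHECK(cudaFree(x));
  CHECK(cudaFree(y));
  return 0;
}
```

If the allocation is the failing step, the message from cudaGetErrorString should say so directly, which is easier to act on than a bare segmentation fault.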

We are going to give it a try and will update you with more information.

Thanks.

Hi,

We tried the sample and it worked correctly.

$ nvcc m4emtest.cu -o test
$ ./test
st artMax error: 0

Thanks.

Command ‘nvcc’ not found, did you mean:

command ‘nvlc’ from deb vlc-bin (3.0.9.2-1)
I use cmake and gcc to compile. I was missing the CUDA toolkit, but even after downloading it and setting the following, the result is the same:

export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64/${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Hi,

Please export the compat folder to run CUDA 12 on Jetson.
For example:

export PATH=/usr/local/cuda-12/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/compat

Thanks.

Ha, that does not work either. I use cmake version 3.16.3 with
cmake_minimum_required(VERSION 2.8)
find_package(CUDA QUIET REQUIRED)
That does not work, so I tried nvcc as you suggested, with the same result.

Now let's make it easy on ourselves: is there an image to download?

(https://developer.nvidia.com/embedded/downloads)
does not show it, but

(https://www.youtube.com/watch?v=VWdJ4BCtam8)

says there is.
Please supply a link to the image; then I hope all this mess is over.
Cheers, DO_Ray

So you can’t burn an SD card with a JetPack image for the Orin Nano developer kit. You have to use sdkmanager, and even then you are not there yet: on a completely different path you have to install JetPack (again?) through the apt install steps described somewhere in your data dungeon. After that, I can tell you, I no longer get a segmentation fault, but I lost some data about Fourier series and epicycles that I cannot find on GitHub anymore.

If you want to help someone, please ask the appropriate questions and organize the information around the question "How to start up the Orin Nano". Give me a choice of what to install; I now have a whole lot of libraries for deep learning and OpenCV that I am not ready for yet. I just want to get the hang of CUDA.

Thanks for reading this

Hi,

The SD card image should provide a similar environment.
Could you share which image you are using?

Thanks.

I don’t have a standalone SD card image (see my earlier post); everything has to be coordinated through sdkmanager:
sdkmanager version 1.9.3.10904
electron 13.6.9
chrome 91.0.4472.164
node.js 14.16.0
ubuntu 20.04.6 LTS
arch x86_64
jetpack 5.1.2
although uname says:
Linux ubuntu 5.10.120-tegra #1 SMP PREEMPT Tue Aug 1 12:32:50 PDT 2023 aarch64 aarch64 aarch64 GNU/Linux

After the initial sdkmanager run, I read that I should install JetPack (again?):

apt install nvidia-jetpack
apt update
apt dist-upgrade

Hi,

Sorry for the unclear statement.
I hope the following explains the setup more clearly:

1. Setup
There are two steps: flash the OS and install the components.
Flash: you can use SDK Manager or initrd flash (write the image to the microSD).
Components: you can use SDK Manager or OTA installation (apt-get install nvidia-jetpack).

So to get an environment with all the components set up, there are several ways:
- Flash with SDK Manager and install components with SDK Manager
- Flash via SD card and install components with SDK Manager
- Flash via SD card and install components through OTA

2. CUDA
The default CUDA in JetPack 5.1.2 is 11.4.
If you manually upgrade CUDA to a newer version with this link, please export the compat folder to ensure compatibility.
Please note that you will need to use the package for aarch64-jetson.
Other Debian packages will not work on the Jetson platform.
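
To see which versions an application actually picks up, a small program can query the runtime it was built against and the highest CUDA version the loaded driver supports. This is only a sketch for diagnosis; the file name and the wording of the warning are assumptions made for this example, while cudaRuntimeGetVersion and cudaDriverGetVersion are standard CUDA runtime calls.

```cuda
// versions.cu (hypothetical file name); build with: nvcc versions.cu -o versions
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int runtimeVersion = 0;
  int driverVersion = 0;

  cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaRuntimeGetVersion: %s\n", cudaGetErrorString(err));
    return 1;
  }

  err = cudaDriverGetVersion(&driverVersion);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaDriverGetVersion: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // Versions are encoded as 1000 * major + 10 * minor, e.g. 11040 for 11.4.
  printf("CUDA runtime: %d.%d\n", runtimeVersion / 1000,
         (runtimeVersion % 1000) / 10);
  printf("CUDA driver supports up to: %d.%d\n", driverVersion / 1000,
         (driverVersion % 1000) / 10);

  // A runtime newer than what the driver supports is a likely sign that
  // the compat folder is not on LD_LIBRARY_PATH.
  if (driverVersion < runtimeVersion) {
    printf("Warning: runtime is newer than the driver supports.\n");
  }
  return 0;
}
```

Running it once with and once without the compat folder exported should show whether the newer toolkit is really being matched by a compatible driver library.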

3. NVCC
nvcc is under /usr/local/cuda/bin/.
If it is not found by default, please export the following two environment variables:

export PATH=/usr/local/cuda-11.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH

Thanks.