cudaMallocManaged not implemented ?
This may look like cross-posting, but I put the earlier post in the wrong category. I consider the issue gravely important, which is why I am trying again here.
Hi,
Could you share more about your issue? Do you encounter any errors when using it?
cudaMallocManaged is available on Jetson.
Below is the Jetson CUDA memory detail for your reference:
Thanks.
What I do know is that unified memory works correctly on the Jetson Nano 4GB. On the Jetson Orin Nano it does not. I tried “Unified Memory for CUDA Beginners”, because I certainly am a beginner. The result I get is “segmentation fault”. Everything is fine until the square brackets.
#include <cmath>
#include <cstdio>   // printf
#include <iostream>

// CUDA kernel to add elements of two arrays
__global__
void add(int n, float *x, float *y)
{
  // Grid-stride loop: each thread handles elements index, index + stride, ...
  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int stride = blockDim.x * gridDim.x;
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}

int main(void)
{
  int N = 1<<20;
  float *x, *y;
  printf("st art");

  // Allocate Unified Memory -- accessible from CPU or GPU
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Launch kernel on 1M elements on the GPU
  int blockSize = 256;
  int numBlocks = (N + blockSize - 1) / blockSize;
  add<<<numBlocks, blockSize>>>(N, x, y);

  // Wait for GPU to finish before accessing on host
  cudaDeviceSynchronize();

  // Check for errors (all values should be 3.0f)
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = fmax(maxError, fabs(y[i]-3.0f));
  std::cout << "Max error: " << maxError << std::endl;

  // Free memory
  cudaFree(x);
  cudaFree(y);
  return 0;
}
If you want to be baffled, compile “m4emtest.cu” from the above post: there you get printed messages, while in the program above you do not.
I really don’t know why unified memory does not work. It seems that CUDA on the Orin may be too difficult for me; I am at the point of asking for a refund and buying the previous Nano.
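One way to narrow down a crash like this: the listing ignores the return value of cudaMallocManaged, so if the allocation fails, the pointer is never set and the first x[i] access can segfault. The sketch below is the same program with every runtime call checked. The CHECK_CUDA macro is a hypothetical helper introduced here for illustration, not part of the original sample or of the CUDA toolkit.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original listing): print the CUDA
// error string and exit if a runtime call does not return cudaSuccess.
#define CHECK_CUDA(call)                                        \
  do {                                                          \
    cudaError_t err_ = (call);                                  \
    if (err_ != cudaSuccess) {                                  \
      std::fprintf(stderr, "%s failed: %s\n", #call,            \
                   cudaGetErrorString(err_));                   \
      std::exit(EXIT_FAILURE);                                  \
    }                                                           \
  } while (0)

// Same grid-stride kernel as in the listing above.
__global__ void add(int n, const float *x, float *y)
{
  int index = blockIdx.x * blockDim.x + threadIdx.x;
  int stride = blockDim.x * gridDim.x;
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}

int main()
{
  const int N = 1 << 20;
  float *x = nullptr, *y = nullptr;

  // If either allocation fails, the program stops here with a message
  // instead of dereferencing an invalid pointer in the loop below.
  CHECK_CUDA(cudaMallocManaged(&x, N * sizeof(float)));
  CHECK_CUDA(cudaMallocManaged(&y, N * sizeof(float)));

  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  const int blockSize = 256;
  const int numBlocks = (N + blockSize - 1) / blockSize;
  add<<<numBlocks, blockSize>>>(N, x, y);
  CHECK_CUDA(cudaGetLastError());       // reports kernel launch errors
  CHECK_CUDA(cudaDeviceSynchronize());  // reports errors during execution

  std::printf("y[0] = %f\n", y[0]);

  CHECK_CUDA(cudaFree(x));
  CHECK_CUDA(cudaFree(y));
  return 0;
}
```

If the first CHECK_CUDA line prints an error, the problem lies in the CUDA installation or driver rather than in the unified memory code itself.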
Hi,
The CUDA library version is different: Orin Nano has 11.4 and Nano has 10.2.
But the unified memory allocation API has not changed, so your app should work in both environments.
We are going to give it a try and will update you with more information.
Thanks.
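To see which CUDA version a board is actually running, and whether the device reports managed memory support, a small query program can help. This is a minimal sketch that assumes device 0 is the Jetson's integrated GPU; it uses only cudaRuntimeGetVersion, cudaDriverGetVersion and cudaDeviceGetAttribute from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
  int runtimeVersion = 0;
  int driverVersion = 0;
  int managed = 0;

  cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
  if (err == cudaSuccess)
    err = cudaDriverGetVersion(&driverVersion);
  if (err == cudaSuccess)
    // Assumption: device 0 is the GPU of interest.
    err = cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, 0);

  if (err != cudaSuccess) {
    std::fprintf(stderr, "CUDA query failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // Versions are encoded as 1000 * major + 10 * minor (11040 means 11.4).
  std::printf("runtime %d.%d, driver %d.%d, managed memory: %s\n",
              runtimeVersion / 1000, (runtimeVersion % 1000) / 10,
              driverVersion / 1000, (driverVersion % 1000) / 10,
              managed ? "yes" : "no");
  return 0;
}
```

A failure at this stage, before any allocation is attempted, would point at the toolkit or driver setup.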
Hi,
We tried the sample and it worked correctly.
$ nvcc m4emtest.cu -o test
$ ./test
st artMax error: 0
Thanks.
Command ‘nvcc’ not found, did you mean:
command ‘nvlc’ from deb vlc-bin (3.0.9.2-1)
I use cmake and gcc to compile. I was missing the CUDA toolkit, but even after downloading it and setting the following, the result is the same:
export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64/${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Hi,
Please export the compat folder to run CUDA 12 on Jetson
For example:
export PATH=/usr/local/cuda-12/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/compat
Thanks.
Ha, that does not work either. I use cmake version 3.16.3, with
cmake_minimum_required(VERSION 2.8)
find_package(CUDA QUIET REQUIRED)
That does not work, so I tried nvcc as you suggested ... with the same result.
Now let’s make it easy on ourselves: is there an image to download?
(https://developer.nvidia.com/embedded/downloads)
does not show one, but
(https://www.youtube.com/watch?v=VWdJ4BCtam8)
says there is.
Please supply a link to the image; then I hope this whole mess is over.
cheers DO_Ray
So you can’t burn an SD card with a JetPack image for the Orin Nano developer kit. You have to use sdkmanager, and then you are not there yet: on a completely different path you have to install JetPack (again?) through apt install steps described somewhere in your data dungeon. After that, I can tell you, I don’t get a segmentation fault anymore, but I lost some data about Fourier series and epicycles that I cannot find on GitHub anymore.
If you want to help someone, please ask the appropriate questions and organize the information around the question “How do I start up the Orin Nano?”. Give me a choice of what to install; I now have a whole lot of libraries for deep learning and OpenCV that I am not ready for yet. I just want to get the hang of CUDA.
Thanks for reading this
Hi,
SD card image should provide a similar environment.
Could you share which image you are using?
Thanks.
I don’t have a separate SD card image (see my earlier text); everything must be coordinated through sdkmanager version 1.9.3.10904
electron 13.6.9
chrome 91.0.4472.164
node.js 14.16.0
ubuntu 20.04.6 LTS
arch x86_64
jetpack 5.1.2
although uname says:
Linux ubuntu 5.10.120-tegra #1 SMP PREEMPT Tue Aug 1 12:32:50 PDT 2023 aarch64 aarch64 aarch64 GNU/Linux
After the initial sdkmanager run, I read that I had to install JetPack (again?):
apt install nvidia-jetpack
apt update
apt dist-upgrade
Hi,
Sorry for the unclear statement.
Hopefully the following explains the setup more clearly:
1. Setup
There are two steps: flash the OS and install the components.
Flash: you can use SDK Manager or initrd flash (write the image to the microSD card)
Components: you can use SDK Manager or OTA installation (apt-get install nvidia-jetpack)
So there are several ways to get an environment with all the components set up:
- Flash with SDK Manager and install components with SDK Manager
- Flash via SD card and install components with SDK Manager
- Flash via SD card and install components through OTA
2. CUDA
The default CUDA in JetPack 5.1.2 is 11.4.
If you manually upgrade CUDA to a newer version with this link, please export the compat folder to ensure compatibility.
Please note that you will need to use the package for aarch64-jetson.
Other Debian packages will not work on the Jetson platform.
3. NVCC
nvcc is under /usr/local/cuda/bin/.
If it is not found by default, please export the two environment variables below:
export PATH=/usr/local/cuda-11.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH
Thanks.