Has anyone tested CuPy on a Jetson Nano or Xavier NX? I mean, is CuPy really worthwhile? I ran some tests a while ago on my laptop and CuPy was not faster than NumPy.
I did experiments with CuPy trying to speed up some audio processing where a lot of FFT is involved.
It worked, and the speed increase was noticeable. You need to check all your code in order to avoid frequently shifting data between CPU and GPU.
In my specific case a lot of changes were necessary: not only NumPy but other math libraries like SciPy and librosa were involved, and you have to rewrite that code and replace it with NumPy-style functionality that CuPy can mirror (a rough sketch follows below).
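A minimal sketch of the pattern I mean: move the data to the GPU once, keep it there through the FFT chain, and copy only the result back (the random buffer here is just a stand-in for real audio data):

import numpy as np
import cupy as cp

# Stand-in for a decoded audio buffer (4 seconds at 48 kHz)
signal = np.random.randn(4 * 48000).astype(np.float32)

# Move the data to the GPU once...
gpu_signal = cp.asarray(signal)

# ...do all the FFT-heavy work there (cp.fft mirrors np.fft)...
spectrum = cp.fft.rfft(gpu_signal)
magnitude = cp.abs(spectrum)

# ...and copy only the final result back to the CPU.
result = cp.asnumpy(magnitude)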
Jetson Nano/TX1/TX2/Xavier already support CUDA mapped memory and CUDA managed memory, where no CPU/GPU copy is required (because the CPU and GPU share the same physical memory). For example, in jetson-inference I always use the cudaAllocMapped() wrapper, which allocates shared CPU/GPU memory, and as a result I never need to call cudaMemcpy().
However, it seems to be a limitation of CuPy that it doesn't support these types of memory, so it does the copy anyway.
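For illustration, here is roughly what that zero-copy pattern looks like from Python, assuming the jetson.utils bindings that ship with jetson-inference (cudaAllocMapped()/cudaToNumpy() as named in its docs; exact arguments may vary by version):

import jetson.utils

# Allocate CUDA mapped (zero-copy) memory -- on Jetson the same
# physical buffer is visible to both the CPU and the GPU.
img = jetson.utils.cudaAllocMapped(width=640, height=480, format='rgb32f')

# Map it as a NumPy array. This is a view of the shared buffer,
# not a copy, so CPU-side writes are immediately visible to the GPU.
array = jetson.utils.cudaToNumpy(img)
array[:] = 0.5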
That is performing an explicit memory copy between CPU and GPU. Typically a GPU has its own discrete memory because it is hooked up via PCIe, so it would require this memory copy from system RAM. However, on Jetson all of the memory is shared between CPU and GPU. So if the memory is allocated as 'CUDA mapped memory' (aka zero-copy) or 'CUDA managed' memory, you don't need to do the memory copies. But unfortunately I can't see where CuPy supports allocation of CUDA mapped or CUDA managed memory.
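For what it's worth, depending on your CuPy version there may be a way to opt in to managed memory: recent releases document a cupy.cuda.malloc_managed allocator. A hedged sketch, assuming your build exposes it:

import cupy as cp

# Route all CuPy allocations through cudaMallocManaged, so arrays live
# in memory addressable by both CPU and GPU on Jetson.
cp.cuda.set_allocator(cp.cuda.MemoryPool(cp.cuda.malloc_managed).malloc)

a = cp.zeros((4, 4), dtype=cp.float32)  # now backed by managed memory

Whether this actually avoids the copies in practice on Jetson would need measuring.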
Could this be an illustration of what you are talking about, but this time using CuPy?
# Copyright 2008-2021 Andreas Kloeckner
# Copyright 2021 NVIDIA Corporation

import pycuda.autoinit  # noqa
from pycuda.compiler import SourceModule
import cupy as cp

# Create a CuPy array (and a copy for comparison later)
cupy_a = cp.random.randn(4, 4).astype(cp.float32)
original = cupy_a.copy()

# Create a kernel
mod = SourceModule("""
__global__ void doublify(float *a)
{
    int idx = threadIdx.x + threadIdx.y*4;
    a[idx] *= 2;
}
""")

func = mod.get_function("doublify")

# Invoke PyCUDA kernel on a CuPy array
func(cupy_a, block=(4, 4, 1), grid=(1, 1), shared=0)

# Demonstrate that our CuPy array was modified in place by the PyCUDA kernel
print("original array:")
print(original)
print("doubled with kernel:")
print(cupy_a)
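As I understand it, this works because CuPy arrays expose the CUDA Array Interface, so PyCUDA can launch the kernel directly on the existing device buffer without any CPU/GPU copy. Note, though, that this only avoids copies between the two libraries; it doesn't change how the array's memory was allocated in the first place.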