PyCUDA Error on Jetson Nano

Hi everyone, I need your help. I’m new to the Jetson Nano and I’m trying to use PyCUDA on it, but I get the following error:

[b]from pycuda.compiler import SourceModule

ImportError: No module named compiler[/b]

Hi,

Are you using python3?

Here are the instructions to install PyCUDA with pip3:

$ sudo apt-get install python3-pip
$ pip3 install numpy pycuda --user

We can import pycuda.compiler without issue with the above installation.
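
You can verify it with a quick check, for example:

import pycuda.compiler
print(pycuda.compiler.__file__)   # should print the path of the installed module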

Thanks.

Hi, AastaLLL

I’m using Python 3, and the installation finishes correctly. Importing the main pycuda package works fine, but importing pycuda.compiler or pycuda.driver raises the error.

The code below works (its output follows it), but when I write new code that uses pycuda.compiler, I get the error shown in the traceback after the output; a minimal sketch of what I’m trying is included after that traceback.

import pycuda
import pycuda.driver as drv
drv.init()
print('CUDA device query (PyCUDA version) \n')
print('Detected {} CUDA Capable device(s) \n'.format(drv.Device.count()))
for i in range(drv.Device.count()):
    
    gpu_device = drv.Device(i)
    print('Device {}: {}'.format( i, gpu_device.name() ) )
    compute_capability = float( '%d.%d' % gpu_device.compute_capability() )
    print('\t Compute Capability: {}'.format(compute_capability))
    print('\t Total Memory: {} megabytes'.format(gpu_device.total_memory()//(1024**2)))
    
    # The following will give us all remaining device attributes as seen 
    # in the original deviceQuery.
    # We set up a dictionary as such so that we can easily index
    # the values using a string descriptor.
    
    device_attributes_tuples = gpu_device.get_attributes().items() 
    device_attributes = {}
    
    for k, v in device_attributes_tuples:
        device_attributes[str(k)] = v
    
    num_mp = device_attributes['MULTIPROCESSOR_COUNT']
    
    # Cores per multiprocessor is not reported by the GPU!  
    # We must use a lookup table based on compute capability.
    # See the following:
    # http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
    
    cuda_cores_per_mp = { 5.0 : 128, 5.1 : 128, 5.2 : 128, 5.3 : 128, 6.0 : 64, 6.1 : 128, 6.2 : 128}[compute_capability]
    
    print('\t ({}) Multiprocessors, ({}) CUDA Cores / Multiprocessor: {} CUDA Cores'.format(num_mp, cuda_cores_per_mp, num_mp*cuda_cores_per_mp))
    
    device_attributes.pop('MULTIPROCESSOR_COUNT')
    
    for k in device_attributes.keys():
        print('\t {}: {}'.format(k, device_attributes[k]))
CUDA device query (PyCUDA version) 

Detected 1 CUDA Capable device(s) 

Device 0: NVIDIA Tegra X1
	 Compute Capability: 5.3
	 Total Memory: 3956 megabytes
	 (1) Multiprocessors, (128) CUDA Cores / Multiprocessor: 128 CUDA Cores
	 ASYNC_ENGINE_COUNT: 1
	 CAN_MAP_HOST_MEMORY: 1
	 CLOCK_RATE: 921600
	 COMPUTE_CAPABILITY_MAJOR: 5
	 COMPUTE_CAPABILITY_MINOR: 3
	 COMPUTE_MODE: DEFAULT
	 CONCURRENT_KERNELS: 1
	 ECC_ENABLED: 0
	 GLOBAL_L1_CACHE_SUPPORTED: 1
	 GLOBAL_MEMORY_BUS_WIDTH: 64
	 GPU_OVERLAP: 1
	 INTEGRATED: 1
	 KERNEL_EXEC_TIMEOUT: 1
	 L2_CACHE_SIZE: 262144
	 LOCAL_L1_CACHE_SUPPORTED: 1
	 MANAGED_MEMORY: 1
	 MAXIMUM_SURFACE1D_LAYERED_LAYERS: 2048
	 MAXIMUM_SURFACE1D_LAYERED_WIDTH: 16384
	 MAXIMUM_SURFACE1D_WIDTH: 16384
	 MAXIMUM_SURFACE2D_HEIGHT: 65536
	 MAXIMUM_SURFACE2D_LAYERED_HEIGHT: 16384
	 MAXIMUM_SURFACE2D_LAYERED_LAYERS: 2048
	 MAXIMUM_SURFACE2D_LAYERED_WIDTH: 16384
	 MAXIMUM_SURFACE2D_WIDTH: 65536
	 MAXIMUM_SURFACE3D_DEPTH: 4096
	 MAXIMUM_SURFACE3D_HEIGHT: 4096
	 MAXIMUM_SURFACE3D_WIDTH: 4096
	 MAXIMUM_SURFACECUBEMAP_LAYERED_LAYERS: 2046
	 MAXIMUM_SURFACECUBEMAP_LAYERED_WIDTH: 16384
	 MAXIMUM_SURFACECUBEMAP_WIDTH: 16384
	 MAXIMUM_TEXTURE1D_LAYERED_LAYERS: 2048
	 MAXIMUM_TEXTURE1D_LAYERED_WIDTH: 16384
	 MAXIMUM_TEXTURE1D_LINEAR_WIDTH: 134217728
	 MAXIMUM_TEXTURE1D_MIPMAPPED_WIDTH: 16384
	 MAXIMUM_TEXTURE1D_WIDTH: 65536
	 MAXIMUM_TEXTURE2D_ARRAY_HEIGHT: 16384
	 MAXIMUM_TEXTURE2D_ARRAY_NUMSLICES: 2048
	 MAXIMUM_TEXTURE2D_ARRAY_WIDTH: 16384
	 MAXIMUM_TEXTURE2D_GATHER_HEIGHT: 16384
	 MAXIMUM_TEXTURE2D_GATHER_WIDTH: 16384
	 MAXIMUM_TEXTURE2D_HEIGHT: 65536
	 MAXIMUM_TEXTURE2D_LINEAR_HEIGHT: 65536
	 MAXIMUM_TEXTURE2D_LINEAR_PITCH: 1048544
	 MAXIMUM_TEXTURE2D_LINEAR_WIDTH: 65536
	 MAXIMUM_TEXTURE2D_MIPMAPPED_HEIGHT: 16384
	 MAXIMUM_TEXTURE2D_MIPMAPPED_WIDTH: 16384
	 MAXIMUM_TEXTURE2D_WIDTH: 65536
	 MAXIMUM_TEXTURE3D_DEPTH: 4096
	 MAXIMUM_TEXTURE3D_DEPTH_ALTERNATE: 16384
	 MAXIMUM_TEXTURE3D_HEIGHT: 4096
	 MAXIMUM_TEXTURE3D_HEIGHT_ALTERNATE: 2048
	 MAXIMUM_TEXTURE3D_WIDTH: 4096
	 MAXIMUM_TEXTURE3D_WIDTH_ALTERNATE: 2048
	 MAXIMUM_TEXTURECUBEMAP_LAYERED_LAYERS: 2046
	 MAXIMUM_TEXTURECUBEMAP_LAYERED_WIDTH: 16384
	 MAXIMUM_TEXTURECUBEMAP_WIDTH: 16384
	 MAX_BLOCK_DIM_X: 1024
	 MAX_BLOCK_DIM_Y: 1024
	 MAX_BLOCK_DIM_Z: 64
	 MAX_GRID_DIM_X: 2147483647
	 MAX_GRID_DIM_Y: 65535
	 MAX_GRID_DIM_Z: 65535
	 MAX_PITCH: 2147483647
	 MAX_REGISTERS_PER_BLOCK: 32768
	 MAX_REGISTERS_PER_MULTIPROCESSOR: 65536
	 MAX_SHARED_MEMORY_PER_BLOCK: 49152
	 MAX_SHARED_MEMORY_PER_MULTIPROCESSOR: 65536
	 MAX_THREADS_PER_BLOCK: 1024
	 MAX_THREADS_PER_MULTIPROCESSOR: 2048
	 MEMORY_CLOCK_RATE: 12750
	 MULTI_GPU_BOARD: 0
	 MULTI_GPU_BOARD_GROUP_ID: 0
	 PCI_BUS_ID: 0
	 PCI_DEVICE_ID: 0
	 PCI_DOMAIN_ID: 0
	 STREAM_PRIORITIES_SUPPORTED: 1
	 SURFACE_ALIGNMENT: 512
	 TCC_DRIVER: 0
	 TEXTURE_ALIGNMENT: 512
	 TEXTURE_PITCH_ALIGNMENT: 32
	 TOTAL_CONSTANT_MEMORY: 65536
	 UNIFIED_ADDRESSING: 1
	 WARP_SIZE: 32
Traceback (most recent call last):
  File "/home/ygb/Desktop/Pycuda-2.py", line 3, in <module>
    import pycuda.driver as drv
  File "/home/ygb/Desktop/pycuda.py", line 6, in <module>
    from pycuda.compiler import SourceModule
ModuleNotFoundError: No module named 'pycuda.compiler'; 'pycuda' is not a package
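
For reference, this is a minimal sketch of the kind of pycuda.compiler code I am trying to run (the kernel and variable names are only illustrative):

import numpy as np
import pycuda.autoinit                      # creates a CUDA context
import pycuda.driver as drv
from pycuda.compiler import SourceModule    # the import that fails

# Trivial element-wise add kernel, compiled at runtime by nvcc
mod = SourceModule("""
__global__ void add_them(float *dest, float *a, float *b)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    dest[i] = a[i] + b[i];
}
""")

add_them = mod.get_function("add_them")

a = np.random.randn(400).astype(np.float32)
b = np.random.randn(400).astype(np.float32)
dest = np.zeros_like(a)

add_them(drv.Out(dest), drv.In(a), drv.In(b), block=(400, 1, 1), grid=(1, 1))
print(np.allclose(dest, a + b))   # should print True once the import works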

I solved my problem by adding the following lines to my ~/.bashrc:

gedit ~/.bashrc
# Add the following at the end of the file:
export PATH="/usr/local/cuda/bin:${PATH}"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH}"
export CPATH=$CPATH:/usr/local/cuda-10.0/targets/aarch64-linux/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-10.0/targets/aarch64-linux/lib
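
After sourcing ~/.bashrc (or opening a new terminal), a quick check that nvcc is visible and that pycuda.compiler can actually build a kernel could look like this (just a sanity check; the no-op kernel is only for illustration):

import shutil
import pycuda.autoinit   # creates a context so the compiled module can be loaded
from pycuda.compiler import SourceModule

print('nvcc found at:', shutil.which('nvcc'))   # expect /usr/local/cuda/bin/nvcc

# Compiling an empty kernel exercises nvcc through pycuda.compiler
mod = SourceModule('__global__ void noop() { }')
print('kernel compiled and loaded:', mod.get_function('noop') is not None)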