I tried the example code in https://www.pgroup.com/lit/articles/insider/v6n1a2.htm but got an error: Bus error (core dumped).
The code is here:
integer :: num_cm
integer, allocatable, device :: fine(:)
real, allocatable, device :: mat_matrix(:,:,:)
real, allocatable, device :: src_matrix(:,:,:)
type(tCM), allocatable, managed :: cm_list(:)
integer(kind=cuda_stream_kind) :: str_mng, str_dev
istat = cudaStreamCreate(str_mng)
istat = cudaStreamCreate(str_dev)
istat = cudaforSetDefaultStream(cm_list, str_mng)
do i=1, num_cm
istat = cudaforSetDefaultStream(cm_list(i)%fine, str_dev)
Thanks for your attention!
What type of system are you running on? Usually when you use managed memory and get a bus error, it means the host and device are accessing managed memory at the same time.
The hardware is a Tesla K80, and the compiler environment is PrgEnv-pgi/16.1. Thanks!
Brent’s intuition is correct. On older cards such as a K80, if managed memory is accessed from both the host and the device at the same time, you’ll get a bus error. The same issue does not occur on newer P100 or V100 devices.
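If you're not sure which group a card falls into, a quick check along these lines may help. This is only a sketch: it assumes device 0, and it uses compute capability 6.0 or higher (Pascal and later) as a rough proxy for concurrent host/device access to managed memory, which in practice also depends on the OS and driver.

```fortran
program check_managed_access
  use cudafor
  implicit none
  type(cudaDeviceProp) :: prop
  integer :: istat

  ! Query device 0 (assumption: the GPU of interest is device 0).
  istat = cudaGetDeviceProperties(prop, 0)
  if (istat /= cudaSuccess) then
    print *, 'cudaGetDeviceProperties failed: ', trim(cudaGetErrorString(istat))
    stop 1
  end if

  print *, 'Device: ', trim(prop%name)
  print *, 'Compute capability: ', prop%major, prop%minor

  ! Assumption: compute capability >= 6.0 is used here as a proxy for
  ! hardware that tolerates simultaneous host and device access.
  if (prop%major >= 6) then
    print *, 'Pascal or newer: concurrent managed access is likely supported.'
  else
    print *, 'Pre-Pascal: synchronize before touching managed data on the host.'
  end if
end program check_managed_access
```

On a K80 (compute capability 3.7) this should take the second branch.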
In your code, the problem line is “cm_list(i)%fine=10”, since you’re accessing managed memory on the host via “cm_list(i)”, but also accessing it on the device when assigning to the device array “fine”.
The simple solution is to make the “device” arrays “managed”. The assignment then updates the arrays on the host, and they remain accessible on the device.
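A minimal sketch of that change is below. The layout of tCM, the array size, the add_one kernel and the launch configuration are assumptions made for illustration, not taken from the original code, and the streams are left out to keep it short. Host writes are kept in a separate loop from the kernel launches, and the device is synchronized before the host reads the data again, so the host never touches managed memory while a kernel is running.

```fortran
module cm_mod
  use cudafor
  implicit none

  ! Hypothetical layout: "fine" is declared managed instead of device.
  type tCM
    integer, allocatable, managed :: fine(:)
  end type tCM

contains

  ! Hypothetical kernel: adds 1 to each element on the device.
  attributes(global) subroutine add_one(a, n)
    integer, value :: n
    integer :: a(n)
    integer :: j
    j = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (j <= n) a(j) = a(j) + 1
  end subroutine add_one

end module cm_mod

program managed_members
  use cudafor
  use cm_mod
  implicit none
  type(tCM), allocatable, managed :: cm_list(:)
  integer, parameter :: n = 8      ! assumed array size
  integer :: i, istat, num_cm

  num_cm = 4                       ! assumed list length
  allocate(cm_list(num_cm))

  ! Host-side setup: no kernel is running yet, so host access is safe.
  do i = 1, num_cm
    allocate(cm_list(i)%fine(n))
    cm_list(i)%fine = 10
  end do

  ! Device-side use of the same arrays.
  do i = 1, num_cm
    call add_one<<<1, 32>>>(cm_list(i)%fine, n)
  end do

  ! Wait for the device before the host reads managed data again.
  istat = cudaDeviceSynchronize()

  do i = 1, num_cm
    print *, i, maxval(cm_list(i)%fine)
  end do

  do i = 1, num_cm
    deallocate(cm_list(i)%fine)
  end do
  deallocate(cm_list)
end program managed_members
```

If you keep the per-array streams from your original code, the same rule applies on a K80: synchronize the relevant stream (or the whole device) before the host reads or writes the arrays.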
Thanks guys for the clarification!