copyin Memcpy

Is there a way to find out the memory usage of a program? I mean local and global memory usage on the GPU. I finally compiled the accelerated code, but I get the following error:

0: copyin Memcpy (dev=0x3aa86214e0, host=0x2e7a740, size=800

The GPU module can be found in the following link.


I found that the memory usage of the GPU can be seen using the command


I can see that I’m using all the GPU memory, so new runs of the program don’t have enough space. The problem is that I cannot kill the old processes.

This doesn’t necessarily mean that you ran out of memory. Typically, you’d get an “out-of-memory” error for that. This could be any memory error (such as writing out of bounds of an array) or even a problem with a kernel that crashed before the copy.
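To rule out exhausted memory, the free and total device memory can also be queried from inside the program. The following is a minimal sketch, assuming cudaMemGetInfo from the cudafor module is available in your compiler version; the program and variable names are placeholders. It reports global memory for the whole device (including memory held by other processes), not per-kernel local memory.

```fortran
      program check_gpu_mem
        use cudafor
        implicit none
!     Byte counts need the wide integer kind defined by cudafor
        integer(kind=cuda_count_kind) :: free_b, total_b
        integer :: istat

!     Ask the runtime how much global memory is free on this device
        istat = cudaMemGetInfo(free_b, total_b)
        if (istat /= cudaSuccess) then
          print *, 'cudaMemGetInfo: ', cudaGetErrorString(istat)
        else
          print *, 'free bytes:  ', free_b
          print *, 'total bytes: ', total_b
        end if
      end program check_gpu_mem
```

Calling the same query just before the failing copy would show whether the 800-byte transfer could plausibly fail for lack of space.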

I’d add error checking after your kernel launch to see if it crashed.

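         call chemical_reaction<<<1,100>>>(y_d,enth_d,temp_d,rho_d,z_d
     &                          ,method,dt,imech, isolver,GasCon,Press
     &                                    ,Coef_d,mol_w_d)

!     Launch errors show up right away
          ir = cudaGetLastError()
          if( ir /= cudaSuccess ) print *, cudaGetErrorString( ir )
!     The launch is asynchronous: wait so a kernel crash is reported
          ir = cudaDeviceSynchronize()
          if( ir /= cudaSuccess ) print *, cudaGetErrorString( ir )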

If you can’t determine the cause, please post the dependent module files and I’ll try to recreate the error.

  • Mat

My original program contains three data modules that are used throughout the program. To accelerate the code, I have written some of the functions with the device attribute. To give this device code access to the data, I wrote equivalent modules with device data and a function to copy from host to device. But I get the following error when I execute the program:

0: copyin Memcpy (dev=0x3aa86214e0, host=0x1343600, size=800) FAILED: 30(unknown error)

What I did was comment out all the functions and deal only with the data transfers. The error appears when I uncomment the following module.

        module rcce_mod_d
        use cudafor
        implicit none

!     Number of Constraints
        integer, device :: N_c_d
!     Number of Kinetic Constraints
        integer, device :: N_kc_d
        integer, device :: N_kc_global_d

!     Constraints matrix
        double precision, device, allocatable :: Cn_d(:,:)
        integer, device, allocatable :: iconstraint_d(:)

        end module rcce_mod_d

and I want to use it like this:

       module rcce_mod
!     *****************************************************************                 

      implicit none


!     Number of Constraints
      integer :: N_c                                                                    
!     Number of Kinetic Constraints
      integer :: N_kc
      integer :: N_kc_global
!     Constraints matrix                                                                
      double precision, allocatable :: Cn(:,:)

      integer, allocatable :: iconstraint(:)

      end module rcce_mod
       subroutine init_rcce_device(N_c,N_sp)
                use rcce_mod
                use rcce_mod_d
        end subroutine init_rcce_device

I’m compiling with -Mcuda=rdc
Many thanks

Hi ElMaskina,

Can you post or send a small reproducing example to PGI Customer Service? Since it’s a run-time error, it’s hard to tell what the issue is without recreating it here.

Note that “rdc” is very new and we’ve found a few bugs that we’re in the process of addressing. It’s possible that you’re encountering one of these issues. We’ll know more once we can reproduce the problem.
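For reference while putting together the reproducer, here is a rough sketch of what a host-to-device copy routine for the rcce_mod and rcce_mod_d modules posted above might look like. This is an assumption, not the poster's actual code: the body of init_rcce_device was not shown, so the routine name init_rcce_device_sketch and everything inside it are hypothetical. It relies only on the module variables from the post and on CUDA Fortran's assignment-based transfers.

```fortran
      subroutine init_rcce_device_sketch()
!     Hypothetical body: copy rcce_mod (host) into rcce_mod_d (device)
        use cudafor
        use rcce_mod
        use rcce_mod_d
        implicit none
        integer :: istat

!     Scalars: assigning to a device variable copies host to device
        N_c_d         = N_c
        N_kc_d        = N_kc
        N_kc_global_d = N_kc_global

!     Arrays: allocate on the device with the host shape, then copy
        if (allocated(Cn)) then
          if (allocated(Cn_d)) deallocate(Cn_d)
          allocate(Cn_d(size(Cn,1), size(Cn,2)), stat=istat)
          if (istat == 0) then
            Cn_d = Cn
          else
            print *, 'allocate Cn_d failed, stat = ', istat
          end if
        end if

        if (allocated(iconstraint)) then
          if (allocated(iconstraint_d)) deallocate(iconstraint_d)
          allocate(iconstraint_d(size(iconstraint)), stat=istat)
          if (istat == 0) then
            iconstraint_d = iconstraint
          else
            print *, 'allocate iconstraint_d failed, stat = ', istat
          end if
        end if

!     Report any CUDA error left pending by the transfers above
        istat = cudaGetLastError()
        if (istat /= cudaSuccess) print *, cudaGetErrorString(istat)
      end subroutine init_rcce_device_sketch
```

Checking the allocation status before each copy helps separate a failed device allocation from a failed transfer, which the single "copyin Memcpy" message does not distinguish.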