Simple Code Crashing with 18.4 (was working with 17.10)

Hi,

I have the simple code:

      call initialize_flow
!$acc enter data copyin(v_flow_r,v_flow_t,v_flow_p)

where the three arrays are contained in an included module, and the called routine allocates them and initializes their values on the CPU. These arrays are not created/copied anywhere else on the GPU.

The code compiles fine, but when I run it I get:

FATAL ERROR: variable in data clause is partially present on the device: name=v_flow_t

If I run the code using PGI 17.10 (and, I think, 18.1 and 18.3), it works fine.

I also tried creating the arrays in the routine itself but it still fails.

I put the debug flags on and got:

pgi_uacc_dataenterstart( file=/home/sumseq/Dropbox/PSI/MAS/MAS_SVN/branches/mas_openacc/mas_sed_expmac.f, function=start, line=4771:4771, line=5255, devid=0 )
pgi_uacc_dataon(hostptr=0x4fcb190,stride=1,267,size=267x246,eltsize=8,lineno=5255,name=v_flow_r,flags=0x20700=present+create+copyin+dynamic,async=-1,threadid=1)
pgi_uacc_alloc(size=525456,devid=1,threadid=1)
allocate device memory 0x7f1c1af3da00(525824B)
pgi_uacc_alloc(size=525456,devid=1,threadid=1) returns 0x7f1c1af3da00
map    dev:0x7f1c1af3da00 host:0x4fcb190 dindex:1 size:525456 offset:0  (line:5255 name:v_flow_r) thread:1
alloc done with devptr at 0x7f1c1af3da00
pgi_uacc_alloc(size=96,devid=1,threadid=1)
recycle device memory 0x7f1ea5c07200(512B)
pgi_uacc_alloc(size=96,devid=1,threadid=1) returns 0x7f1ea5c07200
map    dev:0x7f1ea5c07200 host:0x10f67b0 dindex:1 size:96 offset:0  (line:5255 name:descriptor) thread:1
alloc done with devptr at 0x7f1ea5c07200
pgi_uacc_dataupx(devptr=0x7f1ea5c07200,hostptr=0x10f67b0,eltsize=96,lineno=5255,name=descriptor,async=-1,threadid=1)
pgi_uacc_cuda_dataup1(devdst=0x7f1ea5c07200,hostsrc=0x10f67b0,offset=0,stride=1,size=1,eltsize=96,lineno=5255,name=descriptor,thread=1)
pgi_uacc_dataupx(devptr=0x7f1c1af3da00,hostptr=0x4fcb190,stride=1,size=65682,eltsize=8,lineno=5255,name=v_flow_r,async=-1,threadid=1)
pgi_uacc_cuda_dataup1(devdst=0x7f1c1af3da00,hostsrc=0x4fcb190,offset=0,stride=1,size=65682,eltsize=8,lineno=5255,name=v_flow_r,thread=1)
pgi_uacc_dataon(hostptr=0x53df480,stride=1,266,size=266x246,eltsize=8,lineno=5255,name=v_flow_t,flags=0x20700=present+create+copyin+dynamic,async=-1,threadid=1)
v_flow_t lives at 0x53df480 size 523488 partially present

It seems to have copied the v_flow_r fine but fails on the v_flow_t which is bizarre.
I tried putting these copies on separate lines and it still does the same thing.
This is a pretty big snag since 18.4 is the community edition…
Any Ideas?

- Ron

SOLVED!

It turns out that in a completely unrelated piece of code I had accidentally put an “!$acc exit data delete” AFTER a “deallocate()” for an array (not the one in the error).

I guess this messes up the memory in a weird way that exhibited itself as that error. Still strange, but FIXED!

Hi Ron,

Did you mean to say that you forgot to put in an exit data?

If so, this would make sense: if you allocated v_flow_t after this, the OS may be reusing the same host memory. Then when the runtime looks up the host address, it sees that the previous address range overlaps with the new one.


Otherwise, I’m not sure why removing an exit data directive would fix this issue.

-Mat

Hi,

I did not remove an “exit data”.

What happened was that I had a “deallocate(a)” followed by “!$acc exit data delete(a)” instead of the other way around.

I think that if someone exits an array from the device after it is deallocated on the host, the runtime should throw some kind of error.

Instead it seems to have caused some memory corruption that exhibited itself in the partially-present error on a totally unrelated array.
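
For anyone hitting the same thing, here is a minimal sketch of the ordering that avoids the problem: delete the device copy first, then deallocate on the host. The program name, the array name `a`, and the size `n` are hypothetical and not taken from the actual code; this only illustrates the order of operations, under the assumption that the array was put on the device with an unstructured "enter data".

```fortran
! Hypothetical example: names and sizes are illustrative only.
      program exit_before_dealloc
      implicit none
      integer, parameter :: n = 1000
      double precision, allocatable :: a(:)
      integer :: i

      allocate(a(n))
      a = 1.0d0

! Create the device copy while the host array is allocated.
!$acc enter data copyin(a)

!$acc parallel loop present(a)
      do i = 1, n
        a(i) = 2.0d0*a(i)
      end do

! Bring the results back to the host before the device copy goes away.
!$acc update self(a)
      print *, a(1)

! Remove the device copy FIRST, while the host address is still valid,
!$acc exit data delete(a)
! and only then free the host memory.
      deallocate(a)
      end program exit_before_dealloc
```

With the two final statements swapped, the exit data would be issued for an array that is no longer allocated on the host, which is the mistake described above.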

- Ron