OpenACC example: runs with 18.10, but fails at run time with 19.4

Platform: Windows 10 Home
Device: GTX 1060 (in an Alienware laptop)
Compiler: PGI Community Edition 18.10 and 19.4
Situation: The code compiles without errors with both 18.10 and 19.4. The executable built with 18.10 runs correctly, but the executable built with 19.4 fails at run time.
Code:

#include <stdlib.h>
#include <stdio.h>

int main()
{
    int i, j, M;
    float **a;

    M = 4;

    a = (float**)malloc(sizeof(float*) * M);
    #pragma acc enter data create(a[0:M][0:1])
    for (i = 0; i < M; i++)
    {
        if (i > 0)
        {
            a[i] = (float*)malloc(i * sizeof(float));
            #pragma acc enter data create(a[i:1][0:i])
        }
    }

    // Initial value
    for (i = 0; i < M; i++)
    {
        if (i > 0)
        {
            for (j = 0; j < i; j++)
            {
                a[i][j] = 1000.0f;
            }
            #pragma acc update device(a[i:1][0:i])
        }
    }

    #pragma acc parallel loop present(a)
    for (i = 0; i < M; i++)
    {
        if (i > 0)
        {
            for (j = 0; j < i; j++)
            {
                a[i][j] += i * 10 + j;
            }
        }
    }

    for (i = 0; i < M; i++)
    {
        if (i > 0)
        {
            #pragma acc update host(a[i:1][0:i])
            for (j = 0; j < i; j++)
            {
                printf("%f   ", a[i][j]);
            }
            printf("\n");
        }
        else
        {
            printf("NULL\n");
        }
    }

    for (i = 0; i < M; i++)
    {
        #pragma acc exit data delete(a[i:1])
        free(a[i]);
    }
    #pragma acc exit data delete(a)
    free(a);

    return 0;
}

Compile command:

pgcc -acc -Minfo a.c -o a.exe

Compile response:

main:
     12, Generating enter data create(a[:M][:1])
     18, Generating enter data create(a[i][:i])
     27, Memory set idiom, loop replaced by call to __c_mset4
     33, Generating update device(a[i][:i])
     35, Generating present(a[:][:])
         Generating Tesla code
         36, #pragma acc loop gang /* blockIdx.x */
         40, #pragma acc loop vector(128) /* threadIdx.x */
     40, Loop is parallelizable
     52, Generating update self(a[i][:i])
     66, Generating exit data delete(a[:1][i])
     69, Generating exit data delete(a[:1][:1])

Execution error message:

hostptr=00000000005EDEE0,stride=1,-1,size=1x1,extent=-1x-1,eltsize=4,name=a,flags=0x200=present,async=-1,threadid=1
Present table dump for device[1]: NVIDIA Tesla GPU 0, compute capability 6.1, threadid=1
host:0000000000000000 device:0000000000000000 size:0 presentcount:0+1 line:18 name:a
host:0000000000000000 device:0000000000000000 size:0 presentcount:0+1 line:18 name:a
host:0000000000000000 device:0000000000000000 size:0 presentcount:0+1 line:18 name:a
host:00000000005EDEE0 device:0000000B03C00000 size:32 presentcount:2+1 line:12 name:a
host:000000000D118CB0 device:0000000B03C00400 size:8 presentcount:0+1 line:18 name:a
host:000000000D119070 device:0000000B03C00600 size:12 presentcount:0+1 line:18 name:a
host:000000000D1190C0 device:0000000B03C00200 size:4 presentcount:0+1 line:18 name:a
allocated block device:0000000B03C00000 size:512 thread:1
allocated block device:0000000B03C00200 size:512 thread:1
allocated block device:0000000B03C00400 size:512 thread:1
allocated block device:0000000B03C00600 size:512 thread:1
FATAL ERROR: data in PRESENT clause was not found on device 1: name=a host:00000000005EDEE0
 file:E:\SourceCode\OpenaccCExamples\acc_2darray_jagged\acc_2darray_jagged.c main line:35

Please help me. Thank you very much!

Hi Guangyuan Kan,

I’m not sure why it works for you with 18.10, since I see the same behavior with 18.4, 18.10, and 19.4.

The present check on “a” tests whether the first element of “a” is present on the device. However, since you skip the creation of the first element’s array (i = 0), it’s not actually present, hence the error.

To fix this, comment out the first “if (i > 0)” so that the first element’s array is also created.
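
A minimal sketch of one way to apply that change, written as three hypothetical helpers (jagged_create, jagged_compute, jagged_destroy) rather than the original main(). Giving row 0 a one-element placeholder through row_len() is an assumption added here so that a[0] points at a real, non-empty allocation; it is not part of the original code, and the sketch has not been verified against PGI 19.4. Built without -acc, the pragmas are ignored and the code runs on the host.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: row i holds i elements, except row 0, which gets
   one placeholder element (an assumption) so that a[0] is never empty. */
static int row_len(int i) { return i > 0 ? i : 1; }

/* Allocate the jagged array and create EVERY row on the device,
   including row 0, so that present(a) can find the first element. */
float **jagged_create(int M)
{
    float **a = (float **)malloc(sizeof(float *) * M);
    #pragma acc enter data create(a[0:M][0:1])
    for (int i = 0; i < M; i++) {
        int n = row_len(i);
        a[i] = (float *)malloc(n * sizeof(float));
        #pragma acc enter data create(a[i:1][0:n])
    }
    return a;
}

/* Initialize on the host, push to the device, update there, pull back. */
void jagged_compute(float **a, int M)
{
    for (int i = 0; i < M; i++) {
        int n = row_len(i);
        for (int j = 0; j < n; j++)
            a[i][j] = 1000.0f;
        #pragma acc update device(a[i:1][0:n])
    }

    #pragma acc parallel loop present(a)
    for (int i = 0; i < M; i++) {
        /* Row length is computed inline so that no host function
           is called from device code. */
        int n = i > 0 ? i : 1;
        for (int j = 0; j < n; j++)
            a[i][j] += i * 10 + j;
    }

    for (int i = 0; i < M; i++) {
        int n = row_len(i);
        #pragma acc update host(a[i:1][0:n])
    }
}

/* Release device and host memory, using the same clauses as the post. */
void jagged_destroy(float **a, int M)
{
    for (int i = 0; i < M; i++) {
        #pragma acc exit data delete(a[i:1])
        free(a[i]);
    }
    #pragma acc exit data delete(a)
    free(a);
}
```

Calling jagged_create(4), jagged_compute(a, 4) and then jagged_destroy(a, 4) from main() reproduces the flow of the original program. As a side effect, free(a[0]) now receives a pointer returned by malloc; in the original listing a[0] is never assigned.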

Hope this helps,
Mat