Double atomicAdd not supported in 16.10?

Hello.

I recently updated my compiler from 16.5 student edition to 16.10 community edition.

But 16.10 cannot compile my code due to double precision atomicAdd.

Error messages look like this:

ptxas /tmp/pgaccJVJdPAZ24pQo.ptx, line 22460; error : Type or alignment of argument does not match formal parameter '__pgi_atomicAddd_llvm_param_1'
Error: /tmp/pgacc_V8d6SN9Ugxh.gpu (2740, 25): parse '@__pgi_atomicAddd_llvm' defined with type 'double (i8*, double)*'
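For reference, this is roughly the kind of construct involved — a minimal sketch of a double-precision atomic update inside an OpenACC region, with illustrative names (n, idx, vals, acc are not from my real code):

```fortran
! Minimal sketch: double-precision atomic add in an OpenACC loop,
! the pattern that appears to trigger the ptxas error above.
program atomic_repro
  implicit none
  integer, parameter :: n = 1000
  integer :: i, idx(n)
  double precision :: vals(n), acc(10)

  idx  = 1
  vals = 1.0d0
  acc  = 0.0d0

  !$acc parallel loop copyin(idx, vals) copy(acc)
  do i = 1, n
     !$acc atomic update
     acc(idx(i)) = acc(idx(i)) + vals(i)   ! lowered to a device atomic add
  end do

  print *, acc(1)   ! expected 1000.0 on a correct build
end program atomic_repro
```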

So I installed the 16.5 professional edition and used only the license file from the community edition, and it works fine.

My code runs fine even in 15.10 (Visual Fortran).

What happened in 16.10?

Hi CNJ,

What happened in 16.10?

I’m not sure and don’t see any problem reports regarding it. Can you please post a reproducing example so I can investigate?

  • Mat

I will try to make one.

If I change the variables to single precision, it compiles.

But one more problem occurred.

An OpenMP region that yielded correct results in 16.5 no longer does in 16.10.

Please tell me if there’s any change in OpenMP behavior between 16.5 and 16.10, especially when a subroutine or function is called within a parallel region.

Some variables seem to behave as SAVE variables.

Please tell me if there’s any change in OpenMP behavior between 16.5 and 16.10, especially when a subroutine or function is called within a parallel region.

Not that I’m aware of.

Some variables seem to behave as SAVE variables.

Are you data-initializing them? If so, data-initialized variables implicitly get the SAVE attribute. This behavior is part of the Fortran language itself and did not change between 16.5 and 16.10.
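The language rule in question can be seen in a small example (illustrative names):

```fortran
! Illustrates the implicit SAVE attribute: a local variable that is
! data-initialized in its declaration keeps its value across calls.
program save_demo
  implicit none
  call counter()   ! prints 1
  call counter()   ! prints 2 -- 'count' was implicitly SAVEd
contains
  subroutine counter()
    integer :: count = 0   ! initialization => implicit SAVE
    count = count + 1
    print *, count
  end subroutine counter
end program save_demo
```

Note that inside an OpenMP parallel region, such an implicitly SAVEd local is shared among threads unless it is declared THREADPRIVATE, which can easily lead to race conditions.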

-Mat

There are some, but they work like parameters (read-only).

I have some local automatic arrays whose bounds are determined at run time.

I don’t think this will matter.

Can you send a reproducer to PGI Customer Service (trs@pgroup.com)?

Ask them to forward it on to me and I’ll see if I can determine the problem.

-Mat

Commenting out

CALL OMP_SET_MAX_ACTIVE_LEVELS(2)

solves the problem.

With that line, even though I don’t use nested parallelism, the results become nondeterministic.

But in theory that line should not make any difference if nested parallelism is not used.
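A minimal sketch of the situation (hypothetical names): the call in question followed by a single-level parallel region, where it should have no observable effect.

```fortran
! Sketch: OMP_SET_MAX_ACTIVE_LEVELS is called, but only one level of
! parallelism is ever used, so the call should be a no-op here.
program max_levels_demo
  use omp_lib
  implicit none
  integer :: i
  double precision :: s
  s = 0.0d0

  call omp_set_max_active_levels(2)   ! the line that changes behavior

  !$omp parallel do reduction(+:s)    ! single level -- no nesting
  do i = 1, 1000
     s = s + 1.0d0
  end do
  !$omp end parallel do

  print *, s   ! should always print 1000.0
end program max_levels_demo
```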

I will see if I can make any reproducing examples.

So, my question was misleading.

It is not the problem of compiler version.

It is the problem of nested parallelism itself.

What does OMP_SET_MAX_ACTIVE_LEVELS do to the parallel region which does not employ nested parallelism?

Hi CNJ,

This doesn’t make sense. There must be a logic error in the code or possibly a compiler issue.

We’ll need a reproducing example so we can investigate. Can you send one to PGI Customer Service (trs@pgroup.com) and ask them to forward it to me?

Thanks,
Mat