Thank you very much for answering yesterday's question about the error "Type of argument does not match formal parameter '__pgi_fixuk_param_0'". Indeed, if I drop the -Mnollvm compiler option, the code at GitHub - Agar92/QUESTION compiles without errors.
I have a small follow-up question. In my code I use the CUDA function cudaMemcpy(). I thought that to use cudaMemcpy() I had to pass the “-Mnollvm” PGI compiler flag. But, as I understand now, I can use cudaMemcpy() without -Mnollvm, i.e. with the LLVM back-end. Is that correct?
The code at GitHub - Agar92/QUESTION works properly on CPUs (Intel Core i7 and a 64-core KNL) with the PGI 19.4 compiler. But when I build it for the Titan V GPU (-ta=tesla:cc70) installed in the KNL machine (-tp=haswell) using PGI 19.4 + OpenACC (OS: Ubuntu Stable, gcc 7.4.0), it compiles but fails at runtime with this error:
size of DataHolder is ~0.0019564GB, size of d is ~0.0019564GB
##################################################
push=500000 INJ=500000 N=99999999
##################################################
Failing in Thread:1
call to cuStreamSynchronize returned error 719: Launch failed (often invalid pointer dereference)
Failing in Thread:1
call to cuMemFreeHost returned error 719: Launch failed (often invalid pointer dereference)
Using printf() on the GPU, I found that the error occurs in the T3DataHolder.h file, somewhere in the Propagate() function.
I have lost hope of finding the error myself and do not know how to debug it on the GPU.
Could you be so kind as to help me fix this error and get the code to work?
Andrey
I thought that to use cudaMemcpy() I had to pass the “-Mnollvm” PGI compiler flag. But, as I understand now, I can use cudaMemcpy() without -Mnollvm, i.e. with the LLVM back-end. Is that correct?
CUDA API calls are separate and distinct from which compiler code generator back-end is used. So cudaMemcpy can be used with either the LLVM or older non-LLVM back-end.
Could you be so kind as to help me fix this error and get the code to work?
I can try, but the code errors since it can’t find the data directory. Can you please provide a data set so I can run the code?
% ./Test
-Warning-T3NSGangular_RW::default_file:Cant't get $T3_DATA. Use ./data
pdg=2112
secondaryZA=1 Z=0 secondaryA=1
Check: ./data/angular/N/incD/50/T3DSGangular_100010020.bin
RRR: Check in T3NSGangular_RW.cc:
RRR: fname=./data/angular/N/incD/50/T3DSGangular_100010020.bin
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
Abort
Though my guess is that you have a bad address in there someplace. You might try running with CUDA Unified Memory enabled (i.e. -ta=tesla:managed) to see if that fixes the issue.
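For readers wondering what a "bad address" typically looks like in OpenACC C++ code: a common case is a class that owns a heap buffer, where only the object (and therefore only the host pointer value) reaches the device. The sketch below is a hypothetical illustration, not code from the QUESTION repo; the `Particles` type and its members are invented for the example, and nothing here is confirmed to be the cause of the error 719 in this thread. It shows a manual deep copy: the object is copied first, then the buffer is created so the device pointer can be attached. A compiler without OpenACC support ignores the pragmas, so the same code runs on the host.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container, NOT taken from the QUESTION repo.
struct Particles {
    double* x;       // heap buffer; the device needs its own copy of it
    std::size_t n;   // number of elements in x

    explicit Particles(std::size_t count) : x(nullptr), n(count) {
        assert(count > 0);
        x = new double[n];
        // Copy the object itself first, then create the buffer. Because
        // the object is already present, the device copy of the pointer x
        // can be attached to the device buffer instead of keeping the
        // host address.
        #pragma acc enter data copyin(this[0:1])
        #pragma acc enter data create(x[0:n])
    }

    // Copying would duplicate the pointer and double-free the buffer.
    Particles(const Particles&) = delete;
    Particles& operator=(const Particles&) = delete;

    ~Particles() {
        // Release in reverse order: buffer first, then the object.
        #pragma acc exit data delete(x[0:n])
        #pragma acc exit data delete(this[0:1])
        delete[] x;
    }

    // Runs on the device when built with OpenACC, on the host otherwise.
    void fill(double value) {
        #pragma acc parallel loop present(x[0:n])
        for (std::size_t i = 0; i < n; ++i) {
            x[i] = value;
        }
    }

    // Bring the device data back before reading it on the host.
    void update_host() {
        #pragma acc update self(x[0:n])
    }

    // Host-side reduction over the buffer.
    double sum_host() const {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            s += x[i];
        }
        return s;
    }
};
```

With -ta=tesla:managed the allocation would be placed in unified memory and the explicit data directives would not be needed for the buffer, which is why switching that flag on is a quick way to check whether a missing or stale device copy is involved.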
I am afraid you used the version of the code that was at GitHub - Agar92/QUESTION the day before yesterday, when I asked the question about the error "Type of argument does not match formal parameter '__pgi_fixuk_param_0'". Yesterday, before posting the question here, I updated the repository. Since then the code does not depend on any external data sets, and it works properly on the CPU on my machine. I tried the managed option as -ta=tesla:cc70,managed, but the code failed to compile with this error:
PGCC/x86-64 Linux 19.4-0: compilation completed with warnings
/opt/pgi/19/u4/linux86-64-llvm/19.4/share/llvm/bin/opt: /tmp/pgc++0fMbCJ9L17dP.ll:12340:13: error: ‘@__pgi_managed_delete’ defined with type 'void (i8*)’
call void @__pgi_managed_delete (i8 %11, i64 64) nounwind, !dbg !2750
^
make[2]: *** [CMakeFiles/Test.dir/build.make:63: CMakeFiles/Test.dir/nbody.cpp.o] Error 2
make[1]: *** [CMakeFiles/Makefile2:73: CMakeFiles/Test.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
I sent the code to PGI customer service and asked them to forward it to you, so that you are sure to have the correct version.
Thank you.
Andrey.
I was using the package PGI Customer Service forwarded to me, though Alex may not have yet forwarded any additional packages you sent.
Using the updated git repo, I no longer see the data issue, but still get the throw:
% ./Test
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
Abort
Hi, Mat.
I do not know what may be wrong with the git repo, so I have just sent an email with a .zip archive of the source code to PGI customer service. I have just run this code on my laptop on Ubuntu with gcc 7.4.0 using the compile line
Although the code printed some hard-coded warning messages at the beginning, it ran properly to the end.
I have checked that all the data files are in the project directory, so this error should not occur again.
Please let me know if it works.
Hope it will work.
Andrey.
I determined my issue. Since I do multiple builds, I run your cmake command from a “build” sub-directory. Running “Test” from this directory causes the error. Running from the top level directory works.
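For anyone else who hits the same std::out_of_range when launching from a build sub-directory: the earlier log shows the program falling back to ./data when $T3_DATA is not set, so the data path depends on the current working directory. A minimal sketch of a more defensive lookup follows. The helper names `data_dir` and `open_data_file` are hypothetical and are not functions from the repo; only the $T3_DATA variable and the ./data fallback come from the log above.

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns $T3_DATA if it is set and non-empty,
// otherwise the relative fallback "./data" seen in the warning message.
inline std::string data_dir() {
    const char* env = std::getenv("T3_DATA");
    return (env != nullptr && *env != '\0') ? std::string(env)
                                            : std::string("./data");
}

// Hypothetical helper: opens a file below the data directory and fails
// immediately with the full path, instead of letting an empty table
// trigger vector::at() later on.
inline std::ifstream open_data_file(const std::string& relative) {
    assert(!relative.empty());
    const std::string path = data_dir() + "/" + relative;
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error(
            "cannot open data file: " + path +
            " (set $T3_DATA or run from the top-level directory)");
    }
    return in;
}
```

Setting T3_DATA to an absolute path before launching, or checking the stream as above, would make the executable behave the same from the top-level directory and from a build sub-directory.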
I’m able to reproduce the error. While I haven’t had time to dig into the problem, it does appear to have something to do with inlining.
As a work-around, I’m able to run successfully if I replace the “-Minline” flag with “-fast -Mnoautoinline” so no inlining is performed.
I’ll try and get back to this as soon as I can to determine the root cause. However, I’ll be attending a GPU Hackathon next week so am very time limited.
I reworked and simplified the code, so the git repo no longer contains the current version.
I have just sent an email to PGI Customer Service with the current version of the code, asking them to forward it to you. If you work on the code, please use the updated version attached to that email.
Andrey.