Arm NEON linking error when cross-compiling with nvcc


I am trying to use Arm NEON intrinsics. My first attempt was to compile the following example:
The compiler flags were set as follows:

nvcc -m64 -ccbin aarch64-linux-gnu-g++ TestNeon.cpp TestNeon.cpp -o test -Xcompiler "-O3 -ffast-math -flto -march=armv8-a+simd+crypto -mcpu=cortex-a57+simd+crypto"

And also I tried: aarch64-linux-gnu-g++ TestNeon.cpp TestNeon.cpp -o TestNeon -march=armv8-a+crypto -mcpu=cortex-a57+crypto

In both cases I got a linking error:
/tmp/cciyIoR7.o: In function `add3(__Uint8x16_t*)':
TestNeon.cpp:(.text+0x0): multiple definition of `add3(__Uint8x16_t*)'
/tmp/ccZSw9aL.o:TestNeon.cpp:(.text+0x0): first defined here
/tmp/cciyIoR7.o: In function `print_uint8(__Uint8x16_t, char*)':
TestNeon.cpp:(.text+0x110): multiple definition of `print_uint8(__Uint8x16_t, char*)'
/tmp/ccZSw9aL.o:TestNeon.cpp:(.text+0x110): first defined here
/tmp/cciyIoR7.o: In function `main':
TestNeon.cpp:(.text+0x1a4): multiple definition of `main'
/tmp/ccZSw9aL.o:TestNeon.cpp:(.text+0x1a4): first defined here
collect2: error: ld returned 1 exit status

When I build the same code with Nsight for aarch64, without any additional settings, compiling and linking succeed, and I am able to test my code on the TX2 platform.

What am I doing wrong that makes the linking fail, and what should I do to fix it?


It seems you specify the source file TestNeon.cpp twice, so it is compiled into two object files that each define the same symbols, and the linker reports multiple definitions.

You may try instead:

nvcc -m64 -ccbin aarch64-linux-gnu-g++ TestNeon.cpp -o test -Xcompiler "-O3 -ffast-math -flto -march=armv8-a+simd+crypto -mcpu=cortex-a57+simd+crypto"
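
If the command line is long, a repeated source file can be hard to spot by eye. Here is a minimal sketch of a check that can be run on the argument list before compiling. The helper name dup_sources is hypothetical (it is not part of nvcc or g++), and the list of source extensions is an assumption that may need adjusting for your project:

```shell
# Hypothetical helper: print every source file that appears more than once
# in a list of compiler arguments. Only arguments ending in a common
# C/C++/CUDA source extension are considered (an assumed list).
dup_sources() {
    printf '%s\n' "$@" | grep -E '\.(c|cc|cpp|cxx|cu)$' | sort | uniq -d
}

# The argument list from the failing command names TestNeon.cpp twice:
dup_sources TestNeon.cpp TestNeon.cpp -o test   # prints TestNeon.cpp
```

An empty output means no source file is listed twice. The same duplicate appears in the plain aarch64-linux-gnu-g++ command from the question, so removing the second TestNeon.cpp there should help as well.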