arm_neon.h error

Hello, I am wondering why NEON has to be turned off when compiling the code in order to resolve this error. Is it because ARMv8 is 64-bit and can’t run 32-bit code at the same time?

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1117): error: identifier "__builtin_aarch64_addhn2v8hi" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1123): error: identifier "__builtin_aarch64_addhn2v4si" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1129): error: identifier "__builtin_aarch64_addhn2v2di" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1135): error: identifier "__builtin_aarch64_addhn2v8hi" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1143): error: identifier "__builtin_aarch64_addhn2v4si" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1151): error: identifier "__builtin_aarch64_addhn2v2di" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1159): error: identifier "__builtin_aarch64_raddhn2v8hi" is undefined

/usr/lib/gcc/aarch64-linux-gnu/5/include/arm_neon.h(1165): error: identifier "__builtin_aarch64_raddhn2v4si" is undefined

Error limit reached.
100 errors detected in the compilation of "/tmp/tmpxft_00007538_00000000-6_elas_gpu.cpp1.ii".
Compilation terminated.
CMake Error at cuda_compile_generated_elas_gpu.cu.o.cmake:266 (message):
  Error generating file
  /home/data/libelas-gpu/build/CMakeFiles/cuda_compile.dir/GPU/./cuda_compile_generated_elas_gpu.cu.o


CMakeFiles/libelas_gpu.dir/build.make:70: recipe for target 'CMakeFiles/cuda_compile.dir/GPU/cuda_compile_generated_elas_gpu.cu.o' failed
make[2]: *** [CMakeFiles/cuda_compile.dir/GPU/cuda_compile_generated_elas_gpu.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/libelas_gpu.dir/all' failed
make[1]: *** [CMakeFiles/libelas_gpu.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

However, I can’t turn NEON off, because the project uses SSE instructions, which I converted to NEON. I wondered whether I should change NEON to AArch64 to solve the problem, but that doesn’t work.
Thank you for your help!
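For context, converting SSE to NEON usually means mapping each SSE intrinsic to a NEON counterpart behind a common name. The following is only a minimal sketch of that idea, not the actual contents of the SSE-to-NEON header: `vec4i`, `vec_load`, `vec_store`, `vec_add` and `add4_i32` are hypothetical names chosen for illustration, and the scalar branch is an assumed fallback so the file builds on any host.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__ARM_NEON)
  // AArch64 / ARM with NEON: use the NEON intrinsics directly.
  #include <arm_neon.h>
  typedef int32x4_t vec4i;
  inline vec4i vec_load(const int32_t* p)     { return vld1q_s32(p); }
  inline void  vec_store(int32_t* p, vec4i v) { vst1q_s32(p, v); }
  inline vec4i vec_add(vec4i a, vec4i b)      { return vaddq_s32(a, b); }
#elif defined(__SSE2__)
  // x86 with SSE2: the intrinsics the original code would have used.
  #include <emmintrin.h>
  typedef __m128i vec4i;
  inline vec4i vec_load(const int32_t* p) {
      return _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
  }
  inline void vec_store(int32_t* p, vec4i v) {
      _mm_storeu_si128(reinterpret_cast<__m128i*>(p), v);
  }
  inline vec4i vec_add(vec4i a, vec4i b) { return _mm_add_epi32(a, b); }
#else
  // Assumed scalar fallback for hosts with neither instruction set.
  struct vec4i { int32_t lane[4]; };
  inline vec4i vec_load(const int32_t* p) {
      vec4i v;
      for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
      return v;
  }
  inline void vec_store(int32_t* p, vec4i v) {
      for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
  }
  inline vec4i vec_add(vec4i a, vec4i b) {
      vec4i r;
      for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
      return r;
  }
#endif

// Hypothetical helper: adds four 32-bit integers lane by lane.
inline void add4_i32(const int32_t* a, const int32_t* b, int32_t* out) {
    assert(a && b && out);
    vec_store(out, vec_add(vec_load(a), vec_load(b)));
}
```

Calling `add4_i32` on two arrays of four integers writes the lane-wise sums to `out`, whichever branch was selected at compile time.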

I guess the cause is nvcc.

Something similar was reported on an older platform:

https://devtalk.nvidia.com/default/topic/833487/jetson-tk1/-jetson-tk1-can-t-compile-in-neon-instructions-using-nvcc/
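If nvcc is indeed the cause, one possible workaround is to keep `arm_neon.h` out of the files nvcc processes. The sketch below assumes the NEON path is only needed in sources built by the host compiler alone; it relies on nvcc defining `__CUDACC__` while it compiles CUDA source files. The macro `ELAS_USE_NEON` and the helper `scale4_f32` are hypothetical names, not part of libelas-gpu.

```cpp
#include <cassert>

// nvcc defines __CUDACC__ while compiling CUDA source files, so this guard
// hides arm_neon.h (and its GCC-specific builtins) from those files.
// Assumption: .cu files can live with the scalar path below.
#if defined(__ARM_NEON) && !defined(__CUDACC__)
  #include <arm_neon.h>
  #define ELAS_USE_NEON 1  // hypothetical macro name
#else
  #define ELAS_USE_NEON 0
#endif

// Hypothetical helper: multiplies four floats by a scalar.
inline void scale4_f32(const float* in, float k, float* out) {
    assert(in && out);
#if ELAS_USE_NEON
    // NEON path, only seen by the plain host compiler on ARM.
    vst1q_f32(out, vmulq_n_f32(vld1q_f32(in), k));
#else
    // Scalar path, used under nvcc and on non-ARM hosts.
    for (int i = 0; i < 4; ++i) out[i] = in[i] * k;
#endif
}
```

An alternative with the same effect would be to move all NEON code into `.cpp` files and expose only plain C++ declarations to the `.cu` files, so that no guard is needed in shared headers.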

Could you share your code with us? A simplified version that includes the SSE instructions would also be fine.

https://github.com/goldbattle/libelas-gpu
This is the code I am running. I have already converted SSE to NEON with https://github.com/otim/SSE-to-NEON
Thank you very much for your quick reply! This problem has held me up for a few days.

I have already tried OpenCV 3.1.0 with NEON turned off, but it doesn’t work. At first I used OpenCV 3.3.0.

Could you share the steps to reproduce the error?

I just compile https://github.com/goldbattle/libelas-gpu with cmake and make, and it produces the error.

I hit the errors below, rather than yours, when compiling your code.

nvidia@tegra-ubuntu:~/libelas-gpu/build$ make -j5
[  4%] Building NVCC (Device) object CMakeFiles/cuda_compile.dir/GPU/cuda_compile_generated_elas_gpu.cu.o
[  9%] Building NVCC (Device) object CMakeFiles/cuda_compile.dir/cuda_compile_generated_main_gpu.cu.o
[ 13%] Building CXX object CMakeFiles/libelas_test.dir/main_test.cpp.o
c++: error: unrecognized command line option ‘-msse’
c++: error: unrecognized command line option ‘-msse2’
c++: error: unrecognized command line option ‘-msse3’
CMakeFiles/libelas_test.dir/build.make:62: recipe for target 'CMakeFiles/libelas_test.dir/main_test.cpp.o' failed
make[2]: *** [CMakeFiles/libelas_test.dir/main_test.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 18%] Building CXX object CMakeFiles/libelas_cpu.dir/main_cpu.cpp.o
[ 22%] Building CXX object CMakeFiles/libelas_test.dir/CPU/descriptor.cpp.o
c++: error: unrecognized command line option ‘-msse’
c++: error: unrecognized command line option ‘-msse2’
c++: error: unrecognized command line option ‘-msse’
c++: error: unrecognized command line option ‘-msse3’
c++: error: unrecognized command line option ‘-msse2’
c++: error: unrecognized command line option ‘-msse3’
CMakeFiles/libelas_test.dir/build.make:86: recipe for target 'CMakeFiles/libelas_test.dir/CPU/descriptor.cpp.o' failed
make[2]: *** [CMakeFiles/libelas_test.dir/CPU/descriptor.cpp.o] Error 1
CMakeFiles/libelas_cpu.dir/build.make:62: recipe for target 'CMakeFiles/libelas_cpu.dir/main_cpu.cpp.o' failed
CMakeFiles/Makefile2:141: recipe for target 'CMakeFiles/libelas_test.dir/all' failed
make[2]: *** [CMakeFiles/libelas_cpu.dir/main_cpu.cpp.o] Error 1
make[1]: *** [CMakeFiles/libelas_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[2]: *** Waiting for unfinished jobs....
[ 27%] Building CXX object CMakeFiles/libelas_cpu.dir/CPU/descriptor.cpp.o
c++: error: unrecognized command line option ‘-msse’
c++: error: unrecognized command line option ‘-msse2’
c++: error: unrecognized command line option ‘-msse3’
CMakeFiles/libelas_cpu.dir/build.make:86: recipe for target 'CMakeFiles/libelas_cpu.dir/CPU/descriptor.cpp.o' failed
make[2]: *** [CMakeFiles/libelas_cpu.dir/CPU/descriptor.cpp.o] Error 1
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/libelas_cpu.dir/all' failed
make[1]: *** [CMakeFiles/libelas_cpu.dir/all] Error 2
nvcc fatal   : Value 'sm_20' is not defined for option 'gpu-architecture'
CMake Error at cuda_compile_generated_main_gpu.cu.o.cmake:207 (message):
  Error generating
  /home/nvidia/libelas-gpu/build/CMakeFiles/cuda_compile.dir//./cuda_compile_generated_main_gpu.cu.o


CMakeFiles/libelas_gpu.dir/build.make:63: recipe for target 'CMakeFiles/cuda_compile.dir/cuda_compile_generated_main_gpu.cu.o' failed
make[2]: *** [CMakeFiles/cuda_compile.dir/cuda_compile_generated_main_gpu.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
nvcc fatal   : Value 'sm_20' is not defined for option 'gpu-architecture'
CMake Error at cuda_compile_generated_elas_gpu.cu.o.cmake:207 (message):
  Error generating
  /home/nvidia/libelas-gpu/build/CMakeFiles/cuda_compile.dir/GPU/./cuda_compile_generated_elas_gpu.cu.o


CMakeFiles/libelas_gpu.dir/build.make:70: recipe for target 'CMakeFiles/cuda_compile.dir/GPU/cuda_compile_generated_elas_gpu.cu.o' failed
make[2]: *** [CMakeFiles/cuda_compile.dir/GPU/cuda_compile_generated_elas_gpu.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/libelas_gpu.dir/all' failed
make[1]: *** [CMakeFiles/libelas_gpu.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Also, why do you use sm_20…?
Which cuda version are you using?

Any feedback? Thanks.

I’m sorry for the delay. I use CUDA 9, and I just use sm_62.

nvidia@tegra-ubuntu:/home/data/libelas-gpu$ grep -rn "sm_20"*
build/CMakeFiles/libelas_gpu.dir/libelas_gpu_generated_main_gpu.cu.o.depend:207: "/usr/local/cuda-9.0/include/sm_20_atomic_functions.h"
build/CMakeFiles/libelas_gpu.dir/libelas_gpu_generated_main_gpu.cu.o.depend:208: "/usr/local/cuda-9.0/include/sm_20_atomic_functions.hpp"
build/CMakeFiles/libelas_gpu.dir/libelas_gpu_generated_main_gpu.cu.o.depend:209: "/usr/local/cuda-9.0/include/sm_20_intrinsics.h"
build/CMakeFiles/libelas_gpu.dir/libelas_gpu_generated_main_gpu.cu.o.depend:210: "/usr/local/cuda-9.0/include/sm_20_intrinsics.hpp"
build/CMakeFiles/libelas_gpu.dir/GPU/libelas_gpu_generated_elas_gpu.cu.o.depend:201: "/usr/local/cuda-9.0/include/sm_20_atomic_functions.h"
build/CMakeFiles/libelas_gpu.dir/GPU/libelas_gpu_generated_elas_gpu.cu.o.depend:202: "/usr/local/cuda-9.0/include/sm_20_atomic_functions.hpp"
build/CMakeFiles/libelas_gpu.dir/GPU/libelas_gpu_generated_elas_gpu.cu.o.depend:203: "/usr/local/cuda-9.0/include/sm_20_intrinsics.h"
build/CMakeFiles/libelas_gpu.dir/GPU/libelas_gpu_generated_elas_gpu.cu.o.depend:204: "/usr/local/cuda-9.0/include/sm_20_intrinsics.hpp"

The TX2 doesn’t support SSE instructions, so I converted SSE to NEON. That is why we get different errors.

You can try this. I have just committed the code I edited: https://github.com/meixialin/libelas-gpu

Yes, that is why I asked you to provide the “full steps” to reproduce this issue. Could you share them?

Thanks for sharing new code. I’ll try it.

In the meantime, could you help us by providing a simplified version of the code that reproduces the issue?

As you know, we may not be able to help debug third-party code. It would be easier if you could write a very simple example with NEON enabled and confirm that it hits a similar issue.

I’m sorry, I don’t have a simplified version of the code, and I can’t commit sse_to_neon.hpp to GitHub.
You can find it here: https://github.com/otim/SSE-to-NEON
Thank you for your help.

Do you mean that even the latest code on your GitHub has not gone through the SSE-to-NEON conversion?

Could you at least provide a version of the code with which we can reproduce the same issue as yours?

We cannot help you if you don’t provide that.