Hi, is it correct that Xavier's Carmel cores lack special functional units for math functions such as exp/log/sin/cos? Could you suggest the most efficient way (with NEON?) to compute trig/exp/log and other math functions?

What is your use case?

For C/C++, many math functions are available in glibc. In Python, look at NumPy's math functions, or at CuPy, which speeds up computation by using the GPU.

My use case is achieving maximum math throughput with Arm NEON float32x4_t, without using the GPU. Other means would also be fine, such as internal HW functional units, though I'm not sure whether Carmel has them.

I'm not sure about your case, but you may get some speedup using llvm-11 or llvm-12, which support the Carmel CPU.

You may get more with OpenCL, for example by building pocl and using pthread as the pocl device.
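If you would rather stay with plain NEON intrinsics: as far as I know the Armv8 Advanced SIMD instruction set has no exp/log/sin/cos instructions (only estimate instructions such as FRECPE/FRSQRTE, plus divide and square root), so these functions end up being computed in software. Below is a minimal sketch of a 4-wide expf using range reduction and a degree-6 Taylor polynomial. The name exp4_sketch, the EXP4_* macros, and the choice of polynomial are my own assumptions for illustration, not a vendor routine. It clamps the input to [-87, 88] and does not handle NaN. A scalar fallback with the same algorithm is included so the file also builds on non-Arm hosts.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define EXP4_USE_NEON 1
#else
#define EXP4_USE_NEON 0
#endif

/* Hypothetical constants for this sketch. ln(2) is split into a high and a
 * low part (Cody-Waite style) so that x - n*ln(2) loses little precision. */
#define EXP4_LOG2E   1.44269504f
#define EXP4_LN2_HI  0.693359375f
#define EXP4_LN2_LO  (-2.12194440e-4f)
/* Taylor coefficients 1/k! for k = 2..6; accurate enough for |r| <= ln(2)/2. */
#define EXP4_C2 5.0000000e-1f
#define EXP4_C3 1.6666667e-1f
#define EXP4_C4 4.1666667e-2f
#define EXP4_C5 8.3333333e-3f
#define EXP4_C6 1.3888889e-3f

/* Computes out[i] ~= exp(in[i]) for i = 0..3.
 * Idea: exp(x) = 2^n * exp(r), with n = round(x / ln 2) and r = x - n * ln 2. */
void exp4_sketch(const float in[4], float out[4])
{
#if EXP4_USE_NEON
    float32x4_t x = vld1q_f32(in);
    /* Clamp so that 2^n stays a normal float32. */
    x = vminq_f32(vmaxq_f32(x, vdupq_n_f32(-87.0f)), vdupq_n_f32(88.0f));

    /* n = round-to-nearest(x * log2(e)), kept as float for the FMAs below. */
    float32x4_t n = vrndnq_f32(vmulq_n_f32(x, EXP4_LOG2E));

    /* r = x - n*ln2_hi - n*ln2_lo  (vfmsq_f32(a, b, c) computes a - b*c). */
    float32x4_t r = vfmsq_f32(x, n, vdupq_n_f32(EXP4_LN2_HI));
    r = vfmsq_f32(r, n, vdupq_n_f32(EXP4_LN2_LO));

    /* Horner evaluation of the polynomial (vfmaq_f32(a, b, c) = a + b*c). */
    float32x4_t p = vdupq_n_f32(EXP4_C6);
    p = vfmaq_f32(vdupq_n_f32(EXP4_C5), p, r);
    p = vfmaq_f32(vdupq_n_f32(EXP4_C4), p, r);
    p = vfmaq_f32(vdupq_n_f32(EXP4_C3), p, r);
    p = vfmaq_f32(vdupq_n_f32(EXP4_C2), p, r);
    p = vfmaq_f32(vdupq_n_f32(1.0f), p, r);
    p = vfmaq_f32(vdupq_n_f32(1.0f), p, r);

    /* Multiply by 2^n by adding n to the exponent field of p. */
    int32x4_t e = vshlq_n_s32(vcvtq_s32_f32(n), 23);
    p = vreinterpretq_f32_s32(vaddq_s32(vreinterpretq_s32_f32(p), e));
    vst1q_f32(out, p);
#else
    /* Scalar fallback: same algorithm, one lane at a time. */
    for (int i = 0; i < 4; ++i) {
        float x = in[i];
        if (x < -87.0f) x = -87.0f;
        if (x > 88.0f)  x = 88.0f;

        float t = x * EXP4_LOG2E;
        int32_t ni = (int32_t)(t + (t >= 0.0f ? 0.5f : -0.5f));
        float n = (float)ni;

        float r = x - n * EXP4_LN2_HI;
        r = r - n * EXP4_LN2_LO;

        float p = EXP4_C6;
        p = p * r + EXP4_C5;
        p = p * r + EXP4_C4;
        p = p * r + EXP4_C3;
        p = p * r + EXP4_C2;
        p = p * r + 1.0f;
        p = p * r + 1.0f;

        /* Add n to the exponent field (multiplication avoids shifting a
         * negative value, which is undefined in C). */
        union { float f; int32_t i; } u;
        u.f = p;
        u.i += ni * (1 << 23);
        out[i] = u.f;
    }
#endif
}
```

To use it, pass two arrays of four floats, e.g. exp4_sketch(in, out), and build with optimization enabled; with llvm-11 or later something like clang -O3 -mcpu=carmel should select the Carmel target. The same pattern (range reduction, polynomial, reconstruction) is how log/sin/cos are usually vectorized too, with different reductions and coefficients. If you need tighter error bounds, a minimax polynomial would be a better choice than the Taylor coefficients used here.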

Thank you, this is interesting information. Do you happen to know whether the CPU has any HW functional units that can compute math functions, even with long instruction latency?
