Hi, is it correct that Xavier's Carmel cores lack special function units for math functions such as exp, log, sin, and cos? Could you suggest the most efficient way (with NEON?) to compute trig, exp, log, and other math functions?
My use case is achieving maximum math throughput with Arm NEON float32x4_t, without using the GPU, or by other means such as internal HW functional units, though I'm not sure whether Carmel has any.
I'm not sure about your case, but you may get some speedup from llvm-11 or llvm-12, which support the Carmel CPU.
You may get more with OpenCL, for example by building pocl and using pthread as the POCL device.
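For reference, here is a minimal sketch of the software route, in case it helps: exp(x) computed with range reduction plus a short polynomial, written as a plain C loop so the compiler has a chance to auto-vectorize it to NEON. This is not taken from any NVIDIA or Arm library. The function names `exp_approx` and `exp_approx_array` are made up for illustration, the polynomial is a plain degree-5 Taylor series (a tuned minimax polynomial would do better), and the accuracy is only roughly 1e-5 relative. Inputs are clamped to about [-87, 88], and NaN is not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: approximate exp(x) for one float.
 * Method: x = n*ln2 + r with |r| <= ln2/2, so exp(x) = 2^n * exp(r). */
static inline float exp_approx(float x)
{
    /* Clamp so that 2^n below stays a normal (finite, non-denormal) float. */
    if (x > 88.0f)  x = 88.0f;
    if (x < -87.0f) x = -87.0f;

    /* n = round(x / ln2), rounding half away from zero. */
    float t = x * 1.44269504f;                      /* 1/ln2 */
    int32_t n = (int32_t)(t + (t >= 0.0f ? 0.5f : -0.5f));
    float fn = (float)n;

    /* r = x - n*ln2, with ln2 split into high and low parts
     * to reduce cancellation error. */
    float r = x - fn * 0.693145752f;                /* high part of ln2 */
    r = r - fn * 1.42860677e-6f;                    /* low part of ln2  */

    /* exp(r) ~ 1 + r + r^2/2 + r^3/6 + r^4/24 + r^5/120 (Horner form). */
    float p = 1.0f / 120.0f;
    p = p * r + 1.0f / 24.0f;
    p = p * r + 1.0f / 6.0f;
    p = p * r + 0.5f;
    p = p * r + 1.0f;
    p = p * r + 1.0f;

    /* Build 2^n directly from the IEEE-754 exponent bits. */
    uint32_t bits = (uint32_t)(n + 127) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);

    return p * scale;
}

/* Hypothetical helper: apply exp_approx to n_elems floats.
 * The loop body has no loop-carried dependencies, so a vectorizing
 * compiler can map it onto 4-wide float32 SIMD lanes. */
void exp_approx_array(const float *in, float *out, size_t n_elems)
{
    for (size_t i = 0; i < n_elems; ++i) {
        out[i] = exp_approx(in[i]);
    }
}
```

Calling `exp_approx_array(in, out, 4)` on `{0.0f, 1.0f, -1.0f, 2.0f}` should give values close to 1, e, 1/e, and e^2. Whether the loop actually vectorizes depends on the compiler and flags (I would try `-O3` with the Carmel target enabled in the LLVM versions mentioned above and check the generated assembly). The same steps map fairly directly onto hand-written float32x4_t intrinsics if the auto-vectorized result is not good enough.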
Thank you, this is interesting information. Do you happen to know whether the CPU has any HW functional units that can compute math functions, even with long instruction latency?
I can't speak to this particular feature, but you should find details about the Carmel CPU in the TRM.