I’m new to the Jetson TX1 and to NEON SIMD instructions. Until now I’ve worked on x86 and used dlib (dlib.net) in some of my applications. When I ported it to the TX1, however, performance dropped sharply, because dlib’s SIMD code uses SSE/AVX instructions and there isn’t any NEON support at the moment.
Any advice on how I can begin adding NEON support to this library? I’m really not sure where to start. All the SIMD code is in the following directory: dlib/dlib/simd at master · davisking/dlib · GitHub
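To make the question concrete, here is a minimal sketch of the kind of wrapper I imagine would be needed: a 4-float vector type backed by NEON intrinsics from arm_neon.h, with a scalar fallback so the same file still compiles on x86. The names vec4f_sketch and multiply_add4 are hypothetical and are not part of dlib; the real classes in dlib/simd presumably expose a larger interface that a port would have to match.

```cpp
#include <cstddef>

// Detect NEON at compile time; GCC and Clang define __ARM_NEON on AArch64.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

// Hypothetical 4-float vector type (an assumption for illustration,
// not dlib's actual class).
class vec4f_sketch {
public:
#if SKETCH_HAVE_NEON
    vec4f_sketch() : v(vdupq_n_f32(0.0f)) {}
    void load(const float* p) { v = vld1q_f32(p); }   // load 4 floats
    void store(float* p) const { vst1q_f32(p, v); }   // store 4 floats

    friend vec4f_sketch operator+(const vec4f_sketch& a, const vec4f_sketch& b) {
        vec4f_sketch r;
        r.v = vaddq_f32(a.v, b.v);  // lane-wise add
        return r;
    }
    friend vec4f_sketch operator*(const vec4f_sketch& a, const vec4f_sketch& b) {
        vec4f_sketch r;
        r.v = vmulq_f32(a.v, b.v);  // lane-wise multiply
        return r;
    }

private:
    float32x4_t v;
#else
    // Scalar fallback with the same interface, used when NEON is unavailable.
    vec4f_sketch() { for (std::size_t i = 0; i < 4; ++i) v[i] = 0.0f; }
    void load(const float* p) { for (std::size_t i = 0; i < 4; ++i) v[i] = p[i]; }
    void store(float* p) const { for (std::size_t i = 0; i < 4; ++i) p[i] = v[i]; }

    friend vec4f_sketch operator+(const vec4f_sketch& a, const vec4f_sketch& b) {
        vec4f_sketch r;
        for (std::size_t i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
        return r;
    }
    friend vec4f_sketch operator*(const vec4f_sketch& a, const vec4f_sketch& b) {
        vec4f_sketch r;
        for (std::size_t i = 0; i < 4; ++i) r.v[i] = a.v[i] * b.v[i];
        return r;
    }

private:
    float v[4];
#endif
};

// Hypothetical helper: out[i] = a[i] * b[i] + c[i] for i in 0..3.
inline void multiply_add4(const float* a, const float* b, const float* c, float* out) {
    vec4f_sketch va, vb, vc;
    va.load(a);
    vb.load(b);
    vc.load(c);
    (va * vb + vc).store(out);
}
```

With a = {1, 2, 3, 4}, b = {10, 20, 30, 40} and c = {0.5, 0.5, 0.5, 0.5}, multiply_add4 should write {10.5, 40.5, 90.5, 160.5} on either path. Is wrapping the intrinsics behind the existing class interface like this the right approach, or is there a better starting point?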
Thanks and I appreciate the help!