Building Apache Arrow with CUDA on Jetsons

Has anyone successfully built Apache Arrow (C++ and PyArrow) for the Jetson / L4T with CUDA enabled? There have been attempts - see https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521 - and Conda Forge has CPU-only packages that will run on the Jetson via miniforge, but that’s the best I’ve been able to come up with.

Hi znmeb,

We have never tried that before and are not sure how to do it. Perhaps other developers can share their experience if they have done something similar.

Hi @znmeb, I managed to compile version 0.17.1 after 7 hours of battling with it.

wget https://github.com/apache/arrow/archive/apache-arrow-0.17.1.zip

I imagine you need this for TF 2.3 + object_detection; at least that’s why we need it.
I followed more or less the instructions in the link you published. For C++ it worked fine, but Python was the real pain: there seems to be a bug in the architecture detection in the CMake files.
When running the Python installation, the key part is to force some CMake variables so the build passes.
Look at the example:

PYARROW_CMAKE_OPTIONS="-DARROW_ARMV8_ARCH=armv8 -DCXX_SUPPORTS_ARMV8_ARCH=true" python3 setup.py build_ext --inplace
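For context, a sketch of the full environment I would expect around that build step. ARROW_HOME and PYARROW_WITH_CUDA are assumptions, not from the post above: point ARROW_HOME at your arrow-cpp install prefix, and set PYARROW_WITH_CUDA only if arrow-cpp was configured with CUDA support.

```shell
# Assumed environment for the pyarrow build step (ARROW_HOME and
# PYARROW_WITH_CUDA are guesses; adjust to your arrow-cpp install).
export ARROW_HOME=/usr/local
export PYARROW_WITH_CUDA=1
# The flags from the post that force past the broken architecture detection:
export PYARROW_CMAKE_OPTIONS="-DARROW_ARMV8_ARCH=armv8 -DCXX_SUPPORTS_ARMV8_ARCH=true"
echo "$PYARROW_CMAKE_OPTIONS"
# Then, from the python/ directory of the Arrow source tree:
# python3 setup.py build_ext --inplace
```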

It would be nice if NVIDIA could make sure that binaries are available for all the dependencies needed to run TF stuff properly (especially now that they have bought ARM, it is in their best interest). I know there is a near-infinite number of libraries, but just the ones needed to run the basic TF packages would be a great gain. The same happens with the latest version of OpenCV: when installing from pip you need to build from source, and it takes an eternity.

Let me know if you find any issues there.

I have a semi-kosher way to do this working, but I want to stress-test it a bunch before I call it solved. The strategy is:

  1. Install Miniforge for aarch64 (https://github.com/conda-forge/miniforge). That gets you conda and all of the binaries in the conda-forge channel that run on aarch64. The CPU versions of Arrow C++ and pyarrow are already in conda-forge, so if you don’t care about CUDA you’re done.
  2. Clone the conda-forge feedstock for Apache Arrow. Hack the build scripts so they see the CUDA libraries on the Jetson, then do a “conda build”. I had that running but I’m not sure what the numpy version has to be for things to work, and I don’t have any CUDA-aware Arrow tests to run to make sure it’s working.
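On the missing CUDA-aware test from step 2: a minimal smoke check can be sketched like this. It only confirms that pyarrow’s CUDA bindings import and that a context can be created on device 0; `pyarrow.cuda` is only present when arrow-cpp was built with CUDA enabled, and device 0 is an assumption.

```shell
# Smoke test: does this pyarrow have working CUDA bindings?
# Prints a diagnostic either way instead of crashing.
python3 - <<'EOF'
try:
    from pyarrow import cuda   # only importable on CUDA-enabled builds
    cuda.Context(0)            # needs a visible CUDA device (assumed device 0)
    print("CUDA-enabled pyarrow: OK")
except Exception as exc:
    print(f"no CUDA support: {exc}")
EOF
```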

IMHO NVIDIA should look at integrating with conda-forge. Miniforge has the CPU-only builds all automated and wired up to continuous integration, so it would simply be a matter of creating a Jetson channel and creating feedstocks for all the Python packages that use CUDA. I’m going to do Arrow as part of my personal project (https://github.com/edgyR) but I have no plans to contribute it upstream.


That sounds like a reasonable solution too; at least now we know it is possible to get it compiled, just a big pain :D
Sounds like an interesting project, good luck!!

The biggest pain point is the lack of widespread free CI/CD resources for building the packages in the cloud. You have to have a Jetson in your home or lab to build and test things. Sure, an AGX Xavier is only US$700, but for repeatability you need a cloud build service.

Update - I’ve abandoned this for the foreseeable future. There are just too many risky, undocumented, unsupported time sinks at present. arrow-cpp and pyarrow work just fine on the Jetson CPU.

Thanks, I just had the same problem, using object_detection on TF 2.3.1. With the flags you provided, compilation finally succeeded.

Agreed. I needed to use OpenCV 4.5.0 with TF 2.3.1 and object_detection and was forced to recompile everything from scratch: at first because no OpenCV 4.5.0 binary was provided, and then I realized OpenCV 4 needs numpy 1.19 while TensorFlow needs numpy < 1.19… OpenCV took a long time but was not so complicated to build; TF compilation was much harder (I needed to patch TF 2.3.1 for numpy 1.19 support; thankfully a patch is available on the TF GitHub).

TF took about 50 hours to build on my test-lab Jetson Nano, and I needed to add swap on USB storage for it to succeed… That aside, the Jetson Nano is really awesome; it would be really great if NVIDIA could provide pre-built packages for this!

After following https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521 with version 3.0.0,
I got this error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-5-982e55aa9aaf> in <module>
----> 1 import pyarrow

~/.local/lib/python3.6/site-packages/pyarrow/__init__.py in <module>
     61 _gc_enabled = _gc.isenabled()
     62 _gc.disable()
---> 63 import pyarrow.lib as _lib
     64 if _gc_enabled:
     65     _gc.enable()

ImportError: libarrow_python.so.300: cannot open shared object file: No such file or directory
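That error means the dynamic loader can’t find the Arrow C++ shared library at runtime. A sketch of a diagnosis and fix, assuming the libraries were installed under /usr/local (adjust the prefix to wherever your build actually installed them):

```shell
# Find where libarrow_python.so.300 actually landed (the searched prefixes
# are guesses; add your own install prefix if it differs).
find /usr/local "$HOME/.local" -name 'libarrow_python.so*' 2>/dev/null || true

# Put that directory on the loader path for the current shell:
export LD_LIBRARY_PATH="/usr/local/lib:${LD_LIBRARY_PATH:-}"

# To make it permanent system-wide instead:
#   echo /usr/local/lib | sudo tee /etc/ld.so.conf.d/arrow.conf && sudo ldconfig
```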

Can I get some help?

I have built Arrow successfully for C++, Python and R using Miniforge. The scripts are here:

  1. Miniforge installer: https://github.com/edgyR/edgyR-containers/blob/main/build-scripts/edgyr/miniforge.sh
  2. Arrow 3.0.0: https://github.com/edgyR/edgyR-containers/blob/main/build-scripts/edgyr/Installers/pyarrow-cuda-git.sh

This is running in a container based on the NGC ML image but it should run on bare metal.

Thank you for the reply.
I got stopped at ‘Arrow 3.0.0’, in the ‘Patching arrow-cpp’ step.
It prints:

diff: /home/USERNAME/Installers/etc/CMakeLists.txt-cuda-patch: No such file or directory
cp: cannot stat '/home/USERNAME/Installers/etc/CMakeLists.txt-cuda-patch': No such file or directory

I removed the /usr/bin/time part from the ‘Miniforge installer’ and ‘Arrow 3.0.0’ scripts because I don’t have that binary.
I’m on a Jetson AGX Xavier, bare metal, not Docker.

Oops … that file lives in the Installers/etc directory; try https://github.com/edgyR/edgyR-containers/blob/main/build-scripts/edgyr/Installers/etc/CMakeLists.txt-cuda-patch

It stops with:

Configuring arrow-cpp
~/arrow/cpp/build ~

I think the problem is the >> $EDGYR_LOGS/pyarrow-cuda-git.log 2>&1 part.
When I type that it says -bash: /pyarrow-cuda-git.log: Permission denied.
I wonder where $EDGYR_LOGS/pyarrow-cuda-git.log is supposed to be.

That variable is set in the Docker image; you can assign it to any directory where you have write permissions.
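For example (the path here is just an illustration, not anything the script mandates):

```shell
# Point $EDGYR_LOGS at a directory you own, so the script's
# ">> $EDGYR_LOGS/pyarrow-cuda-git.log 2>&1" redirection has somewhere to write.
export EDGYR_LOGS="$HOME/edgyr-logs"        # example path; any writable dir works
mkdir -p "$EDGYR_LOGS"
touch "$EDGYR_LOGS/pyarrow-cuda-git.log"    # confirms write permission up front
```

The Permission denied message you saw is what bash prints when $EDGYR_LOGS is unset, so the path collapses to /pyarrow-cuda-git.log.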

I should extract that into a stand-alone script.

By the way, the Arrow team is in a release cycle moving to 4.0.0. Once that drops I’ll update this script, and I’ll factor it out into a stand-alone script in the process.
