New GPU Library Lowers Compute Costs for Apache Spark ML

Originally published at: https://developer.nvidia.com/blog/new-gpu-library-lowers-compute-costs-for-apache-spark-ml/

Spark MLlib is a key component of Apache Spark for large-scale machine learning and provides built-in implementations of many popular machine learning algorithms. These implementations were created a decade ago and do not leverage modern computing accelerators such as NVIDIA GPUs. To address this gap, we recently open-sourced Spark RAPIDS ML (NVIDIA/spark-rapids-ml), a Python…