Optimizing and Improving Spark 3.0 Performance with GPUs

jwitsoe · September 1, 2020, 8:15pm

Originally published at: https://developer.nvidia.com/blog/optimizing-and-improving-spark-3-0-performance-with-gpus/

Apache Spark continued the effort to analyze big data that Apache Hadoop started over 15 years ago and has become the leading framework for large-scale distributed data processing. Today, hundreds of thousands of data engineers and scientists are working with Spark across 16,000+ enterprises and organizations. One reason why Spark has taken the torch from…

carolm · September 4, 2020, 3:17pm

If you have any questions or comments about Spark 3.0 performance with GPUs, let us know.

maqeel · August 22, 2022, 9:39am

Hi everyone, @carolm I need help implementing new functionality in RAPIDS UDFs with Scala. I want to implement operations that are not available in the core implementation. Can you guide me?

krajendran · August 23, 2022, 3:56pm

Hi, can you please file a feature request for the UDF functionality on the NVIDIA/spark-rapids GitHub Issues page?

Please add the data types and format used in the UDF.

Thanks,
Karthik