Originally published at: Efficiently Scaling Polars GPU Parquet Reader | NVIDIA Technical Blog
When working with large datasets, the performance of your data processing tools becomes critical. Polars, an open-source library for data manipulation known for its speed and efficiency, offers a GPU-accelerated backend powered by cuDF that can significantly boost performance. However, to fully leverage the power of the Polars GPU backend, it’s essential to optimize the…
Hi,
Thank you for the blog post; it was very interesting! I have a technical question about the y-axis in Figures 2 and 3: how is the throughput calculated? In Figure 3, the throughput suddenly jumps to a maximum of ~100 GB/s, which surprises me.
The blog doesn’t mention the GPU type, and without that context it’s unclear how a single GPU could reach 100 GB/s, since that exceeds typical single-GPU PCIe bandwidth limits.
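To make the question concrete, here is a minimal sketch of what I mean. All sizes and timings are hypothetical and not taken from the blog, and the PCIe values are only approximate theoretical x16 peaks per direction. It shows how the same read could be reported as 25 GB/s or 100 GB/s depending on whether throughput divides the on-disk (compressed) size or the decoded in-memory size by the elapsed time. I don’t know which definition the figures use.

```python
# Approximate theoretical peak bandwidth of a x16 link, one direction, in GB/s.
PCIE_GEN4_X16_GBPS = 32.0
PCIE_GEN5_X16_GBPS = 64.0


def throughput_gbps(num_bytes: float, seconds: float) -> float:
    """Return throughput in GB/s, using 1 GB = 1e9 bytes."""
    return num_bytes / seconds / 1e9


# Hypothetical numbers, chosen only to illustrate the question.
file_bytes = 12.5e9    # assumed compressed Parquet size on disk
decoded_bytes = 50e9   # assumed decompressed size of the resulting table
elapsed_s = 0.5        # assumed wall-clock time of the read

on_disk = throughput_gbps(file_bytes, elapsed_s)       # based on file size
in_memory = throughput_gbps(decoded_bytes, elapsed_s)  # based on decoded size

print(f"file-size based:    {on_disk:.1f} GB/s")
print(f"decoded-size based: {in_memory:.1f} GB/s")
print("decoded-size figure exceeds PCIe Gen5 x16:", in_memory > PCIE_GEN5_X16_GBPS)
```

With these made-up inputs, only the decoded-size figure would exceed the PCIe limit, so knowing which byte count sits in the numerator would already answer most of my question.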
Could you clarify these details? Thank you for your time!
Ping again.