Unlocking Multi-GPU Model Training with Dask XGBoost

Originally published at: https://developer.nvidia.com/blog/unlocking-multi-gpu-model-training-with-dask-xgboost/

As data scientists, we often face the challenging task of training large models on huge datasets. One commonly used tool, XGBoost, is a robust and efficient gradient-boosting framework that’s been widely adopted due to its speed and performance for large tabular data. Using multiple GPUs should theoretically provide a significant boost in computational power, resulting…

Hi Jiwei Liu,

Thanks for all your effort summarising your findings in this article. I have a quick question: what if someone wants to perform Bayesian Optimisation on top of multi-GPU XGBoost using the Dask framework? Should we persist the data so that it can be reused across iterations of hyper-parameter tuning (see the sketch below), or is there a better practice?
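
To make the question concrete, here is a rough sketch of what I have in mind, assuming xgboost.dask on a LocalCUDACluster from dask-cuda. The Parquet path, the "target" column name, and the fixed training settings are placeholders, not anything from the article:

```python
from dask.distributed import Client
from dask_cuda import LocalCUDACluster
import dask_cudf
import xgboost as xgb

# One Dask worker per GPU; every tuning trial reuses this cluster.
cluster = LocalCUDACluster()
client = Client(cluster)

# Load once and persist, so each trial reuses the GPU-resident
# partitions instead of re-reading and re-distributing the data.
df = dask_cudf.read_parquet("train.parquet")  # placeholder path
X = df.drop(columns=["target"]).persist()     # "target" is a placeholder label column
y = df["target"].persist()
dtrain = xgb.dask.DaskDMatrix(client, X, y)

def objective(params):
    # One Bayesian-optimisation trial: train on the persisted data
    # and hand the final training metric back to the optimiser.
    output = xgb.dask.train(
        client,
        {"tree_method": "gpu_hist", "eval_metric": "rmse", **params},
        dtrain,
        num_boost_round=100,
        evals=[(dtrain, "train")],
    )
    return output["history"]["train"]["rmse"][-1]
```

My worry is that without the persist() calls, each trial would pay the data-loading and shuffling cost again, which could dominate the tuning time.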

Best Regards,
Saptarshi