I am trying to train an XGBoost model on my WSL2 Ubuntu 20.04 setup. However, I get an error on my local machine when I set 'tree_method': 'gpu_hist' in params:
"Exception in gpu hist: NCCL failure : unhandled system error"
I can fit a model using 'tree_method': 'hist', but that means I'm fitting on the CPU rather than the GPU, which defeats the purpose of using RAPIDS + CUDA! It was also very slow.
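A minimal sketch of the two configurations described above, assuming the standard xgboost Python package (xgb.DMatrix, xgb.train). The helper names build_params and train_with_cpu_fallback are hypothetical and not taken from the original notebook, and it is an assumption that the NCCL failure surfaces in Python as xgboost.core.XGBoostError:

```python
def build_params(use_gpu):
    """Return the params dict for the GPU or CPU histogram tree method."""
    # Only tree_method comes from the post; every other parameter is left
    # at XGBoost's defaults.
    return {"tree_method": "gpu_hist" if use_gpu else "hist"}


def train_with_cpu_fallback(X, y, num_boost_round=10):
    """Try 'gpu_hist' first and fall back to 'hist' if XGBoost raises.

    Returns the trained booster and the tree method that actually worked.
    """
    # Imported lazily so build_params can be used without xgboost installed.
    import xgboost as xgb

    dtrain = xgb.DMatrix(X, label=y)
    try:
        booster = xgb.train(build_params(True), dtrain,
                            num_boost_round=num_boost_round)
        return booster, "gpu_hist"
    except xgb.core.XGBoostError as err:
        # Assumption: this is where the "NCCL failure" message would appear.
        print(f"gpu_hist failed: {err}")
        booster = xgb.train(build_params(False), dtrain,
                            num_boost_round=num_boost_round)
        return booster, "hist"
```

Calling train_with_cpu_fallback(X, y) on a small NumPy feature matrix and label vector would show which tree method succeeds on a given machine. The fallback only reproduces the slow CPU path described above; it does not fix the GPU error.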
Note that, as part of your DLI course FUNDAMENTALS OF ACCELERATED DATA SCIENCE WITH RAPIDS, I was able to train an XGBoost model on the cloud-based GPU cluster (i.e. not my local notebook) with 'tree_method': 'gpu_hist' in params.