RAPIDS Accelerates Data Science End-to-End

Originally published at: https://developer.nvidia.com/blog/gpu-accelerated-analytics-rapids/

Today’s data science problems demand a dramatic increase in the scale of data as well as the computational power required to process it. Unfortunately, the end of Moore’s law means that handling large data sizes in today’s data science ecosystem requires scaling out to many CPU nodes, which brings its own problems of communication bottlenecks, energy, and…

Hi Drew,
You're probably getting the unknown runtime error because the NVIDIA Container Runtime isn't installed on your system.
The nvidia runtime exposes GPUs to applications running in the container.

Follow these steps to install the container runtime, and re-run the docker run command.
https://docs.nvidia.com/ngc...
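
Before reinstalling, it can help to confirm whether Docker already knows about the nvidia runtime. The following is a minimal sketch, not an official NVIDIA tool. It assumes that `docker info --format '{{json .Runtimes}}'` prints the registered runtimes as a JSON object keyed by runtime name, and the helper names are hypothetical:

```python
import json
import subprocess


def has_nvidia_runtime(runtimes_json: str) -> bool:
    """Return True if the JSON object of Docker runtimes has an 'nvidia' key."""
    # Assumption: the JSON is an object keyed by runtime name, e.g. {"runc": {...}}.
    runtimes = json.loads(runtimes_json) or {}
    return "nvidia" in runtimes


def docker_runtimes_json() -> str:
    """Ask the local Docker daemon for its registered runtimes (requires Docker)."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Example use on a machine with Docker installed:
#   print(has_nvidia_runtime(docker_runtimes_json()))
```

If the check comes back False, the installation steps linked above are the next thing to try.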

Thank you!

Will you have a tutorial on developer.nvidia on how to develop with RAPIDS/Arrow?

Hey Joe,
There's a wealth of documentation and notebooks on the RAPIDS GitHub page here - https://github.com/rapidsai. Also, we continue to add new stories to the RAPIDS blog here - https://medium.com/rapids-ai. Feel free to submit an issue on GitHub if you have specific questions.

For the 5x DGX-1 in your chart: are these P100 GPUs? Or are you saying that 40 V100s are slower than a single 16-V100 DGX-2?

Hi @indianstallion, the x-axis on the graph is time, not throughput, so as the axis title says, shorter means faster. That means 40 V100s are faster than 16 V100s. But the faster NVLink in DGX-2 means that a single DGX-2 is within reach of the performance of 5x DGX-1...

I see. When you say faster NVLink, do you mean the DGX-1 is using NVLink 1.0 instead of NVLink 2.0, but is still using V100s?

Is there a detailed doc or paper on this where I can read more?

Your GPU modeling diagram lists model optimization after looking at the test data. As page 222 of The Elements of Statistical Learning states: "Ideally, the test set should be kept in a 'vault,' and be brought out only at the end of the data analysis."
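
One way to follow that advice is to carve out the test set first and tune only against a separate validation set. The following is a minimal sketch in plain Python, not the workflow from the blog post. The function name and the default fractions are assumptions chosen for illustration:

```python
import random


def three_way_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows once and split them into (train, validation, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed so the split is reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]                 # the "vault": untouched until the very end
    val = rows[n_test:n_test + n_val]    # used for model optimization
    train = rows[n_test + n_val:]        # used for fitting
    return train, val, test


train, val, test = three_way_split(range(100))
# Tune hyperparameters by scoring on `val`; score on `test` once, after tuning.
```

Model optimization then loops over train and validation only, and the test score is reported once at the end.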

Hi, where can I find the plans for CUDA 10.1 support in RAPIDS? The official site just states that it's not yet supported, and I can't find any open issues on GitHub that show the status of this task.
Thanks for the help!

We plan to support CUDA 10.1 in RAPIDS 0.10 (next release).

Thank you Mark!
Is there an expected release date for the 0.10 release?

The website lists a compute capability of 6.0+ as a prerequisite. Is there any chance of running RAPIDS with a lower compute capability, and/or will lower capabilities likely be supported in future releases?
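
For anyone unsure whether their card meets the listed prerequisite, the comparison itself is simple. The following is a minimal sketch with hypothetical helper names. It only checks a version against the 6.0 minimum quoted above and says nothing about future support:

```python
MIN_COMPUTE_CAPABILITY = (6, 0)  # the prerequisite quoted in the question above


def parse_compute_capability(text):
    """Turn a string such as '6.1' into a (major, minor) tuple of ints."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def meets_minimum(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare (major, minor) tuples; tuples compare element by element."""
    return tuple(capability) >= tuple(minimum)


# Assumption: with Numba installed and a GPU present, the tuple can be read via
#   from numba import cuda
#   cuda.get_current_device().compute_capability
print(meets_minimum(parse_compute_capability("5.2")))
```

A capability of 5.2 falls below the 6.0 minimum, so the check returns False for that card.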