Maybe it is time for someone to write a book about CUDA in mathematics and physics.
I have been teaching physics and math for the last 15+ years in CS and/or engineering.
At the start, I found learning CUDA confusing. A lot of material has accumulated
over the years, and Nvidia has changed its mind (in a good way) along the way,
from pure C CUDA to Python CUDA. And, as is well known, Python already has a very
big community. Moreover, at the start, WSL was confusing too. I had to register in the
Windows Insider program and deal with the old driver problems, and then I found that
learning CUDA is very chaotic: different approaches, each one claiming to be the best.
Doing it in pure C is a step backward: no garbage collection, no dynamic arrays, and
all the old problems that were “solved” by Python.
Now, I understand Nvidia is investing a lot of money in AI. Not bad: tensors and linear algebra
are extremely useful everywhere.
I found nvc++ is a big winner for WSL. To be honest, I would like to see Cython too.
How can I mix nvc++ and Cython together in an effective way? How can I pass nvc++ decorators
to accelerate Cython?
Numba is doing well at making CUDA kernels in Python, which is very good. Writing big CUDA
kernels in C is not effective, and debugging is problematic.
It was impressive to see how easy it is to work with big arrays (bigger than RAM)
with Dask. I am expecting more from cuDF. It would be nice to have a heterogeneous array
defined over the CPU and the GPU, accelerated separately on the CPU and the GPU, and then
fused together using sum/reduce.
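
To make the split / reduce / fuse idea concrete, here is a minimal sketch in plain Python. It is not Dask or cuDF code: the names `split`, `cpu_partial_sum`, `gpu_partial_sum` and `hetero_sum` are hypothetical, and the "GPU" worker is only a stand-in that runs on the CPU so the example runs anywhere. On a real machine one would presumably swap that worker for a GPU reduction.

```python
from concurrent.futures import ThreadPoolExecutor
from math import fsum


def split(data, n_chunks):
    """Cut a sequence into n_chunks nearly equal contiguous pieces."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        stop = start + size + (1 if i < extra else 0)
        chunks.append(data[start:stop])
        start = stop
    return chunks


def cpu_partial_sum(chunk):
    # Hypothetical CPU worker: reduces only its own chunk.
    return fsum(chunk)


def gpu_partial_sum(chunk):
    # Hypothetical stand-in for a GPU worker (assumption: a real version
    # would run a GPU reduction here). It uses the CPU in this sketch.
    return fsum(chunk)


def hetero_sum(data, n_chunks=4):
    """Reduce each chunk separately, then fuse the partial results."""
    chunks = split(data, n_chunks)
    # Alternate chunks between the two kinds of workers.
    workers = [cpu_partial_sum if i % 2 == 0 else gpu_partial_sum
               for i in range(len(chunks))]
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(w, c) for w, c in zip(workers, chunks)]
        partials = [f.result() for f in futures]
    # Final fuse step: a sum over the per-chunk partial sums.
    return fsum(partials)


data = [float(i) for i in range(1000)]
total = hetero_sum(data)
print(total)
```

The point of the design is that only the small partial results travel between devices, never the full array, which is why a sum/reduce is a natural way to fuse the two halves.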
I usually find that your mini-courses (4 lectures with slides (PDF) and notebooks on GitHub) are the
best way.
It is extremely useful for all of us to hear from Nvidia about what is better. After all, you are
spending 100% of your time doing CUDA, so you have solved more problems.
But scientific applications are still lagging behind. It would be nice to see Cython or even Sage
with CUDA. The parallel decorator in Sage is useful and quite effective. I have used it for some gigantic
g=7 Riemann surface (RS) calculations. RS is the big sister of sin/cos, going from simply periodic to
multiply periodic: very extensive calculations, and extremely needed. You cannot solve the simple
pendulum exactly with sin/cos, you need doubly periodic functions, not to mention the spinning top and other
3- and higher-dimensional applications such as solitons.
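
As a small illustration of the pendulum remark, here is a sketch that assumes the standard textbook result: for a pendulum released from rest at angle theta0, the exact period is T = 4 sqrt(L/g) K(k) with k = sin(theta0/2), where K is the complete elliptic integral of the first kind (the quarter period of the doubly periodic Jacobi functions). The helper names `agm`, `ellipk`, `pendulum_period` and `small_angle_period` are my own, and K is evaluated with the arithmetic-geometric mean so that only the standard library is needed.

```python
import math


def agm(a, b, tol=1e-15, max_iter=60):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(max_iter):
        if abs(a - b) <= tol * abs(a):
            break
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a


def ellipk(k):
    """Complete elliptic integral of the first kind K(k), modulus 0 <= k < 1."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(1.0 - k * k)))


def pendulum_period(length, theta0, g=9.81):
    """Exact period of a simple pendulum released from rest at angle theta0 (radians)."""
    k = math.sin(theta0 / 2.0)
    return 4.0 * math.sqrt(length / g) * ellipk(k)


def small_angle_period(length, g=9.81):
    """Period from the sin/cos (small-angle) approximation."""
    return 2.0 * math.pi * math.sqrt(length / g)


# Compare the two for a 1 m pendulum released at 90 degrees.
ratio = pendulum_period(1.0, math.pi / 2) / small_angle_period(1.0)
print(f"exact / small-angle period at 90 degrees: {ratio:.4f}")
```

At a 90 degree amplitude the exact period comes out roughly 18% longer than the sin/cos estimate, and the gap closes as the amplitude goes to zero.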
Now I am spending my spare time with Dask, CuPy, Numba, and cuDF.
Thank you for all your hard work, and keep going. We always need to hear
your point of view and your perspective for a better future of CUDA.