High-Performance GPU Computing in the Julia Programming Language

jwitsoe · October 24, 2017, 7:55am

Originally published at: High-Performance GPU Computing in the Julia Programming Language | NVIDIA Technical Blog

Julia is a high-level programming language for mathematical computing that is as easy to use as Python, but as fast as C. The language has been created with performance in mind, and combines careful language design with a sophisticated LLVM-based compiler [Bezanson et al. 2017]. Julia is already well regarded for programming multicore CPUs and large parallel…

anon42921929 · November 22, 2017, 8:49am

What about host-to-device memory transfers?

anon57635164 · November 22, 2017, 6:20pm

Just want to point out that there are a bunch of comments and discussion around this on Hacker News:

https://news.ycombinator.co...

anon92088217 · November 23, 2017, 6:44am

Constructing the CuArray performs a host-to-device memory transfer, whereas converting it back to a regular Array fetches the memory back.
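
For illustration, the round trip described above might look like the following minimal sketch. It assumes a CUDA-capable GPU and the CUDA.jl package, which later absorbed CuArrays.jl; at the time of this thread the same `CuArray` constructor came from `using CuArrays`. The variable names are made up for the example.

```julia
using CUDA  # assumption: CUDA.jl; in 2017 this would have been `using CuArrays`

a = rand(Float32, 1024)   # ordinary host-side Array

d_a = CuArray(a)          # allocates device memory and copies host -> device
d_b = d_a .* 2f0          # broadcast is executed on the GPU

b = Array(d_b)            # copies the result device -> host

# Sanity check on the host: the GPU result matches the CPU computation.
@assert b ≈ a .* 2f0
```

Both copies here are implicit in the constructors, so no separate transfer call is needed for the simple synchronous case.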

anon42921929 · November 23, 2017, 11:21am

Is there support for asynchronous transfers, multiple streams, or concurrent kernel execution and memory transfers?

anon92088217 · November 23, 2017, 11:54am

Partially: streams are supported and can be used for kernel execution, but asynchronous transfers are not wrapped right now. It isn't much work to add, though, and I'm currently redesigning the memory buffer interface, so I'll see about adding it: https://github.com/JuliaGPU...

If there are similar missing features you'd like to use, don't hesitate to file an issue at CUDAdrv or CUDAnative.