Asynchronous IO for large images

brdavs · January 6, 2009, 4:30pm

I am interested in processing very large images that fit into
neither main CPU memory nor GPU memory.

So, I want to do the following:

Read a tile asynchronously from disk
Copy it from CPU memory to GPU memory
Process the tile with a CUDA kernel

Can anybody provide an example of how to do this
in the most efficient way, so that disk IO,
copying from CPU to GPU memory, and processing
can be overlapped/interleaved?

Any ideas/examples are appreciated.

paulius · January 6, 2009, 6:53pm

Look at the simpleStreams sample in the CUDA SDK. It shows how to overlap GPU processing with a copy from GPU to CPU. You can use the same idea, but add CPU2GPU memcopies.

Paulius
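
For anyone landing here later, a rough sketch of the idea above, not the simpleStreams code itself. It assumes two streams with one pinned host buffer and one device buffer each, so the CPU can read the next tile from disk while the GPU copies and processes the previous one. The invert kernel and the process_file function are made-up names for illustration. The disk read is a plain blocking fread rather than true asynchronous IO, which is platform-specific. Checks on the CUDA calls are mostly omitted for brevity, and whether copies and kernels actually overlap depends on the device.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real processing: inverts 8-bit pixels in place.
__global__ void invert(unsigned char* px, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) px[i] = 255 - px[i];
}

// Reads `in` tile by tile, processes each tile on the GPU, and writes the
// results to `out` in order. Returns 0 on success, -1 on failure.
int process_file(FILE* in, FILE* out, size_t tile_bytes) {
    const int kBuf = 2;                         // double buffering
    unsigned char *h[kBuf], *d[kBuf];
    cudaStream_t s[kBuf];
    size_t len[kBuf] = {0, 0};                  // bytes currently held per buffer
    for (int i = 0; i < kBuf; ++i) {
        cudaMallocHost((void**)&h[i], tile_bytes);  // pinned memory for async copies
        cudaMalloc((void**)&d[i], tile_bytes);
        cudaStreamCreate(&s[i]);
    }

    int rc = 0;
    long t = 0;
    for (;; ++t) {
        int b = (int)(t % kBuf);
        // Wait for the tile that last used this buffer, then write it out.
        cudaStreamSynchronize(s[b]);
        if (len[b] && fwrite(h[b], 1, len[b], out) != len[b]) { rc = -1; break; }
        // Blocking read; work queued in the other stream keeps running meanwhile.
        len[b] = fread(h[b], 1, tile_bytes, in);
        if (len[b] == 0) break;                 // end of file (or read error)
        // Queue upload, kernel, and download for this tile in its own stream.
        unsigned blocks = (unsigned)((len[b] + 255) / 256);
        cudaMemcpyAsync(d[b], h[b], len[b], cudaMemcpyHostToDevice, s[b]);
        invert<<<blocks, 256, 0, s[b]>>>(d[b], len[b]);
        cudaMemcpyAsync(h[b], d[b], len[b], cudaMemcpyDeviceToHost, s[b]);
    }

    // Drain the tiles still in flight, oldest first.
    for (int k = 1; k < kBuf && rc == 0; ++k) {
        int b = (int)((t + k) % kBuf);
        cudaStreamSynchronize(s[b]);
        if (len[b] && fwrite(h[b], 1, len[b], out) != len[b]) rc = -1;
    }

    // Any asynchronous CUDA error surfaces here at the latest.
    if (cudaDeviceSynchronize() != cudaSuccess) rc = -1;
    for (int i = 0; i < kBuf; ++i) {
        cudaStreamDestroy(s[i]);
        cudaFree(d[i]);
        cudaFreeHost(h[i]);
    }
    return rc;
}
```

Calling process_file(in, out, tile_bytes) with two open FILE handles streams the whole image through the GPU using only two tiles' worth of host and device memory. More buffers, or a separate reader thread, may help if the disk is the bottleneck.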
