Persistent storage on NVIDIA OpenCL GPU


Hi. Is there any way for the GPU to have persistent data storage (across kernel invocations), so that we don't have to send the same data every time we invoke the kernel?

For example, suppose the kernel takes 2 arguments: constant_data (say 10000 bytes, and the data is constant) and variable_data (say 1000-10000 bytes, and the data varies between invocations), and we invoke the kernel from the host program 10000 times, each time with the same constant_data but different variable_data. I then have to bear the overhead of sending the same constant_data every time, despite this data being constant across kernel invocations. So, across multiple kernel invocations, can the constant data be stored on the GPU, so that for the next invocation I don't have to send the constant data, but only the variable data?

A case where this issue comes up is pattern matching. Suppose I have a kernel that implements a pattern-matching algorithm, the host receives data as a stream, and the host invokes the kernel repeatedly, passing as arguments a chunk of the data stream and the set of patterns (where the set of patterns is always constant). One then has to bear the cost of sending the patterns again each time the kernel is invoked with the next chunk of data. In such a case, it would be useful to be able to store the patterns on the GPU, so that each kernel invocation from the host supplies only the next chunk of data against which the pattern matching has to be done.
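To make the overhead concrete, here is roughly what the host loop looks like today (just a sketch; the queue, kernel, and buffer names are made up for illustration):

    for (int i = 0; i < num_chunks; i++) {
        /* the redundant transfer: the same pattern set, every iteration */
        clEnqueueWriteBuffer(queue, patterns_buf, CL_TRUE, 0,
                             patterns_size, patterns, 0, NULL, NULL);
        /* the genuinely new data: the next chunk of the stream */
        clEnqueueWriteBuffer(queue, chunk_buf, CL_TRUE, 0,
                             chunk_size, chunks[i], 0, NULL, NULL);

        clSetKernelArg(kernel, 0, sizeof(cl_mem), &patterns_buf);
        clSetKernelArg(kernel, 1, sizeof(cl_mem), &chunk_buf);
        clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                               &global_size, NULL, 0, NULL, NULL);
    }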

Thanks

Anything you cudaMemcpy() to the card will be persistent until your program exits, or until you cudaFree() the memory.

Thanks for replying.

So do clCreateBuffer() or clSetKernelArg() use cudaMemcpy() internally?

How can I find out which OpenCL API calls invoke cudaMemcpy() behind the scenes?

When I call clSetKernelArg() for a kernel, does the memory object get copied to the GPU immediately, or only when we do a clEnqueueNDRangeKernel()?

Whoa, reading failure on my part. I didn’t realize I was reading the OpenCL forum rather than the CUDA forum. I have no idea what I’m talking about (haven’t played with OpenCL yet), so you should ignore me. :)

Oh :). No problem. Thanks for taking the time to reply anyway.

OpenCL has the same behaviour as CUDA: global and constant memory is persistent across kernel invocations. Data is copied from the host to the device only when you do clEnqueueWriteBuffer().
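In other words, you can create both buffers once, write the constant data once, and then on each iteration write only the variable data before launching the kernel. A minimal sketch of that pattern (context/queue/kernel setup and error handling omitted; all names are made up):

    /* one-time setup: both buffers live in device memory and persist
     * for the lifetime of the context */
    cl_int err;
    cl_mem patterns_buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY,
                                         patterns_size, NULL, &err);
    cl_mem chunk_buf    = clCreateBuffer(ctx, CL_MEM_READ_ONLY,
                                         max_chunk_size, NULL, &err);

    /* the constant data crosses the bus exactly once */
    clEnqueueWriteBuffer(queue, patterns_buf, CL_TRUE, 0,
                         patterns_size, patterns, 0, NULL, NULL);

    /* kernel arguments also persist until changed, so set them once */
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &patterns_buf);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), &chunk_buf);

    for (int i = 0; i < num_chunks; i++) {
        /* per iteration: only the variable data is transferred */
        clEnqueueWriteBuffer(queue, chunk_buf, CL_TRUE, 0,
                             chunk_size, chunks[i], 0, NULL, NULL);
        clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                               &global_size, NULL, 0, NULL, NULL);
    }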

One quick question. When I write using clEnqueueWriteBuffer(), to which address space does the memory object get copied? Or, for a particular kernel, is the address space for an argument resolved based on the qualifier that argument has been given in the kernel parameter list?
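To make the question concrete, these are the qualifiers I mean, as they would appear in a kernel's parameter list (a made-up signature):

    /* a made-up kernel signature showing the address-space qualifiers
     * the question is about */
    __kernel void match(__constant uchar *patterns,    /* __constant address space */
                        __global const uchar *chunk,   /* __global address space */
                        __global uint *match_counts)
    {
        /* ... pattern matching ... */
    }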