cudaBindTexture synchronization question

cudaprogrammer · September 4, 2011, 1:46pm

Since a cudaBindTexture call is not specific to a stream, I am wondering whether the bind and unbind calls perform an implicit device-wide synchronization. The CUDA C Programming Guide does not mention this in the implicit synchronization section of its discussion of streams, so I am a bit confused. For instance, it would not seem possible to bind a texture, execute a kernel in one stream, then bind that same texture again (to a different piece of memory) and execute another kernel in another stream without at least some kind of stream synchronization (or device-wide synchronization). There must be some documentation on this. Can someone point me in the right direction? Thank you in advance for your help.

cudaprogrammer · September 10, 2011, 5:49pm

For anyone who might be interested in the answer, I ran a characterization test and found the following.

For the case of two kernels in the same stream:
bind a texture -> call a kernel in one stream that uses the texture -> unbind the texture -> call a kernel in the same stream that uses the same texture.
Both kernels correctly access the texture memory; the unbind call appears to do nothing (as may be indicated by the fact that the API defines the only return value of cudaUnbindTexture as cudaSuccess).
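
For readers who want to repeat the same-stream case, here is a minimal sketch of the sequence described above. It is not the original test code: the kernel, the helper countMismatches, the result struct, and the buffer sizes are all made up for illustration. It also relies on the legacy texture reference API, which was removed in CUDA 12, so it assumes an older toolkit.

```cuda
// Sketch only: uses the legacy texture reference API (removed in CUDA 12).
// All names below are illustrative. Error checking is omitted for brevity.
#include <vector>
#include <cuda_runtime.h>

texture<float, 1, cudaReadModeElementType> tex;  // file-scope texture reference

// Each thread copies one element read through the texture reference.
__global__ void copyFromTexture(float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch(tex, i);
}

// Hypothetical helper: number of elements of d_out that differ from expected.
static int countMismatches(const float* d_out, float expected, int n) {
    std::vector<float> h(n, 0.0f);
    cudaMemcpy(h.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    int bad = 0;
    for (float v : h) bad += (v != expected);
    return bad;
}

struct SameStreamResult { int beforeUnbind; int afterUnbind; };

SameStreamResult runSameStreamTest() {
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);
    std::vector<float> src(n, 1.0f);

    float *dSrc, *dOut1, *dOut2;
    cudaMalloc(&dSrc, bytes);
    cudaMalloc(&dOut1, bytes);
    cudaMalloc(&dOut2, bytes);
    cudaMemcpy(dSrc, src.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(dOut1, 0, bytes);  // zeroed so stale data cannot look correct
    cudaMemset(dOut2, 0, bytes);

    cudaStream_t s;
    cudaStreamCreate(&s);

    cudaBindTexture(nullptr, tex, dSrc, bytes);
    copyFromTexture<<<n / 256, 256, 0, s>>>(dOut1, n);  // launched while bound
    cudaUnbindTexture(tex);                             // unbind between launches
    copyFromTexture<<<n / 256, 256, 0, s>>>(dOut2, n);  // launched after unbind
    cudaStreamSynchronize(s);

    SameStreamResult r = {countMismatches(dOut1, 1.0f, n),
                          countMismatches(dOut2, 1.0f, n)};
    cudaStreamDestroy(s);
    cudaFree(dSrc);
    cudaFree(dOut1);
    cudaFree(dOut2);
    return r;
}
```

Calling runSameStreamTest() from main and printing both counts shows whether the second kernel still saw the data after the unbind. Launching a kernel on an unbound texture reference is not something the documentation promises will work, so the afterUnbind count may well differ across GPUs and toolkit versions.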

For the case of two kernels in separate streams (that have been verified to be running concurrently):
bind a texture -> call a kernel in one stream that uses the texture -> bind the texture to a separate location in global memory -> call a kernel in a different stream that uses the same texture (as the first).
Both kernels correctly access the different areas of global memory that the single texture was bound to (a surprising result to me).
I will mention again that, yes, I am sure the two kernels were operating concurrently when this test was executed.
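
A rough sketch of the two-stream case, for anyone who wants to try it. Again, this is not the original test: the kernel checkTexture, the struct, the sizes, and the repeat count are assumptions chosen only to keep the first kernel busy while the texture is rebound. The sketch does not itself verify that the two kernels overlap (a profiler would be needed for that), and it uses the legacy texture reference API, which was removed in CUDA 12.

```cuda
// Sketch only: uses the legacy texture reference API (removed in CUDA 12).
// All names below are illustrative. Error checking is omitted for brevity.
#include <vector>
#include <cuda_runtime.h>

texture<float, 1, cudaReadModeElementType> tex;  // one reference shared by both launches

// Reads the texture `reps` times per thread and counts values != expected.
__global__ void checkTexture(float expected, int n, int reps,
                             unsigned int* mismatches) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    unsigned int bad = 0;
    for (int r = 0; r < reps; ++r) {
        if (tex1Dfetch(tex, (i + r) % n) != expected) ++bad;
    }
    if (bad) atomicAdd(mismatches, bad);
}

struct TwoStreamResult { unsigned int firstKernel; unsigned int secondKernel; };

TwoStreamResult runTwoStreamTest() {
    const int n = 1 << 12, reps = 1 << 14;  // small grid, long-running threads
    const size_t bytes = n * sizeof(float);
    std::vector<float> hostA(n, 1.0f), hostB(n, 2.0f);

    float *dA, *dB;
    unsigned int* dBad;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dBad, 2 * sizeof(unsigned int));
    cudaMemcpy(dA, hostA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hostB.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(dBad, 0, 2 * sizeof(unsigned int));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    cudaBindTexture(nullptr, tex, dA, bytes);
    checkTexture<<<n / 256, 256, 0, s1>>>(1.0f, n, reps, dBad);
    // Rebind with no synchronization; the first kernel may still be running.
    cudaBindTexture(nullptr, tex, dB, bytes);
    checkTexture<<<n / 256, 256, 0, s2>>>(2.0f, n, reps, dBad + 1);
    cudaDeviceSynchronize();

    unsigned int bad[2] = {~0u, ~0u};  // stays all-ones if the copy back fails
    cudaMemcpy(bad, dBad, sizeof(bad), cudaMemcpyDeviceToHost);

    cudaUnbindTexture(tex);
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dBad);
    return TwoStreamResult{bad[0], bad[1]};
}
```

A firstKernel count of zero would match the result reported above, namely that the first kernel keeps reading the memory the texture was bound to when it was launched. Treat that as an observation, not a documented guarantee.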

Take-away: regardless of streams, a kernel appears to read whatever global memory the texture was bound to at the time the kernel was launched, no matter what a user might do to the texture while the kernel is running. Also, the cudaUnbindTexture call appears to do absolutely nothing to a previously bound texture (at least from a kernel's point of view).

cbuchner1 · November 26, 2013, 9:42am

We’re currently facing this issue of using one common texture reference on multiple streams, with different global memory bound to the texture reference on each stream. Your post seems to confirm that this doesn’t cause a code correctness problem.

However, this Stack Overflow question indicates that binding and unbinding textures can force synchronization with asynchronous memcpy operations, which may impede performance:

http://stackoverflow.com/questions/12411896/cuda-streams-texture-binding-and-async-memcpy
