cudaStreamCallbackBlocking flag usage

I’m trying to use stream callbacks to prepare data on the host before an asynchronous memory copy from pinned host memory to device memory.

checkCudaErrors(cudaStreamAddCallback(workload->stream, myStreamCallbefore, workload, cudaStreamCallbackBlocking));

The CUDA programming guide (PG-02829-001_v5.0, bottom of page 33) says: “A callback can be specified as blocking by using the cudaStreamCallbackBlocking flag when adding it to a stream. The commands that are issued in a stream after a blocking callback do not start executing before the callback has completed.”

But when I pass this flag to cudaStreamAddCallback instead of the default 0, the code doesn’t compile…
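
For reference, here is a minimal sketch of the variant that does compile, with the flags argument left at 0. The Workload struct, its field names, the buffer size, and the CHECK macro are hypothetical stand-ins for the real code (CHECK plays the role of checkCudaErrors). The runtime API reference describes the flags parameter as reserved (must be 0) and says a callback blocks later work in the stream until it finishes, so this sketch assumes the copy is ordered after the callback without any extra flag.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for checkCudaErrors from the CUDA samples.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical workload type; the real one is not shown in the post.
struct Workload {
    cudaStream_t stream;
    float *hostBuf;  // pinned host memory
    float *devBuf;   // device memory
    size_t count;    // number of floats
};

// Runs on a host thread after all earlier work in the stream has finished.
// CUDA API calls must not be made from inside a stream callback.
void CUDART_CB myStreamCallbefore(cudaStream_t stream, cudaError_t status,
                                  void *userData)
{
    (void)stream;
    if (status != cudaSuccess) return;
    Workload *w = static_cast<Workload *>(userData);
    for (size_t i = 0; i < w->count; ++i) {
        w->hostBuf[i] = static_cast<float>(i);  // prepare data for the copy
    }
}

int main()
{
    Workload w;
    w.count = 1 << 20;  // assumed size, for illustration only
    const size_t bytes = w.count * sizeof(float);

    CHECK(cudaStreamCreate(&w.stream));
    CHECK(cudaMallocHost((void **)&w.hostBuf, bytes));
    CHECK(cudaMalloc((void **)&w.devBuf, bytes));

    // flags = 0: the only value the runtime API reference documents.
    CHECK(cudaStreamAddCallback(w.stream, myStreamCallbefore, &w, 0));

    // Issued after the callback in the same stream, so it should not
    // start until the callback has returned.
    CHECK(cudaMemcpyAsync(w.devBuf, w.hostBuf, bytes,
                          cudaMemcpyHostToDevice, w.stream));

    CHECK(cudaStreamSynchronize(w.stream));

    CHECK(cudaFree(w.devBuf));
    CHECK(cudaFreeHost(w.hostBuf));
    CHECK(cudaStreamDestroy(w.stream));
    return 0;
}
```

The Workload object has to stay alive until the stream has been synchronized, because the callback dereferences the userData pointer when it runs.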

Could anyone shed some light? Thanks!! ^_^