This is probably a stupid suggestion… Just a thought.
What if we could overload the “<<<” and “>>>” operators as “pre-launch” and “post-launch” hooks…
This way, I could implement a host-vector C++ class that automatically calls cudaMalloc() and cudaMemcpy() before and after the kernel launch…
Wouldn’t that be good?
Maybe, but are <<< and >>> actually operators?
The way I see it, they are just a new type of parenthesis, like ( ), < >, { } etc., and they are interpreted by the nvcc preprocessor.
On the other hand, you have the C++ API, with useful functions such as
cudaConfigureCall()
cudaFuncGetAttributes()
cudaSetupArgument()
cudaLaunch()
which do not use the <<< and >>> syntax at all and do not even need nvcc.
I am sure you could construct a wrapper function for all those and hook up whatever you need.
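To make the wrapper idea concrete, here is a rough host-only sketch in plain C++. It makes no CUDA calls: HostVector, hooked_launch and scale_by_two are made-up names for illustration, and the "device" buffer is just a second std::vector standing in for memory that real code would get from cudaMalloc(). Treat it as a shape for the hooks, not a working CUDA implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side vector. The "device" member is an ordinary
// std::vector standing in for memory that real code would cudaMalloc().
struct HostVector {
    std::vector<float> host;
    std::vector<float> device;

    // Pre-launch hook: real code would cudaMalloc() and cudaMemcpy() host -> device.
    void pre_launch() { device = host; }

    // Post-launch hook: real code would cudaMemcpy() device -> host and free the buffer.
    void post_launch() {
        host = device;
        device.clear();
    }
};

// Hypothetical wrapper: runs the hooks around whatever performs the launch.
template <typename Launch>
void hooked_launch(HostVector& v, Launch launch) {
    v.pre_launch();
    launch(v.device.data(), v.device.size());
    v.post_launch();
}

// Stand-in for a kernel: doubles every element of the "device" buffer.
void scale_by_two(float* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= 2.0f;
    }
}
```

In a real version, the callable passed to hooked_launch would be the place where the kernel is actually launched, so the calling code never has to touch the copies itself.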
Doesn’t sound too stupid, but I can’t agree they are actually operators. They don’t behave like operators and they don’t have any purpose except to act as directives to the CUDA compiler. In that sense, they are functionally much closer to OpenMP or C preprocessor directives than to operators in the C++ sense. It certainly might be nice to have some user-definable actions attached to them (something like the OO concept of constructors and destructors), but that might be better done using C++ templating.
What I am going to be very interested in seeing is how the CUDA semantics we have now are going to be transplanted into Fortran. Now that will be interesting…
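As a rough illustration of the constructor/destructor idea with templates, something like the following scoped guard could work. It is a host-only mock with no CUDA calls: ScopedDeviceCopy and add_one are hypothetical names, and the private std::vector only stands in for a device allocation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical RAII guard: the constructor plays the "pre-launch" role and
// the destructor the "post-launch" role. The private std::vector stands in
// for a device allocation; real code would use cudaMalloc()/cudaMemcpy().
template <typename T>
class ScopedDeviceCopy {
public:
    // "Pre-launch": copy the host data into the stand-in device buffer.
    explicit ScopedDeviceCopy(std::vector<T>& host) : host_(host), device_(host) {}

    // "Post-launch": copy the results back when the guard leaves scope.
    ~ScopedDeviceCopy() { host_ = device_; }

    ScopedDeviceCopy(const ScopedDeviceCopy&) = delete;
    ScopedDeviceCopy& operator=(const ScopedDeviceCopy&) = delete;

    T* data() { return device_.data(); }
    std::size_t size() const { return device_.size(); }

private:
    std::vector<T>& host_;
    std::vector<T> device_;
};

// Stand-in for a kernel: adds one to every element.
template <typename T>
void add_one(T* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] += T(1);
    }
}
```

The launch would sit inside a block scope that owns the guard, so the copy back happens automatically at the closing brace.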
Edit: looks like PDan and I are of a similar mind on this one…