CUBLAS matrix format / normal vector format

I would like to use CUBLAS functions, but also use my own GPU kernels to modify the same arrays, without copying them to the host.
To use CUBLAS, you normally call cublasInit, cublasAlloc and cublasSetMatrix, and afterwards cublasGetMatrix, cublasFree and cublasShutdown.
Suppose I do a random initialization of a vector allocated with the regular (non-CUBLAS) functions:

cutilSafeCall(cudaMalloc((void**)&d_seeds, seedz));
and
cutilSafeCall(cudaMemcpy(h_seeds, d_seeds, seedz, cudaMemcpyDeviceToHost));

and then, without copying it to the host and back to the device, continue to use the same array as a CUBLAS matrix. The problem with cublasSetMatrix is that it always does a copy from host to device. Is it possible to use an existing memory location on the device as a CUBLAS matrix?
In other words: can you leave out cublasSetMatrix and cublasAlloc if the vector has already been allocated with cudaMalloc and filled with data on the device?

Pages 10-11 of the CUBLAS guide hold the answers you are looking for. cublasAlloc is just a wrapper for cudaMalloc, and you can share device pointers between CUBLAS and your own kernels, with all the usual caveats.
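
To illustrate, here is a minimal sketch of that pointer sharing, written against the legacy cublas.h interface discussed in this thread. The kernel fill and the helper fill_and_sum are hypothetical names made up for the example, and the constant fill merely stands in for the random initialization from the question. The buffer is allocated with plain cudaMalloc, written by a kernel, and then passed straight to cublasSasum, with no cublasAlloc or cublasSetMatrix involved.

```cuda
#include <cuda_runtime.h>
#include <cublas.h>   // legacy CUBLAS API (cublasInit / cublasShutdown)

// Hypothetical kernel: writes a constant into every element. It stands in
// for whatever device-side initialization your own code performs.
__global__ void fill(float *x, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = value;
}

// Hypothetical helper: fills a cudaMalloc'ed vector on the device and sums
// its absolute values with CUBLAS. Returns -1.0f if any step fails.
float fill_and_sum(int n, float value)
{
    if (cublasInit() != CUBLAS_STATUS_SUCCESS) return -1.0f;

    // Plain runtime-API allocation: no cublasAlloc needed.
    float *d_x = NULL;
    if (cudaMalloc((void **)&d_x, n * sizeof(float)) != cudaSuccess) {
        cublasShutdown();
        return -1.0f;
    }

    // The data is produced on the device, so no cublasSetMatrix or
    // cublasSetVector copy from the host is needed either.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    fill<<<blocks, threads>>>(d_x, n, value);
    if (cudaGetLastError() != cudaSuccess) {
        cudaFree(d_x);
        cublasShutdown();
        return -1.0f;
    }

    // The same device pointer goes directly into a CUBLAS routine.
    float sum = cublasSasum(n, d_x, 1);
    cublasStatus status = cublasGetError();

    cudaFree(d_x);   // released with cudaFree, matching the cudaMalloc above
    cublasShutdown();
    return (status == CUBLAS_STATUS_SUCCESS) ? sum : -1.0f;
}
```

Calling fill_and_sum(1024, 1.0f) should return 1024, since the vector holds 1024 ones. One of the usual caveats applies when you treat such a buffer as a matrix: CUBLAS assumes column-major storage, so your kernels have to index the array the same way.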

Thanks, that was helpful.