cudaMemcpyAsync fails on GPU 1

I wrote a GPU program. When I run it on GPU 0, it works well.
But if I change the argument of cudaSetDevice to 1, the call below fails with this error: cudaSafeCall() Runtime API error 11: invalid argument.

-> cutilSafeCall(cudaMemcpyAsync(gv->size[stream_no].dev, gv->size[stream_no].host, BATCH_NUM * sizeof(uint32_t), cudaMemcpyHostToDevice, stream));

I call cudaSetDevice at the very beginning of the GPU functions, before any memory allocation.
When I run the device enumeration sample from the CUDA programming guide, it reports that I have 2 devices:

Device 0 has compute capability 2.0.
Device 1 has compute capability 2.0.
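
For reference, the enumeration that prints lines in this format looks roughly like the sketch below. It follows the pattern of the programming guide sample; the wrapper name `list_devices` and the error handling are my own additions for illustration, not part of the guide.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical wrapper: prints one line per visible CUDA device and returns
// the device count, or -1 if a runtime query fails.
int list_devices() {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess) {
        return -1;
    }
    for (int device = 0; device < device_count; ++device) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
            return -1;
        }
        printf("Device %d has compute capability %d.%d.\n",
               device, prop.major, prop.minor);
    }
    return device_count;
}
```

Calling `list_devices()` from `main` prints one line per device that the runtime can see.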

If I want to use GPU 1, is there anything else I need to do besides calling cudaSetDevice?
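
The setup described above can be sketched as follows. This is a minimal, self-contained illustration rather than my actual code: `copy_on_device`, the local buffer names, and the `BATCH_NUM` value are placeholders, and it assumes the stream and the pinned host buffer are created after `cudaSetDevice`, in the same host thread, along with the device buffer.

```cuda
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

#define BATCH_NUM 1024  // placeholder; the real value is defined elsewhere

// Hypothetical helper: selects `device`, then creates every resource used by
// the copy (stream, pinned host buffer, device buffer) and issues the copy.
cudaError_t copy_on_device(int device) {
    cudaStream_t stream = 0;
    uint32_t *host = NULL;
    uint32_t *dev = NULL;
    const size_t bytes = BATCH_NUM * sizeof(uint32_t);

    // Select the device first so everything below is created on it.
    cudaError_t err = cudaSetDevice(device);
    if (err == cudaSuccess) err = cudaStreamCreate(&stream);
    // Pinned (page-locked) host memory, as normally used with async copies.
    if (err == cudaSuccess) err = cudaMallocHost((void **)&host, bytes);
    if (err == cudaSuccess) memset(host, 0, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&dev, bytes);
    if (err == cudaSuccess) {
        err = cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, stream);
    }
    // Wait for the copy so that any error from it is reported here.
    if (err == cudaSuccess) err = cudaStreamSynchronize(stream);

    if (err != cudaSuccess) {
        printf("device %d: %s\n", device, cudaGetErrorString(err));
    }

    // Release whatever was successfully created.
    if (dev) cudaFree(dev);
    if (host) cudaFreeHost(host);
    if (stream) cudaStreamDestroy(stream);
    return err;
}
```

Calling `copy_on_device(0)` and then `copy_on_device(1)` would show whether the same sequence behaves differently on the second GPU.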