Multi-GPU memory allocation behaves differently depending on the order of allocation

I have tested this on a GTX 690 GPU with 4 GB RAM on Windows 7 x64, using Visual C++ 10.

I want to allocate 1.2 GB of memory on each of the two devices. If I allocate on device 0 first and then on device 1, it fails with a memory allocation error. If I allocate on device 1 first and then on device 0, there is no problem. Can anyone tell me why?

This code fails:

{
	void	* pM1 , * pM2 ;

	// Allocate 1.2 GB on device 0 first, then 1.2 GB on device 1.
	CudaCheck( cudaSetDevice( 0 ) ) ;
	CudaCheck( cudaMalloc( & pM1 , 1200000000 ) ) ;
	CudaCheck( cudaSetDevice( 1 ) ) ;
	CudaCheck( cudaMalloc( & pM2 , 1200000000 ) ) ;
}

This code works:

{
	void	* pM1 , * pM2 ;

	// Same allocations in the opposite order: device 1 first, then device 0.
	CudaCheck( cudaSetDevice( 1 ) ) ;
	CudaCheck( cudaMalloc( & pM1 , 1200000000 ) ) ;
	CudaCheck( cudaSetDevice( 0 ) ) ;
	CudaCheck( cudaMalloc( & pM2 , 1200000000 ) ) ;
}

Bests,
Ramin

P.S. The CudaCheck function just checks the error code returned by each call.
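For readers who want to reproduce this, here is a minimal sketch of what such a helper might look like. The original CudaCheck was not posted, so the name CudaCheckImpl, the macro wrapper, and the choice to print and exit on error are all assumptions, not the actual implementation.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the CudaCheck helper used in the snippets above.
// It reports the failing call site and the runtime's error string, then exits.
inline void CudaCheckImpl( cudaError_t err , const char * file , int line )
{
	if ( err != cudaSuccess )
	{
		std::fprintf( stderr , "CUDA error at %s:%d: %s\n" ,
		              file , line , cudaGetErrorString( err ) ) ;
		std::exit( EXIT_FAILURE ) ;
	}
}

// Macro wrapper (an assumption) so that the file and line of the call are recorded.
#define CudaCheck( call ) CudaCheckImpl( ( call ) , __FILE__ , __LINE__ )
```

With a helper like this, a failed cudaMalloc would typically be reported with the runtime's own message, which shows which of the two allocations failed.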

Problem solved. The problem was due to SLI being active. I disabled it and now it is working smoothly.
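For anyone hitting a similar allocation failure, it may help to print how much memory each device reports as free before allocating. The following is only a diagnostic sketch using cudaGetDeviceCount and cudaMemGetInfo. It does not detect SLI, and the output format is my own choice.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
	int deviceCount = 0 ;
	cudaError_t err = cudaGetDeviceCount( & deviceCount ) ;
	if ( err != cudaSuccess )
	{
		std::fprintf( stderr , "cudaGetDeviceCount failed: %s\n" , cudaGetErrorString( err ) ) ;
		return 1 ;
	}

	for ( int dev = 0 ; dev < deviceCount ; ++ dev )
	{
		size_t freeBytes = 0 , totalBytes = 0 ;

		// Select the device, then ask the runtime for its free and total memory.
		err = cudaSetDevice( dev ) ;
		if ( err == cudaSuccess )
			err = cudaMemGetInfo( & freeBytes , & totalBytes ) ;

		if ( err != cudaSuccess )
		{
			std::fprintf( stderr , "device %d: query failed: %s\n" , dev , cudaGetErrorString( err ) ) ;
			continue ;
		}

		// Print in MB as doubles to avoid relying on size_t printf format support.
		std::printf( "device %d: %.1f MB free of %.1f MB total\n" , dev ,
		             freeBytes / ( 1024.0 * 1024.0 ) , totalBytes / ( 1024.0 * 1024.0 ) ) ;
	}
	return 0 ;
}
```

Comparing the reported free memory with the requested 1,200,000,000 bytes on each device, before and after changing a driver setting such as SLI, is one way to see whether the setting affects what the runtime can allocate.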