cudaDeviceEnablePeerAccess fails: enabling peer-to-peer device memory access fails

When I enable peer memory access in my code, the call fails with an error. See the following code snippet (from my file “”):
int canAccessPeer = 0;
if (cudaSuccess == cudaDeviceCanAccessPeer(&canAccessPeer, 0, 1))
if (canAccessPeer == 1)

The call itself returns cudaSuccess, but the value written to “canAccessPeer” is 0.
“CudaCheckError()” then fails with the following error:
cudaCheckError() failed at : invalid device ordinal.

My system information:

Two Tesla M2070
64 bit, CentOS 5.6

NVRM version: NVIDIA UNIX x86_64 Kernel Module 270.41.19 Mon May 16 23:32:08 PDT 2011
GCC version: gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)
CUDA Driver Version / Runtime Version 4.0 / 4.0
CUDA Capability Major/Minor version number: 2.0

Here is also the output of the SDK sample “simpleP2P”:

Checking for multiple GPUs…
CUDA-capable device count: 2

GPU0 = " Tesla M2070" IS capable of Peer-to-Peer (P2P)
GPU1 = " Tesla M2070" IS capable of Peer-to-Peer (P2P)

Checking GPU(s) for support of peer to peer memory access…

Peer access from Tesla M2070 (GPU0) -> Tesla M2070 (GPU1) : No
Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU0) : No
Two or more Tesla(s) with class GPUs are required for ./simpleP2P to run.
Support for UVA requires a Tesla with SM 2.0 capabilities.
Peer to Peer access is not available between GPU0 <-> GPU1, waiving test.


The same error also occurs with two Tesla M2075 cards when using the CUDA 4.1 release on 64-bit CentOS 6.0. Below is the code snippet from the SDK example “”:
// Enable peer access
printf("Enabling peer access between GPU%d and GPU%d...\n", gpuid[0], gpuid[1]);
checkCudaErrors(cudaDeviceEnablePeerAccess(gpuid[1], 0));
checkCudaErrors(cudaDeviceEnablePeerAccess(gpuid[0], 0));
Here is the error when executing it: : CUDA Runtime API error 10: invalid device ordinal

NOTE: I commented out the “exit” statement in so that “cudaDeviceEnablePeerAccess” is still called even though the peer access check reports “No”.