2nd context creation fails on Tesla C2050

I have a problem that seems to be seen only on a Tesla C2050.

Our application creates multiple contexts for the same device.
This works fine on most cards.
But on our Tesla C2050, creating a 2nd context while a context is already active fails with the dreaded “unknown error”. (We need to get better error reporting; is there a log file somewhere that can tell you what actually went wrong?)
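
A note on decoding these errors: newer CUDA toolkits (6.0 and later) provide cuGetErrorName/cuGetErrorString in the driver API, but on the 3.x toolkit used here you have to map the CUresult codes yourself. A minimal sketch of such a helper (decodeCuError is my own name, not an SDK function; error 999 is CUDA_ERROR_UNKNOWN):

/* Map a few common CUresult codes to readable names (CUDA 3.x driver API). */
static const char *decodeCuError(CUresult res)
{
    switch (res) {
    case CUDA_SUCCESS:               return "CUDA_SUCCESS";
    case CUDA_ERROR_INVALID_VALUE:   return "CUDA_ERROR_INVALID_VALUE";
    case CUDA_ERROR_OUT_OF_MEMORY:   return "CUDA_ERROR_OUT_OF_MEMORY";
    case CUDA_ERROR_NOT_INITIALIZED: return "CUDA_ERROR_NOT_INITIALIZED";
    case CUDA_ERROR_INVALID_DEVICE:  return "CUDA_ERROR_INVALID_DEVICE";
    case CUDA_ERROR_UNKNOWN:         return "CUDA_ERROR_UNKNOWN"; /* 999 */
    default:                         return "unrecognized CUresult";
    }
}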

I have modified one of the SDK samples to show the problem. The source is attached.

When run on my development system, which has 2 GTX 480s and 1 C2050, it reports this:

CUDA Device Query (Driver API) statically linked version
There are 3 devices supporting CUDA

Device 0: “GeForce GTX 480”
CUDA Driver Version: 3.10
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 1576468480 bytes
created first context 0000000002241C10
created second context 0000000002999250

Device 1: “GeForce GTX 480”
CUDA Driver Version: 3.10
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 1576599552 bytes
created first context 0000000002241C10
created second context 000000000293C2E0

Device 2: “Tesla C2050”
CUDA Driver Version: 3.10
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 3181969408 bytes
created first context 0000000002241C10
FAILURE: could not create 2nd context (error=999)

FAILURES

You can see it works for the GTX 480s but not the Tesla.
The same result is obtained on a system with a Quadro FX5800 and a Tesla C2050 (the FX5800 works, the Tesla does not).

This was tested with both the 258.96 and 259.03 server drivers.

I will also file a bug report.
-Derek Ney

I am not sure the attachment facility is working, so here is the code (it is not long):

/*
 * Copyright 1993-2010 NVIDIA Corporation. All rights reserved.
 *
 * NVIDIA Corporation and its licensors retain all intellectual property and
 * proprietary rights in and to this software and related documentation.
 * Any use, reproduction, disclosure, or distribution of this software
 * and related documentation without an express license agreement from
 * NVIDIA Corporation is strictly prohibited.
 *
 * Please refer to the applicable NVIDIA end user license agreement (EULA)
 * associated with this source code for terms and conditions that govern
 * your use of this NVIDIA software.
 */

/* This sample queries the properties of the CUDA devices present in the system. */

// includes, system
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#include <cuda.h>

#include <cutil.h>

// utilities and system includes
#include <shrUtils.h>

////////////////////////////////////////////////////////////////////////////////
// Program main
////////////////////////////////////////////////////////////////////////////////
int
main( int argc, char** argv)
{
    CUdevice dev;
    int major = 0, minor = 0;
    int deviceCount = 0;
    char deviceName[256];
    bool failed = false;
    int options = CU_CTX_SCHED_YIELD;

    // note your project will need to link with cuda.lib on Windows
    printf("CUDA Device Query (Driver API) statically linked version \n");

    CUresult err = cuInit(0);
    if (err != CUDA_SUCCESS) {
        printf("FAILURE: cuInit failed (error=%d)\n", err);
        return 1;
    }

    CU_SAFE_CALL_NO_SYNC(cuDeviceGetCount(&deviceCount));
    // This function call returns 0 if there are no CUDA capable devices.
    if (deviceCount == 0) {
        printf("There is no device supporting CUDA\n");
    }

    for (dev = 0; dev < deviceCount; ++dev) {
        CU_SAFE_CALL_NO_SYNC( cuDeviceComputeCapability(&major, &minor, dev) );

        if (dev == 0) {
            // This function call returns 9999 for both major & minor fields if no CUDA capable devices are present
            if (major == 9999 && minor == 9999)
                printf("There is no device supporting CUDA.\n");
            else if (deviceCount == 1)
                printf("There is 1 device supporting CUDA\n");
            else
                printf("There are %d devices supporting CUDA\n", deviceCount);
        }

        CU_SAFE_CALL_NO_SYNC( cuDeviceGetName(deviceName, 256, dev) );
        printf("\nDevice %d: \"%s\"\n", dev, deviceName);

        int driverVersion = 0;
        cuDriverGetVersion(&driverVersion);
        printf("  CUDA Driver Version:                           %d.%d\n", driverVersion/1000, driverVersion%100);
        printf("  CUDA Capability Major revision number:         %d\n", major);
        printf("  CUDA Capability Minor revision number:         %d\n", minor);

        unsigned int totalGlobalMem;
        CU_SAFE_CALL_NO_SYNC( cuDeviceTotalMem(&totalGlobalMem, dev) );
        printf("  Total amount of global memory:                 %u bytes\n", totalGlobalMem);

        CUresult res;
        CUcontext ctx;
        CUcontext ctx2;

        // Create the first context on this device.
        if ((res = cuCtxCreate(&ctx, options, dev)) != CUDA_SUCCESS)
        {
            printf("FAILURE: could not create initial context (error=%d)\n", res);
            failed = true;
            continue;
        }

        printf("  created first context %p\n", ctx);

        // Create a second context on the same device; this is what fails on the C2050.
        if ((res = cuCtxCreate(&ctx2, options, dev)) != CUDA_SUCCESS)
        {
            printf("FAILURE: could not create 2nd context (error=%d)\n", res);
            failed = true;
            continue;
        }

        printf("  created second context %p\n", ctx2);

        if ((res = cuCtxDetach(ctx2)) != CUDA_SUCCESS)
        {
            printf("FAILURE: could not detach from 2nd context (error=%d)\n", res);
            failed = true;
            continue;
        }

        if ((res = cuCtxDetach(ctx)) != CUDA_SUCCESS)
        {
            printf("FAILURE: could not detach from initial context (error=%d)\n", res);
            failed = true;
            continue;
        }
    }

    if (failed)
        printf("\nFAILURES\n");
    else
        printf("\nPASSED\n");

    CUT_EXIT(argc, argv);

    return 0;
}
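
For anyone who wants to build this: it compiles against the CUDA 3.1 SDK. The command line below is a sketch assuming the standard 64-bit Linux SDK layout (the $CUDA_SDK paths and the cutil library name are assumptions; on Windows, link with cuda.lib and the SDK's cutil library as noted in the source):

nvcc -I$CUDA_SDK/C/common/inc -I$CUDA_SDK/shared/inc deviceQueryDrv.cpp -o deviceQueryDrv -lcuda -L$CUDA_SDK/C/lib -lcutil_x86_64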

NVIDIA helped me with this problem. Here is the information they gave me:

It turns out there is a configuration setting, the “compute mode”, that is set restrictively by default on Tesla C2050 cards. It has three values:

0 = normal
1 = exclusive
2 = prohibited

It can be manipulated with the nvidia-smi program which is normally installed here: C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe.

nvidia-smi -L -a will list GPUs,
nvidia-smi -g (GPU ID) -s will show the GPU’s current compute mode,
nvidia-smi -g (GPU ID) -c (compute mode) will change the compute mode.

Indeed, the -s command showed that my C2050 was in mode 1. In mode 1 only one context can be created per card.
I am not sure why this is the default. I changed it to mode 0 with this command: nvidia-smi -g 2 -c 0. After that my test program worked.
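
For reference, the compute mode can also be queried programmatically through the driver API with cuDeviceGetAttribute and CU_DEVICE_ATTRIBUTE_COMPUTE_MODE, so a program can report why a second cuCtxCreate is going to fail instead of just getting error 999. A small sketch against the CUDA 3.x headers (later toolkits rename the exclusive-mode constant):

#include <cuda.h>
#include <stdio.h>

int main(void)
{
    int deviceCount = 0;
    CUdevice dev;

    cuInit(0);
    cuDeviceGetCount(&deviceCount);

    for (dev = 0; dev < deviceCount; ++dev) {
        int mode = 0;
        /* Returns CU_COMPUTEMODE_DEFAULT (0), CU_COMPUTEMODE_EXCLUSIVE (1),
           or CU_COMPUTEMODE_PROHIBITED (2); in exclusive mode only one
           context can exist on the device at a time. */
        cuDeviceGetAttribute(&mode, CU_DEVICE_ATTRIBUTE_COMPUTE_MODE, dev);
        printf("device %d compute mode: %d\n", (int)dev, mode);
    }
    return 0;
}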
