cudaFreeHost throwing first chance exception
cudaFreeHost acting strangely when called on different

I am using CUDA SDK 4.0 and have run into an issue that took me two days to whittle down to the following code.

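#include <cuda.h>
#include <cuda_runtime.h>

int main(int argc, char **argv) {
    int *test;
    cudaError_t err;

    // Allocate pinned host memory while device 1 is current...
    err = cudaSetDevice(1);
    err = cudaMallocHost(&test, 1024 * sizeof(int));

    // ...then free it while device 0 is current.
    err = cudaSetDevice(0);
    err = cudaFreeHost(test);  // this is the call that fails

    return 0;
}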

This throws the following error when calling cudaFreeHost:

First-chance exception at 0x000007fefd96aa7d in Test.exe: Microsoft C++ exception: cudaError_enum at memory location 0x0022f958..

The err value returned by cudaFreeHost is cudaErrorInvalidValue.

The same error occurs for this variation:

err = cudaSetDevice(0);
err = cudaMallocHost(&test, 1024 * sizeof(int));
err = cudaSetDevice(1);
err = cudaFreeHost(test);

The following variations do not throw the error:

err = cudaSetDevice(0);
err = cudaMallocHost(&test, 1024 * sizeof(int));
err = cudaSetDevice(0);
err = cudaFreeHost(test);

and

err = cudaSetDevice(1);
err = cudaMallocHost(&test, 1024 * sizeof(int));
err = cudaSetDevice(1);
err = cudaFreeHost(test);

I was under the impression you only needed to call cudaSetDevice if you want to allocate memory on a specific GPU. In the above example I am only allocating pinned memory on the CPU.

Is this a bug or did I miss something in the manual?

(I have posted the same question in StackOverflow here:

I haven’t tried this myself. However, I’d expect that you either need to free the memory in the same context (i.e., for the same device) in which it was allocated, or allocate the memory with the cudaHostAllocPortable flag set.
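For what it's worth, here is a minimal sketch of the second suggestion. It is untested on a setup like the one in the question. It assumes the portable flag is what makes the cross-device free legal, and it uses cudaHostAlloc because that call accepts allocation flags. The helper name allocThenFreeAcrossDevices is made up for illustration, and the program simply exits if fewer than two devices are present.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API): allocate pinned host
// memory while allocDevice is current, then free it while freeDevice is
// current. Returns the first error encountered, or the cudaFreeHost result.
cudaError_t allocThenFreeAcrossDevices(int allocDevice, int freeDevice) {
    int *test = NULL;

    cudaError_t err = cudaSetDevice(allocDevice);
    if (err != cudaSuccess) return err;

    // cudaHostAllocPortable requests pinned memory that is treated as
    // pinned by all CUDA contexts, not only the one that allocated it.
    err = cudaHostAlloc((void **)&test, 1024 * sizeof(int),
                        cudaHostAllocPortable);
    if (err != cudaSuccess) return err;

    err = cudaSetDevice(freeDevice);
    if (err != cudaSuccess) {
        cudaFreeHost(test);  // best-effort cleanup before reporting
        return err;
    }

    return cudaFreeHost(test);
}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess || deviceCount < 2) {
        std::printf("This sketch needs at least two CUDA devices.\n");
        return 0;
    }

    // Same order as the failing case in the question: allocate with
    // device 1 current, free with device 0 current.
    err = allocThenFreeAcrossDevices(1, 0);
    std::printf("cudaFreeHost: %s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

The first suggestion needs no new API at all: call cudaSetDevice with the allocating device's index again right before cudaFreeHost, which is what the two working variations in the question already do.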