cudaHostAlloc --> invalid argument it works with fermi, not with 9800gtx

neoideo · February 3, 2011, 3:00pm

Hello,

I'm having a small but critical problem with a program I wrote.

It uses zero copy, and it works on Fermi.

However, when I plugged in the 9800 GTX, I got this error:

src/myCudaCalls.cu(302) : cudaSafeCall() Runtime API error : invalid argument.

and the line is this one:

cutilSafeCall(cudaSetDeviceFlags(cudaDeviceMapHost));

cutilSafeCall(cudaHostAlloc((void **)&h_listo, sizeof(int), cudaHostAllocMapped));    // <-- this is line 302

cutilSafeCall(cudaHostGetDevicePointer((void **)&d_listo, (void *)h_listo, 0));

What could it be?

I'm compiling with -arch sm_11, and from what I know this GPU supports zero copy.

Also, the SDK example simpleZeroCopy uses exactly the same instructions, but it works, even when I manually compiled it with the same options.

avidday · February 3, 2011, 3:54pm

The 9800 GTX doesn’t support zero copy. The only compute 1.1 devices that support zero copy are the MCP79 family of integrated GPUs (9300M, 9400M, Ion).

neoideo · February 3, 2011, 4:44pm

Thanks for the information. For a moment I thought that too, but the fact that the SDK example “simpleZeroCopy” worked with the same lines of code made me rethink it. At the very least I have a contradiction in my situation, and I really don't have an answer to it.

Do you know what the trick behind it could be?

avidday · February 13, 2011, 8:56am

In this post you posted the output of deviceQuery for your card. Look at the output line “Support host page-locked memory mapping”. It says No. This corresponds to the device property canMapHostMemory. Quoting from the pinned memory API documentation (§3.1):

and further on, in the FAQ section of the same document, it says (§4.1):

Like I said, your card doesn’t support zero copy memory.
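
For anyone who wants to check this from code rather than from the deviceQuery output, here is a minimal sketch that prints the canMapHostMemory property for every device. It is only an illustration: it relies on the standard runtime calls cudaGetDeviceCount and cudaGetDeviceProperties, and the output format is my own, not what deviceQuery prints.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d): %s\n",
                         dev, cudaGetErrorString(err));
            return 1;
        }
        // canMapHostMemory is the property behind the deviceQuery line
        // "Support host page-locked memory mapping".
        std::printf("Device %d (%s, compute %d.%d): canMapHostMemory = %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    prop.canMapHostMemory ? "yes" : "no");
    }
    return 0;
}
```

If a device reports "no" here, a cudaHostAlloc call with cudaHostAllocMapped should not be expected to succeed on it.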

apostglen46 · November 11, 2011, 11:15am

I am having more or less the same problem here, but the device properties output the following:
Device 0: “GeForce 8400M GS”
CUDA Driver Version: 4.10
CUDA Runtime Version: 4.10
CUDA Capability Major/Minor version number: 1.1
Total amount of global memory: 133496832 bytes
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 0.80 GHz
Concurrent copy and execution: Yes
# of Asynchronous Copy Engines: 1
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Concurrent kernel execution: No
Device has ECC support enabled: No
Device is using TCC driver mode: No

The line that is causing the problem is the following:
CUDA_SAFE_CALL(cudaHostAlloc((void**)&(h_pGMMRet->h_pinnedIn), h_pGMMRet->nInputImgSize, cudaHostAllocMapped));

Any help?
Do you see anything wrong with the code?

Apostolis

apostglen46 · November 11, 2011, 12:07pm

I found what went wrong.
For some reason I had to call:
cutilSafeCall(cudaSetDeviceFlags(cudaDeviceMapHost));
before setting the device.

That fixed my problem.
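
A minimal sketch of the whole sequence, following the ordering reported above (flags first, then device selection, then the mapped allocation). The CHECK macro, the writeFlag kernel, the variable names, and the choice of device 0 are all made up for illustration and are not from the code posted in this thread; cudaDeviceSynchronize assumes CUDA 4.0 or later. Whether the flag really has to come before cudaSetDevice may depend on the CUDA version, so treat the ordering as what worked here and not as a documented requirement.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(e_));                     \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Hypothetical kernel: writes straight into the mapped host buffer.
__global__ void writeFlag(int *flag) { *flag = 42; }

int main() {
    const int dev = 0;  // assumption: use the first device

    // Bail out early if the device cannot map host memory at all.
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, dev));
    if (!prop.canMapHostMemory) {
        std::fprintf(stderr, "Device %d cannot map host memory\n", dev);
        return 1;
    }

    // Request host-memory mapping before selecting the device,
    // which is the ordering that fixed the problem in this thread.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    CHECK(cudaSetDevice(dev));

    // Allocate pinned, mapped host memory and get its device-side alias.
    int *h_flag = NULL;
    int *d_flag = NULL;
    CHECK(cudaHostAlloc((void **)&h_flag, sizeof(int), cudaHostAllocMapped));
    *h_flag = 0;
    CHECK(cudaHostGetDevicePointer((void **)&d_flag, (void *)h_flag, 0));

    writeFlag<<<1, 1>>>(d_flag);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());  // make the kernel's write visible to the host

    std::printf("flag = %d\n", *h_flag);

    CHECK(cudaFreeHost(h_flag));
    return 0;
}
```

The up-front canMapHostMemory check gives a clear message on cards like the 9800 GTX, where the allocation would otherwise fail with the less helpful "invalid argument".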
