CUDA 4.1 - possible bug?

Hi,
I asked for help on Stack Overflow: CUDA - Copy device data to host?

But we discovered that the code works with CUDA 4.0 and does not work with 4.1. Or is there some other issue?

Thanks for the help.

Please define “doesn’t work”. Does the app crash (segfault, unspecified launch failure), run to completion but deliver entirely incorrect results, or deliver numerical results that differ slightly from those under CUDA 4.0?

Does the code check the status returned by every CUDA API call? In particular, does CUDA initialize successfully? For example, a CUDA 4.1 application might fail due to an out-of-date driver. The CUDA 4.1 download page at http://developer.nvidia.com/cuda-toolkit-41 has links for the download of matching drivers.
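To illustrate what checking the status of every call can look like, here is a minimal sketch. The helper name checkCuda, the buffer size, and the sequence of calls are my own assumptions for illustration and are not taken from the code in question; only the CUDA runtime calls themselves are real API functions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original code): reports a failed call
// and returns false so the caller can decide how to bail out.
static bool checkCuda(cudaError_t status, const char *what) {
    if (status != cudaSuccess) {
        fprintf(stderr, "%s failed: %s (error %d)\n",
                what, cudaGetErrorString(status), (int)status);
        return false;
    }
    return true;
}

int main() {
    const size_t bytes = 256 * sizeof(float);  // arbitrary size for the sketch
    float host[256];
    float *dev = NULL;

    if (!checkCuda(cudaMalloc((void **)&dev, bytes), "cudaMalloc")) return 1;
    if (!checkCuda(cudaMemset(dev, 0, bytes), "cudaMemset")) return 1;
    // Destination comes first, then source; the direction flag must match both.
    if (!checkCuda(cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost),
                   "cudaMemcpy")) return 1;
    if (!checkCuda(cudaFree(dev), "cudaFree")) return 1;

    printf("all CUDA calls succeeded\n");
    return 0;
}
```

With this pattern the first failing call is named together with its error string, which makes it easier to tell an initialization problem from a bad argument to a later call.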

If after double checking for driver mismatch issues, you find that the app works flawlessly with CUDA 4.0, but does not work properly with CUDA 4.1, I would suggest filing a bug.

I described the problem on the Stack Overflow page, in the discussion under the answer(s).

I am on Windows 7 x64, using CUDA 4.1 and the latest developer drivers (from the CUDA 4.1 download page, your link).

The main problem is that cudaMemcpy() returns error 11 (which is cudaErrorInvalidValue); all other calls succeed. But one user who is on CUDA 4.0 has no problem: the code works fine for him.

I am not sure that this is a bug, so could someone try the code under 4.1 and let me know whether it works? Thanks.

Here is something you can try. Place a cudaFree(0) before any other CUDA API calls in your code. Check its return status. If it is not cudaSuccess, this is a good indication that CUDA failed to initialize. The most common reason for that is a mismatch between the CUDA runtime and the CUDA driver, in particular using an older driver with a more recent CUDA runtime. That is why I suggested double checking your software installation. Did you encounter any issues when you installed the new CUDA drivers that may indicate that the installation was incomplete? Sorry for the scant advice, driver issues are not an area where I have much experience.
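A rough sketch of that check follows. It also prints the versions reported by cudaDriverGetVersion() and cudaRuntimeGetVersion(), which is an addition of mine to make a driver/runtime mismatch visible. The helper name driverTooOld is hypothetical, and treating "driver version lower than runtime version" as the mismatch condition is an assumption of this sketch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true when the driver reports support for an older
// CUDA version than the runtime the application was built against.
static bool driverTooOld(int driverVersion, int runtimeVersion) {
    return driverVersion < runtimeVersion;
}

int main() {
    // cudaFree(0) does no real work but forces the runtime to initialize,
    // so an initialization failure shows up here and not in a later call.
    cudaError_t status = cudaFree(0);

    // Versions are encoded as 1000 * major + 10 * minor (4010 means 4.1).
    // A driver version of 0 means no CUDA driver was found.
    int driverVersion = 0;
    int runtimeVersion = 0;
    cudaDriverGetVersion(&driverVersion);
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("driver supports CUDA version code %d, runtime is %d\n",
           driverVersion, runtimeVersion);

    if (status != cudaSuccess) {
        fprintf(stderr, "CUDA failed to initialize: %s\n",
                cudaGetErrorString(status));
        return 1;
    }
    if (driverTooOld(driverVersion, runtimeVersion)) {
        fprintf(stderr, "driver is older than the runtime, update the driver\n");
        return 1;
    }
    printf("CUDA initialized successfully\n");
    return 0;
}
```

If this reports success and matching versions, initialization and the driver can most likely be ruled out, and the arguments passed to the failing call become the next thing to inspect.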

I added cudaFree(0) as the first line and it returns success, which is right.

Still, I will uninstall the driver and try installing it again. When I installed the latest driver I just installed it over the old one, so maybe that is the problem.

I uninstalled the old driver and installed the new one, and I am getting the same error, so I really don't know where the problem could be :{

Can anyone confirm that my code doesn't work under CUDA 4.1?