CUDA 3.2.16 does not work on GTX 460 with xorg-1.9.2: possible CUDA driver bug

Dear CUDA driver developers and other experts,
Could you please fix the bug described below?

When I tried to run the examples from the final release of the CUDA 3.2.16 SDK,
some of them failed with “kernel launch failed” errors or produced wrong results.

List of errors:

1. alignedTypes: all tests gave “TEST FAILED” (alignedTypes.txt, 1.34 KB).
2. BlackScholes: “unspecified launch failure” (BlackScholes.txt, 407 Bytes).
3. EstimatePiInlineQ: “unspecified launch failure” (EstimatePiInlineQ.txt, 389 Bytes).
4. FDTD3d: “Data error at point (204,0,0) 18.547146 instead of 18.555979” (FDTD3d.txt, 1.13 KB).
5. MersenneTwister: results differ from CPU, “FAILED” (MersenneTwister.txt, 638 Bytes).
6. boxFilter: constant rectangular graphical artifacts that suggest video memory corruption.
7. conjugateGradient: failed to converge after 10000 iterations.
8. convolutionFFT2D: results differ from CPU, “FAILED” (convolutionFFT2D.txt, 1.45 KB).
9. convolutionSeparable: results differ from CPU, “FAILED” (convolutionSeparable.txt, 635 Bytes).
10. convolutionTexture: results differ from CPU, “FAILED” (convolutionTexture.txt, 593 Bytes).
11. fastWalshTransform: “L2 norm: NAN” (fastWalshTransform.txt, 397 Bytes).
12. marchingCubes: graphical artifacts.
13. mergeSort: “unspecified launch failure” (mergeSort.txt, 223 Bytes).
14. radixSort: “Incorrectly sorted value[694] (115713): 2789216 != 692064” (radixSort.txt, 401 Bytes).
15. reduction: results differ from CPU, “FAILED” (reduction.txt, 392 Bytes).
16. scalarProd: results differ from CPU, “FAILED” (scalarProd.txt, 398 Bytes).
17. scan: results differ from CPU, “FAILED” (scan.txt, 3.61 KB).
18. simplePitchLinearTexture: “FAILED” (simplePitchLinearTexture.txt, 233 Bytes).
19. simpleStreams: “FAILED” (simpleStreams.txt, 388 Bytes).
20. sortingNetworks: “corrupted keys” (sortingNetworks.txt, 4.39 KB).
21. transpose: “kernel FAILED” (transpose.txt, 3.05 KB).
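
For anyone who wants to reproduce the list above, here is a minimal sketch of a script that runs a set of sample binaries and flags the ones that fail. It is not part of the SDK: the function names `classify` and `run_samples`, the directory argument, and the set of failure markers are my own assumptions, with the markers taken from the messages quoted above.

```python
import subprocess
from pathlib import Path

# Failure markers copied from the messages quoted in the list above.
# This set is an assumption and is certainly not exhaustive.
FAILURE_MARKERS = (
    "FAILED",
    "unspecified launch failure",
    "Data error",
    "NAN",
    "Incorrectly sorted",
    "corrupted keys",
    "failed to converge",
)


def classify(output, returncode):
    """Return the first failure marker found in output, or None if the run looks clean."""
    for marker in FAILURE_MARKERS:
        if marker in output:  # case-sensitive on purpose, to avoid matching e.g. "nanoseconds"
            return marker
    if returncode != 0:
        return "exit code %d" % returncode
    return None


def run_samples(bin_dir, names, timeout=300):
    """Run each named binary in bin_dir and map its name to a failure reason (or None)."""
    results = {}
    for name in names:
        exe = Path(bin_dir) / name
        try:
            proc = subprocess.run(
                [str(exe)],
                stdin=subprocess.DEVNULL,   # do not hang on "press a key" prompts
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,   # error messages may go to stderr
                universal_newlines=True,
                timeout=timeout,
            )
            results[name] = classify(proc.stdout, proc.returncode)
        except (OSError, subprocess.TimeoutExpired) as exc:
            results[name] = "could not run: %s" % exc
    return results
```

Calling `run_samples(sdk_bin_dir, ["BlackScholes", "mergeSort"])`, where `sdk_bin_dir` is wherever the compiled samples live on your system, returns a dictionary mapping each sample to a failure reason or `None`. Purely graphical failures such as the boxFilter and marchingCubes artifacts cannot be detected this way and still have to be checked by eye.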

Hardware and software used:
CPU: Intel Core 2 Duo E7500
GPU: GeForce GTX 460 (deviceQuery.txt, 1.74 KB; deviceQueryDrv.txt, 1.35 KB)
CUDA toolkit: 3.2.16 final
GPU driver: 260.19.29
Kernel: 2.6.34
X.org: 1.9.2
GCC: 4.4.4

When I replaced the GPU with a GeForce 9600 GT (compute capability 1.1),
the errors disappeared, so the bug seems to be related to Fermi-generation GPUs.

When I downgraded to X.org 1.7.7, the errors also went away.
So this is evidently a software bug, either in the CUDA driver itself
or in the driver's interaction with the X.org 1.9.x branch.

Sticking with X.org < 1.9.x is not a problem on enterprise Linux flavours
such as RHEL or Ubuntu LTS, but more and more distributions are switching to
the X.org 1.9.x branch, so this bug will have to be fixed sooner or later.

Could you please reproduce and fix this error in the next Linux
driver release?