cudaMalloc() fails when the kernel code merely contains a printf()

Hi friends,

My environment is:
WIN XP64
VC2008

More details are below.
I wrote a kernel with a printf() in it.
When I cudaMalloc() device memory,
I get return code 10200,
0x0000000180034350 “unspecified driver error”.
At that point neither this kernel nor any other kernel
has been called from the host code.
If I remove the printf() from the
kernel, all is fine,
but I want to use printf()
in the kernel eventually.
Thank you in advance.

Device 0: “GeForce GTX 460”
CUDA Driver Version: 3.20
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 1
Total amount of global memory: 2147024896 bytes
Number of multiprocessors: 7
Number of cores: 224
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.40 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Device has ECC support enabled: No
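
For anyone trying to reproduce this, here is a minimal sketch of checking the cudaMalloc() return code on the host. It is not the original code: the buffer name `d_buf` and its size are assumptions made up for illustration. The first runtime API call is typically where the CUDA context gets initialized, so a driver-level problem can show up at cudaMalloc() even though no kernel has been launched yet.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // Hypothetical buffer and size, chosen only for illustration.
    float *d_buf = NULL;
    const size_t bytes = 1024 * sizeof(float);

    // cudaMalloc() returns a cudaError_t; anything other than
    // cudaSuccess means the allocation (or context setup) failed.
    cudaError_t err = cudaMalloc((void **)&d_buf, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: code %d (%s)\n",
                (int)err, cudaGetErrorString(err));
        return 1;
    }

    printf("cudaMalloc succeeded\n");
    cudaFree(d_buf);
    return 0;
}
```

Printing both the numeric code and the cudaGetErrorString() text makes it easier to compare results between driver versions.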

I solved the problem.

It was a driver problem.

I had 260.89 installed. I installed the older 260.61 over it, confirming the ‘outdated/overwrite?’ message boxes. Then I installed 260.89 again, and now it works; even redirecting the output to files from the DOS command line works.

Great.

Just reinstalling 260.89 did not help.

Surprised that the downgrade and re-install made it work again! Oops…

I never knew that printf() could be used in a kernel.

Maybe cuPrintf() is just what you want.
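
On the printf() question: device-side printf() is supported on devices of compute capability 2.x and higher, which includes the GTX 460 (2.1) in this thread. Below is a minimal sketch of what that looks like. The kernel name `helloKernel` and the launch configuration are assumptions for illustration, not the original poster's code. As far as I know, cuPrintf() from the SDK samples is mainly of interest for older devices that lack the built-in support.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread prints its block and thread index.
__global__ void helloKernel()
{
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main()
{
    // Launch 2 blocks of 4 threads each (arbitrary small configuration).
    helloKernel<<<2, 4>>>();

    // Device printf output is buffered and only flushed at certain points,
    // such as an explicit synchronization. On toolkits older than CUDA 4.0
    // the equivalent call was cudaThreadSynchronize().
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The code has to be compiled for an architecture that supports it; with the toolkits of that era this meant something like `nvcc -arch=sm_20`. Without the synchronization call the program may exit before any output appears.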