didu31
July 14, 2011, 10:30am
1
It’s all in the title.
I don’t know why, but this code
__global__ void produitMatricielKernel(struct cudaPitchedPtr A, struct cudaPitchedPtr B, struct cudaPitchedPtr C)
{
const int& i = threadIdx.x;
const int& j = threadIdx.y;
const int& k = threadIdx.z;
#ifdef __CUDA_ARCH__
#if (__CUDA_ARCH__ >= 200)
printf("threadIdx.x = %d\nthreadIdx.y = %d\nthreadIdx.z %d\n", threadIdx.x, threadIdx.y, threadIdx.z);
#else
#error "CUDA compute capability < 2.0"
#endif
#error "__CUDA_ARCH__ undefined"
#endif
generates:
src\produitMatrice.cu|17|fatal error C1189: #error : "__CUDA_ARCH__ undefined"|
Has anyone seen the same behaviour?
Thanks a lot in advance.
tera
July 14, 2011, 10:36am
2
The code is missing an #else. You probably intended to write
#ifdef __CUDA_ARCH__
...
#else
#error "__CUDA_ARCH__ undefined"
#endif
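One caveat, as far as I understand nvcc: a .cu file is also compiled once for the host, and __CUDA_ARCH__ is not defined during that pass, so an unconditional #error in the #else branch can still fire there. A minimal sketch that only rejects device passes below compute capability 2.0 might look like this (the kernel name printThreadIdx and the 2x2x1 launch are made up for illustration, not taken from the original code):

```cuda
#include <stdio.h>

// Hypothetical kernel, for illustration only.
__global__ void printThreadIdx()
{
    // __CUDA_ARCH__ is defined only in device compilation passes, so this
    // check is simply skipped when the file is compiled for the host.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#error "CUDA compute capability < 2.0"
#endif
    printf("threadIdx = (%u, %u, %u)\n", threadIdx.x, threadIdx.y, threadIdx.z);
}

int main()
{
    // One block of 2 x 2 x 1 threads (arbitrary sizes for the demo).
    printThreadIdx<<<1, dim3(2, 2, 1)>>>();
    // Wait for the kernel to finish before the program exits.
    cudaDeviceSynchronize();
    return 0;
}
```

Folding both conditions into a single #if keeps the guard in one place and avoids needing an #else branch at all.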
didu31
July 14, 2011, 10:54am
3
Well spotted, thank you. That was silly of me; I should have read my code more carefully.
However, I now have a new problem:
Error: External calls are not supported
reported when building with the MS C++ compiler (VS Express 2010).
From what I’ve read, printf would have to be inlined, i.e. defined in the same compilation unit, and I have no control over that.
I’m still looking into it.
tera
July 14, 2011, 11:01am
4
You need to #include <stdio.h> at the top of the code, just like on the CPU. Since CUDA doesn’t usually require any includes, it’s easy to forget that one.
didu31
July 14, 2011, 11:16am
5
I’ve solved this problem.
Now I don’t know why the output of printf in my kernel is not printed.
I will post again if I give up.
didu31
July 14, 2011, 11:29am
6
std::cout<<"Call to kernel produitMatricielKernel<<<1,dim3(m,n,p)>>>(A_on_GPU, B_on_GPU, C_on_GPU)"<<std::endl;
produitMatricielKernel<<<1,dim3(m,n,p)>>>(A_on_GPU, B_on_GPU, C_on_GPU);
std::cout<<"Return from kernel produitMatricielKernel<<<1,dim3(m,n,p)>>>(A_on_GPU, B_on_GPU, C_on_GPU)"<<std::endl;
Outputs:
Call to kernel produitMatricielKernel<<<1,dim3(m,n,p)>>>(A_on_GPU, B_on_GPU, C_on_GPU)
Return from kernel produitMatricielKernel<<<1,dim3(m,n,p)>>>(A_on_GPU, B_on_GPU, C_on_GPU)
as expected.
__global__ void produitMatricielKernel(struct cudaPitchedPtr A, struct cudaPitchedPtr B, struct cudaPitchedPtr C)
{
const int& i = threadIdx.x;
const int& j = threadIdx.y;
const int& k = threadIdx.z;
printf("threadIdx.x = %d\nthreadIdx.y = %d\nthreadIdx.z %d\n", threadIdx.x, threadIdx.y, threadIdx.z);
...
outputs nothing.
I added #include <stdio.h> at the top of the .cu file as you suggested, but it made no difference.
tera
July 14, 2011, 11:37am
7
Output from kernels is only printed when one of the actions listed in appendix B.14.2 of the Programming Guide is performed:
[*]Kernel launch via <<<>>> or cuLaunchKernel() (at the start of the launch, and if the CUDA_LAUNCH_BLOCKING environment variable is set to 1, at the end of the launch as well),
[*]Synchronization via cudaDeviceSynchronize(), cuCtxSynchronize(), cudaStreamSynchronize(), cuStreamSynchronize(), cudaEventSynchronize(), or cuEventSynchronize(),
[*]Memory copies via any blocking version of cudaMemcpy*() or cuMemcpy*(),
[*]Module loading/unloading via cuModuleLoad() or cuModuleUnload(),
[*]Context destruction via cudaDeviceReset() or cuCtxDestroy().
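As a rough sketch of the memory-copy case from this list: the blocking cudaMemcpy() below should act as the flush point, without any explicit synchronization call. The kernel name fillAndPrint and the buffer size are invented for the example.

```cuda
#include <stdio.h>

// Hypothetical kernel: each thread stores its squared index and reports it.
__global__ void fillAndPrint(int* out)
{
    int i = threadIdx.x;
    out[i] = i * i;
    printf("thread %d wrote %d\n", i, out[i]);
}

int main()
{
    const int n = 4;  // arbitrary size for the demo
    int host[n];
    int* dev = NULL;

    cudaMalloc((void**)&dev, n * sizeof(int));
    fillAndPrint<<<1, n>>>(dev);

    // Blocking device-to-host copy: it waits for the kernel to finish and is
    // one of the points at which the device printf buffer is flushed.
    cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);

    printf("host received: %d %d %d %d\n", host[0], host[1], host[2], host[3]);

    cudaFree(dev);
    return 0;
}
```

If a program launches a kernel and then exits without reaching any of the listed actions, the device output may never appear.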
didu31
July 14, 2011, 11:43am
8
Yes, I saw that and added cudaDeviceSynchronize() after the kernel call.
The result seems to be the same.
didu31
July 14, 2011, 11:47am
9
I’ve commented out the rest of the kernel code, and now the kernel’s output is printed.
I knew I had a bug. It seems the output depends on the kernel completing correctly.
I don’t know whether exceptions exist in CUDA. It seems not, but the kernel appears to abort and the output buffer is never flushed.
I learn by making mistakes.
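For what it’s worth, a kernel that aborts is normally reported to the host through error codes rather than exceptions, so checking them can reveal this kind of silent failure. Below is a minimal sketch; faultyKernel and its deliberate write through a NULL pointer are made up to provoke a failure and are not the original kernel.

```cuda
#include <stdio.h>

// Hypothetical kernel that fails on purpose: p is NULL in this demo,
// so the store below is an invalid memory access.
__global__ void faultyKernel(int* p)
{
    printf("thread %u running\n", threadIdx.x);
    p[threadIdx.x] = 1;
}

int main()
{
    int* bad = NULL;  // deliberately invalid device pointer
    faultyKernel<<<1, 4>>>(bad);

    // Errors detected at launch time (e.g. an invalid configuration).
    cudaError_t launchErr = cudaGetLastError();
    if (launchErr != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(launchErr));
        return 1;
    }

    // Errors raised while the kernel runs are reported by the next
    // synchronizing call, here cudaDeviceSynchronize().
    cudaError_t syncErr = cudaDeviceSynchronize();
    if (syncErr != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(syncErr));
        return 1;
    }

    return 0;
}
```

Checking the return value of cudaDeviceSynchronize() right after the launch in the real program should show whether the kernel ran to completion.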
didu31
July 14, 2011, 12:42pm
10