Hi,
I recently migrated my development environment. Before the migration, everything worked fine. Since the migration, the executables and DLLs I compile no longer produce the expected results. Even a simple vector addition example returns zeros. My old and new development environments are listed below.
Old env: Win 7, x64, CUDA 5.5 with VS2008 with Nsight 3.3 plugin, compilation setting for 32-bit
New env: Win 7, x64, CUDA 6.5 with VS2012 with Nsight 4.2 plugin, compilation setting for 32-bit
My GPU is an NVIDIA GeForce 330M with the latest device drivers that came with CUDA v6.5.
The CUDA code that I am trying to compile into an executable is as follows.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

// CUDA kernel - add two vectors element by element
// pass result back in the first vector
__global__ void vecAdd(float *a, float *b, int n)
{
    int tid = blockIdx.x*blockDim.x+threadIdx.x;
    if (tid < n)
        a[tid] = a[tid] + b[tid];
}

// simple test with only 4 elements
int main( int argc, char* argv[] )
{
    // host data
    float *h_a;
    float *h_b;
    // device data
    float *d_a;
    float *d_b;
    size_t bytes = 4*sizeof(float);

    // allocate memory on host
    h_a = (float*)malloc(bytes);
    h_b = (float*)malloc(bytes);

    // allocate memory on gpu-device
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);

    // initialize host data to constants
    for(int i = 0; i < 4; i++) {
        h_a[i] = 0.1f;
        h_b[i] = 0.2f;
    }

    // copy host data to gpu-device
    cudaMemcpy( d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy( d_b, h_b, bytes, cudaMemcpyHostToDevice);

    printf("Before CUDA...\n");
    for(int i=0; i<4; i++)
        printf("a[%d]=%f\n",i,h_a[i]);

    // Execute the kernel
    vecAdd<<<1, 4>>>(d_a, d_b, 4);

    // Copy array back to host
    cudaMemcpy( h_a, d_a, bytes, cudaMemcpyDeviceToHost );

    printf("After CUDA...\n");
    for(int i=0; i<4; i++)
        printf("a[%d]=%f\n",i,h_a[i]);

    // Release device memory
    cudaFree(d_a);
    cudaFree(d_b);

    // Release host memory
    free(h_a);
    free(h_b);
    return 0;
}
I compile it from the command prompt with the following command.
nvcc -O3 -o mycudatest.exe mycudatest.cu
With the previous version of CUDA (5.5), this program printed the correct results. Now all I get is zeros.
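Since the program above never checks the return value of any CUDA call, a failure in allocation, copy, or kernel launch would pass silently. A minimal sketch of the same test with every call checked might help narrow down where things go wrong. The checkCuda helper is a hypothetical name introduced only for this sketch, and the code assumes device 0 is the GPU in question. It also prints the compute capability that the runtime reports, in case a mismatch between the compiled architecture and the GPU is involved.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the original program):
// prints a message and returns false if a CUDA call failed.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Same kernel as in the original program.
__global__ void vecAdd(float *a, float *b, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n)
        a[tid] = a[tid] + b[tid];
}

int main(void)
{
    const int n = 4;
    const size_t bytes = n * sizeof(float);
    float h_a[4], h_b[4];
    float *d_a = NULL, *d_b = NULL;

    for (int i = 0; i < n; i++) {
        h_a[i] = 0.1f;
        h_b[i] = 0.2f;
    }

    // Report the device and compute capability the runtime sees
    // (assumes device 0 is the GPU under test).
    cudaDeviceProp prop;
    if (checkCuda(cudaGetDeviceProperties(&prop, 0), "cudaGetDeviceProperties"))
        printf("Device 0: %s, compute capability %d.%d\n",
               prop.name, prop.major, prop.minor);

    // Early returns below skip cleanup; acceptable for a short diagnostic.
    if (!checkCuda(cudaMalloc((void**)&d_a, bytes), "cudaMalloc d_a")) return 1;
    if (!checkCuda(cudaMalloc((void**)&d_b, bytes), "cudaMalloc d_b")) return 1;
    if (!checkCuda(cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice),
                   "cudaMemcpy h_a -> d_a")) return 1;
    if (!checkCuda(cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice),
                   "cudaMemcpy h_b -> d_b")) return 1;

    vecAdd<<<1, n>>>(d_a, d_b, n);

    // cudaGetLastError reports errors from the launch itself;
    // cudaDeviceSynchronize reports errors raised while the kernel ran.
    if (!checkCuda(cudaGetLastError(), "vecAdd launch")) return 1;
    if (!checkCuda(cudaDeviceSynchronize(), "vecAdd execution")) return 1;

    if (!checkCuda(cudaMemcpy(h_a, d_a, bytes, cudaMemcpyDeviceToHost),
                   "cudaMemcpy d_a -> h_a")) return 1;

    for (int i = 0; i < n; i++)
        printf("a[%d]=%f\n", i, h_a[i]);

    cudaFree(d_a);
    cudaFree(d_b);
    return 0;
}
```

If any step fails, the message printed to stderr names the call and gives the runtime's error string, which should be more informative than the silent zeros.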
Any advice or suggestions are greatly appreciated.
Warm regards,
Sam V