This may not be quite the right place for this problem; if not, please let me know so I can post to the right subforum.
I’m trying to develop a framework for maximum-likelihood fits, in which the event probability evaluations are done on a GPU. I have some code that works on two separate C2050 boxes; I’ve also had a chance to test it on the new Kepler, and it worked there as well.
However, when I run the same code on a laptop with a 650M, it crashes, both on my own laptop running Ubuntu 12.04 and on my colleague’s MacBook. Does anyone know of a difference between the laptop and desktop cards that might account for this?
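To compare the setups side by side, something like the following sketch could be run on each machine. It is not part of my framework. It only uses the standard runtime calls cudaRuntimeGetVersion, cudaDriverGetVersion and cudaGetDeviceProperties to print which toolkit, driver and card are actually in use.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Prints the CUDA runtime and driver versions and the name and compute
// capability of every visible device, so two machines can be compared.
int main() {
  int runtimeVersion = 0;
  int driverVersion = 0;
  cudaRuntimeGetVersion(&runtimeVersion);
  cudaDriverGetVersion(&driverVersion);
  printf("Runtime version %d, driver version %d\n", runtimeVersion, driverVersion);

  int deviceCount = 0;
  cudaError_t err = cudaGetDeviceCount(&deviceCount);
  if (err != cudaSuccess) {
    fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  for (int i = 0; i < deviceCount; ++i) {
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, i);
    if (err != cudaSuccess) {
      fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
      return 1;
    }
    printf("Device %d: %s, compute capability %d.%d\n", i, prop.name, prop.major, prop.minor);
  }
  return 0;
}
```

If the working and failing machines report different runtime versions, that would point at the toolkit rather than the card.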
I’ve had a look with cuda-gdb, but I must confess I don’t find the information very enlightening:
Error code 13 (invalid device symbol) at /home/rolfa/release_09Nov2012/FPOINTER/ThrustPdfFunctor.cu, 663
========= Program hit error 13 on CUDA API call to cudaMemcpyFromSymbol
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib/nvidia-current/libcuda.so [0x24e199]
========= Host Frame:/usr/local/cuda/lib/libcudart.so.5.0 (cudaMemcpyFromSymbol + 0x31a) [0x3a0ca]
The code which causes this invalid symbol error looks like so:
metricIndex = num_device_functions;
void* dummy[1];
std::cout << "Copying to " << localPtr << std::endl;
cutilSafeCall(cudaMemcpyFromSymbol(dummy, localPtr.c_str(), sizeof(void*))); // Line 663
host_function_table[num_device_functions] = dummy[0];
functionNameToDeviceIndexMap[localPtr] = num_device_functions;
num_device_functions++;
cutilSafeCall(cudaMemcpyToSymbol(device_function_table, host_function_table, num_device_functions*sizeof(void*)));
where ‘localPtr’ is a std::string holding the name of the symbol to be copied. In this case its value is “ptr_to_NLL”, and that variable is declared earlier in the same file:
typedef fptype (*device_metric_ptr) (fptype, fptype*, unsigned int);
__device__ fptype calculateNLL (fptype rawPdf, fptype* evtVal, unsigned int par) {
  rawPdf *= normalisationFactors[par];
  return rawPdf > 0 ? -LOG(rawPdf) : 0;
}
// (...)
__device__ device_metric_ptr ptr_to_NLL = calculateNLL;
cuda-gdb reports that the value of this pointer is zero. Could there be some difference in the way that global device variables are treated between the mobile and desktop versions of the drivers?
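For anyone who would rather look at a small test case than the full tarball, the pattern boils down to roughly the sketch below. The names demoNLL, ptr_to_demoNLL, demo_function_table, callThroughTable and CHECK are made up for illustration, fptype is assumed to be float, and the normalisation step is left out. The sketch also differs from my code in one way: it hands the symbol itself to cudaMemcpyFromSymbol rather than its name as a string. As far as I can tell from the CUDA 5.0 release notes, looking up symbols by string name is no longer supported there, so that difference may matter.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

typedef float fptype;  // assumption: fptype is float in this sketch
typedef fptype (*device_metric_ptr)(fptype, fptype*, unsigned int);

// Hypothetical error-checking macro standing in for cutilSafeCall.
#define CHECK(call)                                                   \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      fprintf(stderr, "Error %d (%s) at %s, %d\n", (int)err_,         \
              cudaGetErrorString(err_), __FILE__, __LINE__);          \
      exit(1);                                                        \
    }                                                                 \
  } while (0)

// Simplified stand-in for calculateNLL, without normalisation factors.
__device__ fptype demoNLL(fptype rawPdf, fptype* evtVal, unsigned int par) {
  return rawPdf > 0 ? -logf(rawPdf) : 0;
}

// Device-side variable holding the address of the device function.
__device__ device_metric_ptr ptr_to_demoNLL = demoNLL;

// Device-side table of function addresses, filled from the host.
__device__ void* demo_function_table[8];

// Looks up entry 'index' in the table and calls it with 'rawPdf'.
__global__ void callThroughTable(unsigned int index, fptype rawPdf, fptype* result) {
  device_metric_ptr f = reinterpret_cast<device_metric_ptr>(demo_function_table[index]);
  *result = (*f)(rawPdf, 0, 0);
}

int main() {
  void* host_function_table[8] = {0};

  // Read the device function address via the symbol itself, not a string.
  CHECK(cudaMemcpyFromSymbol(&host_function_table[0], ptr_to_demoNLL, sizeof(void*)));
  printf("Device function address: %p\n", host_function_table[0]);

  // Push the first table entry back to the device-side table.
  CHECK(cudaMemcpyToSymbol(demo_function_table, host_function_table, sizeof(void*)));

  fptype* d_result = 0;
  CHECK(cudaMalloc(&d_result, sizeof(fptype)));
  callThroughTable<<<1, 1>>>(0, 0.5f, d_result);
  CHECK(cudaGetLastError());

  fptype h_result = 0;
  CHECK(cudaMemcpy(&h_result, d_result, sizeof(fptype), cudaMemcpyDeviceToHost));
  printf("NLL of 0.5 computed through the table: %f\n", h_result);

  CHECK(cudaFree(d_result));
  return 0;
}
```

Device function pointers need compute capability 2.0 or higher, so this has to be compiled for at least that architecture (for example with -arch=sm_20). If the printed address is zero on the laptop but not on the desktop, that would reproduce what I see in cuda-gdb.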
If anyone would like to try to reproduce this on their own systems, here is the code, for Mac and Ubuntu:
http://www.physics.uc.edu/~rolfa/GooFit_05Dec2012_standalone_Mac.tar.gz
http://www.physics.uc.edu/~rolfa/GooFit_05Dec2012_standalone.tar.gz
To install, unpack the tarball, edit the Makefile if necessary so that ‘CUDALOCATION’ points to your CUDA install, run ‘gmake’, add the subdirectory ‘rootstuff’ to LD_LIBRARY_PATH, and run ‘gtest’.