Trouble Running MC-GPU v1.3 with CUDA 5.0


I am currently trying to get MC-GPU up and running. It is a self-contained, GPU-based Monte Carlo simulator for CT dose calculations that must be run in a Linux environment. I am new to Linux and CUDA and find myself running into a few problems. I understand that these may be trivial, but it never hurts to ask.

First off, let me explain my situation.

Using a Lenovo y400 Laptop with Nvidia 650m
Running Ubuntu 12.10
Installed Cuda-5.0, all samples compiled and I ran quite a few to test that they worked
Using Proprietary driver, not dev-driver that came with Cuda-5.0
Wrote a quick “Hello world” cuda program that compiled and ran
Compiled MC-GPU_v1.3.x using the compile lines given in the source code, and ran the simple geometry using …/MC-GPU_v1.3.x | tee MC-GPU_v1.3_6voxels.out

I can compile and run the simple geometry code using the CPU compilation, but I cannot get the GPU version to work. I haven’t changed anything in the source code and so far have only been trying to run the provided sample simulation.

In order to run, I must turn off the X server, so I switch to a console and disable it by calling service lightdm stop and init 3.
When I run the code after doing this, I get all of the printouts up to the point where it states that it is starting the Monte Carlo loop phase. It then tells me that I am executing 7813 blocks of 128 threads with 100 histories in each thread, for a total of 100006400 histories. After this output, I get an error from line 891: !!Kernel execution failed while simulating particle tracks!! : (4) unspecified launch failure.
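For anyone checking the numbers, the launch configuration in that output is consistent with simple arithmetic. This is only a minimal sketch: the 100,000,000 requested histories is an assumption inferred from the printed total, and launch_config is a hypothetical helper, not a function from MC-GPU.

```python
def launch_config(requested_histories, threads_per_block, histories_per_thread):
    """Return (blocks, total_histories) for a grid that covers the request.

    Hypothetical helper: the block count is rounded up so that every
    requested history is covered, which can overshoot the request slightly.
    """
    per_block = threads_per_block * histories_per_thread
    # Integer ceiling division avoids any floating-point rounding.
    blocks = (requested_histories + per_block - 1) // per_block
    return blocks, blocks * per_block


# Assumed request of 1e8 histories, with the thread settings from the output.
blocks, total = launch_config(100_000_000, 128, 100)
print(blocks, total)
```

With 128 threads per block and 100 histories per thread, each block handles 12,800 histories, so 7813 blocks give the 100,006,400 histories reported.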

After looking at the code, it looks like the error comes from where the code first launches a kernel on the GPU (the <<< >>> brackets), but I am still unsure why I am getting it. Since I am running a sample, I am hoping that this is a simple problem of not compiling something correctly or missing a step when setting up the simulation. Please let me know if anyone has time to help or if I should share any other information.



It won’t let me edit my post but the site for MC-GPU is,

MC-GPU is a somewhat specialized application and the authors would probably be best suited to answer this one.

I will say that MC codes for particle transport will usually require a lot of memory to store histories. It is possible that your laptop videocard does not have enough.

However, Google tells me it has 2GB, which should of course be enough. And since this is presumably a photon-only code, that requirement perhaps does not even exist for this specific code.

You might also want to try compiling in debug mode and running it through the debugger in Nsight Eclipse Edition to see exactly what is happening.

I don’t run Linux, so if it is a simple Linux-related setting, someone else will have to chime in!

Unspecified launch failure means the kernel was launched on the GPU but was terminated abnormally. By far the most common reason for this is the GPU equivalent of a segfault, an inappropriate memory access of some sort.

As a first step, I would suggest running the code under cuda-memcheck. Also, make sure that the status of all CUDA API calls is checked, in particular GPU memory allocation. Running the app in the debugger is of course also a good way to get to the bottom of it. Personally I prefer running cuda-memcheck as an initial lightweight quick check.

Thank you for your response! It is a photon only code with no electron transport shown.

I know MC codes are fairly complicated, but I cannot even get to that part yet. Since I haven’t made it past the built-in examples, I thought the code could be treated like a black box, and I suspect my error is a simple mistake in how I am running it.

Unfortunately I cannot run the Eclipse debugger, since I am on a laptop with only one GPU and I have to turn off the graphics display to run the code (the sudo service lightdm stop command is run in a console and disables the X server).

I have asked the creators for help as well, though I am not sure whether they will respond to my request. Good idea!


I ran it with cuda-memcheck and it just hung for a few hours before I killed it. Is there a debugger other than Eclipse that doesn’t need a GUI, since I cannot run the code with the X server on?

I think there may be an issue with the hardware. My friend ran the code on a desktop with 2 GPUs, one dedicated to CUDA, with no problem. Are there any adjustments to the compiling or running that need to be made for laptops?

It would help to know the exact GPU your friend got the code to work on… and of course if he used the same inputs to the code as well. It’s hard to say for sure what the issue at hand is, but I also agree that the authors of the code would probably have a better idea.

I figured it out. The installs were fine; his desktop GPU (still unsure of the model) was able to handle the number of Monte Carlo inputs while my laptop could not. Since I was running a predefined script, I just adjusted the kernel inputs, ran the job multiple times, and compiled the results. Thanks for all the input!
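In case it helps someone else, the splitting and recombining step can be sketched roughly as below. This is a minimal sketch under assumptions: split_histories and combine_means are hypothetical helpers, the 30,000,000 per-run limit is an example value rather than a measured limit of any GPU, and the per-run mean values are made up for illustration. It does not read MC-GPU's actual output files. Each run should also use a different random seed, otherwise the runs are not statistically independent.

```python
def split_histories(total, max_per_run):
    """Split a total history count into runs of at most max_per_run each."""
    full_runs, remainder = divmod(total, max_per_run)
    return [max_per_run] * full_runs + ([remainder] if remainder else [])


def combine_means(means, histories):
    """Combine per-run mean values (e.g. dose per history) into one mean,
    weighting each run by the number of histories it simulated."""
    total = sum(histories)
    return sum(m * n for m, n in zip(means, histories)) / total


# Example: 1e8 histories split into runs of at most 3e7 (assumed limit).
runs = split_histories(100_000_000, 30_000_000)
print(runs)

# Illustrative per-run means only, not real simulation results.
print(combine_means([2.0, 4.0], [1, 3]))
```

Weighting by history count matters when the last run is smaller than the others; a plain average of the per-run means would give that short run too much influence.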