CudaMalloc?

ardisschool10 · December 2, 2010, 1:35pm

Hi all,

I am new to CUDA programming. I am working on a project to parallelize a sequential program.

I started with CUDA 2.3 on a GTX 295 card. Then I obtained access to a server with a Fermi card and CUDA 3.0.

When I run my program on this server, I get a segmentation fault at the first cudaMalloc of the program.

I have tried some simple programs and they run well on the Fermi.

The messages I have from cuda-gdb are:

Breakpoint 1, cuda_allocation () at kernel.cu:133
133 size_mat=mat_length_max*VProd(cells)*sizeof(int);
Current language: auto; currently c++
(cuda-gdb) n
134 size_site=nSites*sizeof(Site);
(cuda-gdb)
135 size_neigh=nebrTabMax*sizeof(int);
(cuda-gdb)
136 size_nebrEl=nSites*sizeof(int);
(cuda-gdb)
137 size_inertia = nTypes;
(cuda-gdb)
138 size_massDev = nTypes;
(cuda-gdb)
140 size_interactionTypeDev1 = nebrTabMax;
(cuda-gdb)
141 size_sigDev = nTypes*nTypes;
(cuda-gdb)
142 size_epsDev = nTypes*nTypes;
(cuda-gdb)
143 size_productElectroMomentsDev = nTypes*nTypes;
(cuda-gdb)
145 widthTex = ceil(sqrt(nSites));
(cuda-gdb)
147 channelDesc_V = cudaCreateChannelDesc();
(cuda-gdb)
149 cudaMalloc((void**)&mat, size_mat);
(cuda-gdb)
Segmentation fault

The program runs normally (though, I guess, more slowly) only when I pass the -G option to the nvcc compiler; otherwise it fails with a segmentation fault at exactly the line shown above.

Below is what gdb reports for the core file generated when the program fails:

<<<<<<<<<<<<<<<< <<<<<<<<<<<<<<<< <<<<<<<<<<<<<<<< <<<<<<<<<<<<<<<< <<<<<<<<<<<<<<<< <<<<<<<<<<<<<<<< <<<<<<<<<<<<<

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law. Type “show copying”

and “show warranty” for details.

This GDB was configured as “x86_64-linux-gnu”.

For bug reporting instructions, please see:

http://www.gnu.org/software/gdb/bugs/…

Reading symbols from /home/shkurti/Prove/13000configuration/program…done.

[New Thread 13970]

warning: Can’t read pathname for load map: Input/output error.

Reading symbols from /lib/libm.so.6…Reading symbols from /usr/lib/debug/lib/libm-2.11.1.so…done.

done.

Loaded symbols for /lib/libm.so.6

Reading symbols from /usr/lib/libcuda.so.1…(no debugging symbols found)…done.

Loaded symbols for /usr/lib/libcuda.so.1

Reading symbols from /usr/local/cuda/lib64/libcudart.so.3…(no debugging symbols found)…done.

Loaded symbols for /usr/local/cuda/lib64/libcudart.so.3

Reading symbols from /lib/libc.so.6…Reading symbols from /usr/lib/debug/lib/libc-2.11.1.so…done.

done.

Loaded symbols for /lib/libc.so.6

Reading symbols from /usr/lib/libstdc++.so.6…(no debugging symbols found)…done.

Loaded symbols for /usr/lib/libstdc++.so.6

Reading symbols from /lib/libpthread.so.0…Reading symbols from /usr/lib/debug/lib/libpthread-2.11.1.so…done.

done.

Loaded symbols for /lib/libpthread.so.0

Reading symbols from /lib/libz.so.1…(no debugging symbols found)…done.

Loaded symbols for /lib/libz.so.1

Reading symbols from /lib/libdl.so.2…Reading symbols from /usr/lib/debug/lib/libdl-2.11.1.so…done.

done.

Loaded symbols for /lib/libdl.so.2

Reading symbols from /lib/librt.so.1…Reading symbols from /usr/lib/debug/lib/librt-2.11.1.so…done.

done.

Loaded symbols for /lib/librt.so.1

Reading symbols from /lib/libgcc_s.so.1…(no debugging symbols found)…done.

Loaded symbols for /lib/libgcc_s.so.1

Reading symbols from /lib64/ld-linux-x86-64.so.2…Reading symbols from /usr/lib/debug/lib/ld-2.11.1.so…done.

done.

Loaded symbols for /lib64/ld-linux-x86-64.so.2

Core was generated by `./program’.

Program terminated with signal 11, Segmentation fault.

#0 0x00007f75a7ae555e in ?? () from /usr/lib/libcuda.so.1

(gdb) bt

#0 0x00007f75a7ae555e in ?? () from /usr/lib/libcuda.so.1

#1 0x00007f75a7aea99e in ?? () from /usr/lib/libcuda.so.1

#2 0x00007f75a7afdecd in ?? () from /usr/lib/libcuda.so.1

#3 0x00007f75a7b03729 in ?? () from /usr/lib/libcuda.so.1

#4 0x00007f75a7b048b5 in ?? () from /usr/lib/libcuda.so.1

#5 0x00007f75a7b07ee4 in ?? () from /usr/lib/libcuda.so.1

#6 0x00007f75a7b08e2a in ?? () from /usr/lib/libcuda.so.1

#7 0x00007f75a7b0921c in ?? () from /usr/lib/libcuda.so.1

#8 0x00007f75a7c69120 in ?? () from /usr/lib/libcuda.so.1

#9 0x00007f75a7c831db in ?? () from /usr/lib/libcuda.so.1

#10 0x00007f75a794d082 in ?? () from /usr/lib/libcuda.so.1

#11 0x00007f75a78c1f75 in ?? () from /usr/lib/libcuda.so.1

#12 0x00007f75a79020a6 in ?? () from /usr/lib/libcuda.so.1

#13 0x00007f75a78c1c55 in ?? () from /usr/lib/libcuda.so.1

#14 0x00007f75a78c13ed in ?? () from /usr/lib/libcuda.so.1

#15 0x00007f75a78baf91 in ?? () from /usr/lib/libcuda.so.1

#16 0x00007f75a7773d2a in ?? () from /usr/lib/libcuda.so.1

#17 0x00007f75a775bb29 in ?? () from /usr/lib/libcuda.so.1

#18 0x00007f75a7804ab9 in ?? () from /usr/lib/libcuda.so.1

#19 0x00007f75a74b51ce in ?? () from /usr/local/cuda/lib64/libcudart.so.3

#20 0x00007f75a74aabdb in ?? () from /usr/local/cuda/lib64/libcudart.so.3

#21 0x00007f75a74af00c in ?? () from /usr/local/cuda/lib64/libcudart.so.3

#22 0x00007f75a74a8f89 in cudaMalloc () from /usr/local/cuda/lib64/libcudart.so.3

#23 0x000000000043682f in cuda_allocation ()

#24 0x0000000000407aa1 in main (argc=1, argv=0x7fff93e07cc8) at main.c:266

Does anybody have any idea how I can fix this kind of problem?

Dittoaway · December 2, 2010, 4:31pm

Are you re-compiling for the 2.x architecture?

ardisschool10 · December 2, 2010, 5:10pm

The nvcc flags in my Makefile are these:

NVCC_FLAGS = -O0 -use_fast_math --ptxas-options=-v

Dittoaway · December 2, 2010, 6:49pm

I see you’re loading the pthread library.
Is this cudaMalloc inside a pthread?

ardisschool10 · December 3, 2010, 8:46am

No, it is not inside a pthread. The cudaMallocs are called from an ordinary host function.

avidday · December 3, 2010, 9:13am

How about posting some actual code?

ardisschool10 · December 3, 2010, 10:13am

Here are the relevant snippets. I should stress that with CUDA 2.3 on the GTX 295 I do not get a segmentation fault at this point. The problem appears only when I run the program on the server with the Fermi card and CUDA 3.0.

int* mat;

size_mat=mat_length_max*VProd(cells)*sizeof(int); // mat_length_max is a constant (40 in my case) and VProd is a macro that computes cells.x*cells.y*cells.z (in my case cells.x=cells.y=cells.z=2)

cudaMalloc((void**)&mat, size_mat); // the segmentation fault occurs here; this is the first cudaMalloc of the program

cudaError_t error=cudaGetLastError(); if(error != cudaSuccess) printf("cudaMalloc: %s\n", cudaGetErrorString(error));
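With the values quoted in the comment above (mat_length_max = 40 and cells.x = cells.y = cells.z = 2), the request works out to 320 ints, which is a very small allocation. Here is a host-only sketch of that arithmetic; the Cells struct and the mat_bytes helper are hypothetical stand-ins for the poster's own definitions, which are not shown in the thread:

```cuda
#include <cstddef>
#include <cstdio>

// Hypothetical stand-ins for the poster's definitions, using the values quoted in the post.
struct Cells { int x, y, z; };
#define VProd(v) ((v).x * (v).y * (v).z)   // product of the three components
static const int mat_length_max = 40;      // constant quoted in the post

// Number of bytes that would be requested by the first cudaMalloc.
static std::size_t mat_bytes(const Cells &cells) {
    return static_cast<std::size_t>(mat_length_max) * VProd(cells) * sizeof(int);
}

int main() {
    Cells cells = {2, 2, 2};
    // 40 * (2*2*2) = 320 elements; the byte count depends on sizeof(int).
    std::printf("elements = %d, bytes = %lu\n",
                mat_length_max * VProd(cells),
                static_cast<unsigned long>(mat_bytes(cells)));
    return 0;
}
```

A request this small is unlikely to fail because of its size, which suggests looking at the build or the environment instead.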

avidday · December 3, 2010, 3:06pm

Are you recompiling this code for the Fermi card with a CUDA 3.0 or later toolkit, or are you trying to run the CUDA 2.3 version directly on the Fermi machine? I am pretty certain the latter is not going to work.

ardisschool10 · December 3, 2010, 4:33pm

Yes. Before running on Fermi, I recompile the code with the CUDA 3.0 toolkit on the server that has the Fermi GPU.

Dittoaway · December 3, 2010, 4:56pm

Well then there’s something in your code that we’re not seeing.

You say you have:

int* mat;
int size_mat=320*sizeof(int);
cudaMalloc((void**)&mat, size_mat);

There is nothing wrong with that code, and it runs perfectly well on Fermi with CUDA 3.0.

The problem cannot be here.
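One way to confirm this on the machine in question is to build the three lines as a standalone program and print some environment details next to the allocation. This is only a sketch: the file name repro.cu, the helper names, and the choice of device 0 are assumptions, and the compile line in the comment uses -arch=sm_20, which targeted Fermi in toolkits of that era but is no longer accepted by recent ones.

```cuda
// Possible build line (assumption): nvcc -arch=sm_20 repro.cu -o repro
#include <cstdio>
#include <cuda_runtime.h>

// CUDA version integers are encoded as 1000*major + 10*minor (e.g. 3000 for 3.0).
static int version_major(int v) { return v / 1000; }
static int version_minor(int v) { return (v % 1000) / 10; }

int main() {
    int driver = 0, runtime = 0;
    cudaDriverGetVersion(&driver);    // version supported by the installed driver
    cudaRuntimeGetVersion(&runtime);  // version of the libcudart actually loaded
    printf("driver %d.%d, runtime %d.%d\n",
           version_major(driver), version_minor(driver),
           version_major(runtime), version_minor(runtime));

    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("device 0: %s, compute capability %d.%d\n", prop.name, prop.major, prop.minor);

    // The allocation from the post, with the return value checked directly.
    int *mat = NULL;
    size_t size_mat = 320 * sizeof(int);
    err = cudaMalloc((void **)&mat, size_mat);
    if (err != cudaSuccess) {
        printf("cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaFree(mat);
    printf("cudaMalloc of %lu bytes succeeded\n", (unsigned long)size_mat);
    return 0;
}
```

If this small program works while the full application crashes, the difference is more likely to be in how the application is built or linked than in the cudaMalloc call itself.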

jpapon · December 14, 2010, 9:36am

I have the same problem, also after moving from a 200 series card to my brand new 580s.
Compilation works fine, with no warnings, but the first cudaMalloc that gets called returns an invalid argument error.
I don’t see how that’s possible, since this is the whole call:
SC( cudaMalloc((void**)&m_pSimInput_CUDA, 3 * 256 * 320 * sizeof(unsigned char)) );

where m_pSimInput_CUDA is declared in the class header:
unsigned char *m_pSimInput_CUDA;

and SC is just a macro for CUDA_SAFE_CALL
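For readers who do not have the SDK's CUDA_SAFE_CALL at hand, a wrapper of this kind can be sketched as follows. CHECK_CUDA is a hypothetical name and this is not the SDK macro itself; it is a minimal illustration that evaluates a runtime call once, and on failure reports the file, line, and error string before exiting.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// CHECK_CUDA is a hypothetical helper, not the SDK's CUDA_SAFE_CALL.
// It evaluates the call once and aborts with a message if it did not succeed.
#define CHECK_CUDA(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s failed: %s\n",                  \
                    __FILE__, __LINE__, #call,                         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

int main() {
    // Same allocation as in the post, outside of any class for brevity.
    unsigned char *m_pSimInput_CUDA = NULL;
    CHECK_CUDA(cudaMalloc((void **)&m_pSimInput_CUDA,
                          3 * 256 * 320 * sizeof(unsigned char)));
    CHECK_CUDA(cudaFree(m_pSimInput_CUDA));
    return 0;
}
```

A wrapper like this reports where a call failed, but it cannot tell you why an otherwise valid call is rejected; for that, the build and link setup still has to be inspected.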

All of the SDK examples compile and run just fine of course.

As with ardis, everything runs fine on my 295 machine.

jpapon · December 14, 2010, 3:34pm

Solved. I was, of course, using some binaries compiled before Fermi. I had forgotten I was linking to them…
