Memory Requirements

I am currently evaluating pgcc on a 4-processor Opteron system.

I have a large code written for one processor. Built with gcc, the program uses about 4 GB of memory on a particular problem. Built with pgcc, the same application and data use nearly 20 GB. We have set up memory for node interleave. The large overhead occurs whether compiling with just -g or with various faster optimizations.

Is this overhead typical, and will it remain? (The system has 32 GB of memory.)

Thanks,

Andy

Hi Andy,

We have not seen this type of issue before, and don’t have a good
explanation as of yet.

What is the output of commands ‘size exe’ and ‘ldd exe’ for the gcc and
pgcc built executables?

How are you determining memory usage?
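One way to check from inside the program itself is to ask the OS for the peak resident set size just before exit. This is only a sketch: peak_rss_kb is a hypothetical helper name, it assumes a Linux system where getrusage fills in ru_maxrss in kilobytes, and kernels older than 2.6.32 leave that field at zero.

```c
#include <sys/resource.h>

/* Hypothetical helper: peak resident set size of the calling process.
 * On Linux ru_maxrss is reported in kilobytes; returns -1 on failure. */
long peak_rss_kb(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}
```

Printing this value at the end of the gcc and pgcc runs would give a per-process number that does not depend on a desktop monitoring tool.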

Thanks,
Mat

Mat,

You bring up a good point about how memory was checked: I used gnome-system-monitor. It cannot report individual process memory usage above 2 GB. However, I have found it reliable for determining overall system load. To make sure, I will run tests that deliberately exhaust system memory. The program itself is relatively small, but we are using a lot of memory.

ldd results:

libthread.so.o => /lib64/tls/libthread.so.0 0x0000003845a00000
libc.so.6 => /lib64/tls/libc.so.6 0x0000003844b00000
=> 0x0000003844b00000
/lib64/ld-linux-x86-64.so.2 0x0000003844900000

and size results:

text 178707
data 25788
bss 37760
dec 242255
hex 3b24f

Andy

Following up on my last post: I doubled the data size for the job, and the process began to use swap space, so I killed it. I think gnome-system-monitor was correctly showing the amount of memory requested.

I was allocating memory in a piecemeal, sloppy manner. For example:

/* one calloc for the table of row pointers, then one calloc per row */
x = (double **) calloc(dim1,sizeof(double *));
for (i=0; i<dim1; i++)
    x[i] = (double *) calloc(dim2,sizeof(double));

I could allocate as a continuous block and then reference with multi-dimensional pointers. For example:

dataBlock = (double *) malloc(dim1 * dim2 * sizeof(double));

and then have each of the dim1 row pointers in x point to its own run of dim2 elements inside dataBlock.
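A minimal sketch of that layout, assuming C99 and zero-initialized data as with calloc. The helper names alloc_matrix and free_matrix are hypothetical and not part of the original code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a dim1 x dim2 matrix of doubles as one
 * contiguous, zeroed block plus a table of row pointers, so that
 * x[i][j] indexing still works. Returns NULL on failure. */
double **alloc_matrix(size_t dim1, size_t dim2)
{
    if (dim1 == 0 || dim2 == 0 || dim2 > SIZE_MAX / dim1)
        return NULL;                      /* empty, or dim1 * dim2 would overflow */

    double **x = malloc(dim1 * sizeof *x);            /* row pointer table */
    double *dataBlock = calloc(dim1 * dim2, sizeof *dataBlock);
    if (x == NULL || dataBlock == NULL) {
        free(x);
        free(dataBlock);
        return NULL;
    }

    for (size_t i = 0; i < dim1; i++)
        x[i] = dataBlock + i * dim2;      /* row i starts at element i * dim2 */
    return x;
}

/* Release a matrix created by alloc_matrix. */
void free_matrix(double **x)
{
    if (x != NULL) {
        free(x[0]);                       /* x[0] is the start of dataBlock */
        free(x);
    }
}
```

This needs two allocations instead of dim1 + 1, which reduces per-allocation bookkeeping, but the data itself takes the same dim1 * dim2 * sizeof(double) bytes either way.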

Would that sort of memory allocation help?

Andy

Hi Andy,

The compiler doesn’t do the actual memory allocation. This is handled at run time through the calloc and malloc library calls, which obtain memory from the OS, so we’re a bit perplexed. Is it possible for you to send us the code so we can try to determine what’s going on?

> I could allocate as a continuous block and then reference with multi-dimensional pointers. For example:
> dataBlock = (double *) malloc(dim1 * dim2 * sizeof(double));
> and then have x rows of dim1 point to dataBlock columns of dim2.
> Would that sort of memory allocation help?

Either method should yield the same result. However, if you use the second method and the size given to malloc is greater than 2 GB, you’ll need to compile with ‘-mcmodel=medium’. You should also add ‘#include <malloc.h>’, if you haven’t already, in order to ensure the correct 64-bit prototype is used for malloc.

- Mat

Hi Mat,

I was including stdlib.h and math.h but explicitly placed malloc.h in the code and it still behaved the same way.

I am using calloc and dynamically allocated memory heavily.

One clue however is that I am also using free() because there is a bunch of memory bound in transient dynamic arrays. Could the problem be that free() is not freeing the memory in the case of the pgcc compiled code?

Andy

Hi Andy,

The reason for adding the malloc header file is that the default prototype for malloc is ‘(void *) malloc (int)’ but what you need is ‘(void *) malloc(long)’ if you’re requesting more than 2 GB.
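To illustrate why the width of the size argument matters, here is a small sketch of what happens to a large byte count when it is squeezed through 32 bits. The helper name size_seen_through_32_bits is hypothetical, and the sketch uses an unsigned 32-bit type so the narrowing is well defined; with a signed int, values above 2 GB are already out of range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the byte count a callee would receive if a size_t
 * request were narrowed to a 32-bit unsigned parameter on the way in. */
size_t size_seen_through_32_bits(size_t requested)
{
    return (size_t)(uint32_t)requested;   /* keeps only the low 32 bits */
}
```

On a 64-bit system a 5 GB request narrowed this way comes out as 1 GB. Separately, under C89 rules a function called with no declaration in scope is assumed to return int, so a 64-bit pointer returned by malloc could also be truncated. Either way, having the real prototype in scope avoids the problem.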

> One clue however is that I am also using free() because there is a bunch of memory bound in transient dynamic arrays. Could the problem be that free() is not freeing the memory in the case of the pgcc compiled code?

Doubtful. Both gcc and pgcc should be using the exact same libc system calls. The OS might not be reclaiming the memory, but it should be marked as ‘free’.

Some possibilities are: the pgcc version is making more calls to calloc, pgcc is missing the calls to free, or gcc is not making enough calls to calloc. Without the code, however, it’s very difficult to make a firm diagnosis. Can you determine how much memory your program actually requests?
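One way to answer that question is to route allocations through thin counting wrappers and print the totals at exit. The following is only a sketch: counted_calloc, counted_free, and report_allocations are hypothetical names, and the counters are not thread-safe, which should be fine for a code written for one processor.

```c
#include <stdio.h>
#include <stdlib.h>

static size_t bytes_requested = 0;  /* running total of bytes asked for */
static size_t calloc_calls = 0;     /* successful calloc calls */
static size_t free_calls = 0;       /* free calls on non-NULL pointers */

/* Drop-in replacement for calloc that records what was requested. */
void *counted_calloc(size_t nmemb, size_t size)
{
    void *p = calloc(nmemb, size);

    if (p != NULL) {
        bytes_requested += nmemb * size;
        calloc_calls++;
    }
    return p;
}

/* Drop-in replacement for free that counts real releases. */
void counted_free(void *p)
{
    if (p != NULL)
        free_calls++;
    free(p);
}

/* Print the totals, for example just before the program exits. */
void report_allocations(void)
{
    fprintf(stderr, "calloc calls: %zu, free calls: %zu, bytes requested: %zu\n",
            calloc_calls, free_calls, bytes_requested);
}
```

If the gcc and pgcc builds report the same call counts and byte totals, the program is requesting the same memory in both cases and the difference lies elsewhere.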

- Mat

Mea Culpa,

I had a configure issue. Including malloc.h seems to be doing the trick, and the memory requirements are now the same for both builds.

Thanks so much for your help!

Andy