Host memory eaten up during compile time

Hi All,

When I compile my code, nvcc hangs with a warning message:
/tmp/tmpxft_00002d59_00000000-7_TracyGPmain.cpp3.i(0): Warning: Olimit was exceeded on function _Z11test_kernelP4RingPf; will not perform function-scope optimization.
To still perform function-scope optimization, use -OPT:Olimit=0 (no limit) or -OPT:Olimit=222003

While nvcc hangs, the memory on my host slowly fills up to 99%, followed by the swap space, so I had to reboot to clear it. Here is the screen dump:

[kaisong@supermicro-0-30-48-fd-f7-5c ksong_GPU]$ nvcc TracyGPmain.cu *.cpp
/tmp/tmpxft_00002d59_00000000-7_TracyGPmain.cpp3.i(0): Warning: Olimit was exceeded on function _Z11test_kernelP4RingPf; will not perform function-scope optimization.
To still perform function-scope optimization, use -OPT:Olimit=0 (no limit) or -OPT:Olimit=222003

I looked up this warning message. It seems like the kernel is out of registers? But I have no idea why it fills up the host memory at compile time. Am I missing something simple? Any help would be really appreciated. Let me know if you need further information about my code.

Here is the deviceQuery for my GPU:

Device 1: “Tesla C2050”
CUDA Driver Version: 3.10
CUDA Runtime Version: 3.0
CUDA Capability Major revision number: 2
CUDA Capability Minor revision number: 0
Total amount of global memory: 2817982464 bytes
Number of multiprocessors: 14
Number of cores: 448
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Clock rate: 1.15 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: No
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Thanks in advance,

Kai
