how to use 48k-shared memory in gtx480?

joeyyeoj · May 4, 2010, 6:01am

As is well known, the GTX 480 defaults to 48 KB of shared memory with 16 KB of L1 cache. But when my app uses more than 16 KB of shared memory on the GTX 480, I get an error like this:

uses too much shared data (0xa09c bytes + 0x10 bytes system, 0x4000 max)

The max value is 0x4000, which is just 16 KB. I then called the CUDA runtime function cudaFuncSetCacheConfig to force 48 KB of shared memory for my kernel function, but it made no difference.

Has anyone else encountered the same problem?

Thanks… ^_^

SPWorley · May 4, 2010, 6:39am

As a first guess, try compiling with -arch=sm_20.
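
To make this suggestion concrete, here is a minimal sketch. The kernel name `bigSharedKernel` and the file name `big_shared.cu` are hypothetical and do not come from the thread. The kernel declares 40 KB of static shared memory, which exceeds the 16 KB (0x4000) limit ptxas enforces for sm_1x targets and should only build when the code is compiled for compute capability 2.0. It also requests the larger shared-memory split with cudaFuncSetCacheConfig, as the original poster tried.

```cuda
// Assumed build line: nvcc -arch=sm_20 -o big_shared big_shared.cu
// Without -arch=sm_20, nvcc releases of that era targeted sm_1x,
// where ptxas rejects more than 16 KB of shared memory per block.
#include <cstdio>
#include <cuda_runtime.h>

#define N_FLOATS (10 * 1024)  // 10240 floats * 4 bytes = 40 KB

__global__ void bigSharedKernel(float *out)
{
    __shared__ float buf[N_FLOATS];  // static shared memory, 40 KB

    // Each thread fills a strided slice of the shared buffer.
    for (int i = threadIdx.x; i < N_FLOATS; i += blockDim.x)
        buf[i] = 1.0f;
    __syncthreads();

    // Thread 0 sums the buffer so the shared array is really used.
    if (threadIdx.x == 0) {
        float sum = 0.0f;
        for (int i = 0; i < N_FLOATS; ++i)
            sum += buf[i];
        *out = sum;
    }
}

int main()
{
    float *d_out = 0;
    float h_out = 0.0f;
    cudaMalloc((void **)&d_out, sizeof(float));

    // Prefer the 48 KB shared / 16 KB L1 configuration for this kernel.
    cudaError_t err = cudaFuncSetCacheConfig(bigSharedKernel,
                                             cudaFuncCachePreferShared);
    if (err != cudaSuccess)
        printf("cudaFuncSetCacheConfig: %s\n", cudaGetErrorString(err));

    bigSharedKernel<<<1, 256>>>(d_out);
    err = cudaGetLastError();  // catches launch failures
    if (err != cudaSuccess)
        printf("launch: %s\n", cudaGetErrorString(err));

    // cudaMemcpy waits for the kernel to finish before copying.
    err = cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        printf("cudaMemcpy: %s\n", cudaGetErrorString(err));

    printf("sum = %.1f\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

Note that cudaFuncSetCacheConfig is a runtime preference only. The "uses too much shared data" message comes from ptxas at compile time, so the cache setting cannot fix it without the right -arch flag.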

joeyyeoj · May 4, 2010, 7:16am

I’ve tried this, but it’s of no use… Thanks ^_^

Lev · May 4, 2010, 12:42pm

Btw, was the shared memory size per block increased? Or just the total amount?

joeyyeoj · May 6, 2010, 3:05am

The shared memory per multiprocessor increases to 48KB, so we can use more in our block.
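
One way to check the per-block limit directly, rather than relying on the spec sheet, is to query the device properties. This is a minimal sketch that uses only the standard runtime calls cudaGetDeviceCount and cudaGetDeviceProperties. The output format is my own choice.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("no CUDA device found\n");
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;  // skip devices we cannot query

        // sharedMemPerBlock is the maximum shared memory one block may use.
        printf("%s (compute %d.%d): %lu bytes shared memory per block\n",
               prop.name, prop.major, prop.minor,
               (unsigned long)prop.sharedMemPerBlock);
    }
    return 0;
}
```

The value reported here is the limit the hardware supports. The compiler still applies the limit of whatever architecture the code was built for.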

joeyyeoj · May 7, 2010, 2:33am

The correct way is to change CUFILES to CUFILES_sm_20 in your own Makefile.

indy2718 · May 7, 2010, 4:50pm

I have a question. Are you also using the GTX 480 to drive an X display? If so, stop the X server and try your cache preference again.

bit_mapper · August 11, 2010, 6:25pm

Hi, I have exactly the same problem as you. How did you finally deal with it?

Thanks

mfatica · August 11, 2010, 8:53pm

You need to compile with the flag -arch=sm_20.

bit_mapper · August 11, 2010, 9:07pm

Thanks for the reply. But could you please give me some specific hints about how to change this flag in the <common.mk> file?

Thanks

mfatica · August 11, 2010, 9:13pm

Yes, do not use the common.mk.

Write your own makefile. It is very simple and you have full control.

bit_mapper · August 11, 2010, 10:26pm

Hi mfatica,

I wrote a simple makefile as follows (not sure whether it’s correct). I set the flag to -arch=sm_20, but when I run make the problem is still there: ptxas error : Entry function ‘_Z6kernelPcP5bit_tii’ uses too much shared data (0x8018 bytes + 0x10 bytes system, 0x4000 max)

NVCCFLAGS := -O3 -arch=sm_20
NVCC := /usr/local/cuda/bin/nvcc
LD_LIBRARY_PATH := /usr/local/cuda/lib64

all: main.cu Makefile
	$(NVCC) -o creatures main.cu $(NVCCFLAGS)

clean:
	rm -rf creatures

run: all
	./creatures

include …/…/common/common.mk

Thanks

mfatica · August 11, 2010, 10:33pm

Try these NVCCFLAGS:

-O3 --ptxas-options=-v -arch sm_20

Do not include the common.mk; it is probably overriding the flags you just set.
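
The --ptxas-options=-v flag makes ptxas print each kernel's register and shared memory usage at build time. As a complementary runtime check, the sketch below reports which architecture the device code was actually compiled for, using the predefined `__CUDA_ARCH__` macro. The kernel name `reportArch` is hypothetical. If the Makefile flags are being overridden, the printed value should be lower than 200.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void reportArch(int *arch)
{
#ifdef __CUDA_ARCH__
    // Defined only during device compilation, e.g. 200 for -arch=sm_20.
    *arch = __CUDA_ARCH__;
#endif
}

int main()
{
    int *d_arch = 0;
    int h_arch = 0;
    cudaMalloc((void **)&d_arch, sizeof(int));
    cudaMemset(d_arch, 0, sizeof(int));

    reportArch<<<1, 1>>>(d_arch);
    cudaError_t err = cudaGetLastError();  // catches launch failures
    if (err == cudaSuccess)
        err = cudaMemcpy(&h_arch, d_arch, sizeof(int),
                         cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("device code compiled for __CUDA_ARCH__ = %d\n", h_arch);
    cudaFree(d_arch);
    return 0;
}
```

Build it with the same Makefile and flags as the failing project, so that it reflects what nvcc really receives.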
