As is well known, the GTX 480 defaults to 48 KB of shared memory and 16 KB of L1 cache. But when my app uses more than 16 KB of shared memory on the GTX 480, I get errors like this:
uses too much shared data (0xa09c bytes + 0x10 bytes system, 0x4000 max)
The maximum is 0x4000, which is only 16 KB. I then used the CUDA runtime function cudaFuncSetCacheConfig to force 48 KB of shared memory for my kernel function, but it made no difference.
Has anyone encountered the same problem?
Thanks… ^_^
As a first guess, try compiling with -arch=sm_20.
I’ve tried this, but it didn’t help. Thanks ^_^
Lev
May 4, 2010, 12:42pm
Btw, was the shared memory size per block increased? Or just the total amount?
Shared memory per multiprocessor increased to 48 KB, so a block can use more.
The correct way is to change CUFILES to CUFILES_sm_20 in your own Makefile.
I have a question. Are you running the GTX 480 as an X display as well? If so, stop the X server and try your cache preference again.
Hi, I have exactly the same problem as you. How did you finally solve it?
Thanks
You need to compile with the flag -arch=sm_20
Thanks for the reply. But could you please give me some specific hints about how to change this flag in the <common.mk> file?
Thanks
Yes: do not use common.mk.
Write your own makefile; it is very simple and gives you full control.
Hi mfatica,
I wrote a simple makefile as follows (not sure whether it’s correct). I set the -arch=sm_20 flag, but when I run make the problem is still there: ptxas error : Entry function ‘_Z6kernelPcP5bit_tii’ uses too much shared data (0x8018 bytes + 0x10 bytes system, 0x4000 max)
NVCCFLAGS := -O3 -arch=sm_20
NVCC := /usr/local/cuda/bin/nvcc
LD_LIBRARY_PATH := /usr/local/cuda/lib64
all: main.cu Makefile
	$(NVCC) -o creatures main.cu $(NVCCFLAGS)
clean:
	rm -rf creatures
run: all
	./creatures
include …/…/common/common.mk
Thanks
mfatica
August 11, 2010, 10:33pm
Try this NVCCFLAGS:
-O3 --ptxas-options=-v -arch sm_20
Do not include common.mk; it is probably overriding the flags you just set.
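For reference, here is a minimal sketch of a standalone makefile along these lines. It reuses the nvcc path, source file and target name from the post above and simply leaves out the include line. It has not been run against the poster's project, and the comment about the shared-memory limit assumes the 0x4000 maximum comes from building for a pre-Fermi target rather than from the hardware. Recipe lines must start with a tab character.

```makefile
# Standalone makefile sketch: no SDK common.mk involved.
NVCC      := /usr/local/cuda/bin/nvcc

# -arch sm_20 builds for Fermi, whose per-block shared memory limit is 48 KB.
# --ptxas-options=-v makes ptxas print per-kernel register and shared memory usage.
NVCCFLAGS := -O3 --ptxas-options=-v -arch sm_20

all: creatures

creatures: main.cu Makefile
	$(NVCC) $(NVCCFLAGS) -o creatures main.cu

clean:
	rm -f creatures

run: creatures
	./creatures

.PHONY: all clean run

# Deliberately no "include" of common.mk here: make expands $(NVCCFLAGS)
# only when the recipe runs, so a later redefinition in an included file
# would replace the flags set above.
```

With --ptxas-options=-v in place, the build output should show how much shared memory each kernel uses, which makes it easy to check that the sm_20 flag is actually reaching ptxas.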