We just received an S1070, and I am porting C++ simulation code to CUDA. In the process, I am trying to establish strategies for using the various kinds of memory on the device so as to optimize performance. I have asked NVIDIA several times for the complete technical specs on this device, with no success. Here’s what I know.
The S1070 is essentially 4 x C1060, meaning each card has 4 GB of SDRAM and 30 SMs, with 16K x 32-bit registers (64 KB) and 16 KB of shared memory (16 banks) per SM. The documentation makes no mention of the size of the caches, nor of the constant or texture memory spaces. I know that constant memory is cached, so I assume it is wired separately from SDRAM, but how much is configured? Texture memory access is faster than global, so it could be carved out of SDRAM with better mapping, or it could be separate memory. Which is it? Either way, I need to know the limits that apply to every type of memory on this device in order to handle this port properly.
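In the meantime, here is a minimal sketch of how the limits that the CUDA runtime itself reports could be read for each of the four devices. It relies only on `cudaGetDeviceCount` and `cudaGetDeviceProperties`; the output layout is my own, and I am assuming the S1070 shows up as four separate devices. As far as I can tell, the constant and texture cache sizes are not among the fields used here, so this only partly answers the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // Assumption: each C1060 inside the S1070 is enumerated as its own device.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "device %d: %s\n", dev,
                         cudaGetErrorString(err));
            continue;
        }

        // size_t fields are cast so the format specifiers are portable.
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
        std::printf("  multiprocessors:         %d\n",
                    prop.multiProcessorCount);
        std::printf("  global memory:           %llu bytes\n",
                    (unsigned long long)prop.totalGlobalMem);
        std::printf("  constant memory:         %llu bytes\n",
                    (unsigned long long)prop.totalConstMem);
        std::printf("  shared memory per block: %llu bytes\n",
                    (unsigned long long)prop.sharedMemPerBlock);
        std::printf("  registers per block:     %d\n", prop.regsPerBlock);
        std::printf("  warp size:               %d\n", prop.warpSize);
        std::printf("  texture alignment:       %llu bytes\n",
                    (unsigned long long)prop.textureAlignment);
    }
    return 0;
}
```

This should build with something like `nvcc query.cu -o query` (the file name is arbitrary). The per-block figures are what a single thread block can use, which on this hardware I would expect to match the per-SM totals.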
I think these are very reasonable questions, and this information should have been included in the specs for the device. Would one of the NVIDIA engineers, or any forum member who has access to this information, please share it with me?