egg
August 3, 2010, 1:46pm
1
Has anyone tested the convolutionSeparable example from the CUDA SDK on a GPU with the new Fermi architecture?
I’m confused by these results:
Image size: 3072x3072, kernel size: 17x17
9800 GT:
1929.0962 MPixels/sec, Time = 0.00489 s, Size = 9437184 Pixels
GTX 460:
1769.5650 MPixels/sec, Time = 0.00533 s, Size = 9437184 Pixels
And the convolution example using Texture:
9800 GT:
convolutionRowsGPU: 2.430496 ms.
convolutionColumnsGPU: 2.465828 ms
GTX 460:
convolutionRowsGPU: 4.105941 ms.
convolutionColumnsGPU: 4.104741 ms
I don’t know why the GTX 460 is slower than the 9800 GT, or why it is about twice as slow in the texture version. Why?
P.S. My driver version is: 258.96; and CUDA toolkit version is 3.1, and CUDA SDK version is 3.10.608.1117
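To put the figures above side by side, here is a rough arithmetic sketch. It only reuses the numbers already posted; the one assumption is that the row and column passes of the texture version run back to back, so their times can be summed into one total. The helper name mpixels_per_sec is made up for this illustration.

```python
# Figures copied from the benchmark output posted above.
PIXELS = 3072 * 3072  # 9437184, matches the reported Size


def mpixels_per_sec(pixels, seconds):
    """Throughput in MPixels/sec for a given pixel count and run time."""
    return pixels / seconds / 1e6


# Shared-memory (convolutionSeparable) times in seconds, as reported.
shared_time_s = {"9800 GT": 0.00489, "GTX 460": 0.00533}

# Texture version: rows + columns in milliseconds.
# Assumption: the two passes run sequentially, so the total is their sum.
texture_time_ms = {
    "9800 GT": 2.430496 + 2.465828,
    "GTX 460": 4.105941 + 4.104741,
}

# How many times slower the GTX 460 is than the 9800 GT in each version.
shared_ratio = shared_time_s["GTX 460"] / shared_time_s["9800 GT"]
texture_ratio = texture_time_ms["GTX 460"] / texture_time_ms["9800 GT"]

for gpu, seconds in shared_time_s.items():
    print(f"{gpu}: {mpixels_per_sec(PIXELS, seconds):.1f} MPixels/sec")
print(f"shared-memory version: GTX 460 takes {shared_ratio:.2f}x the time")
print(f"texture version:       GTX 460 takes {texture_ratio:.2f}x the time")
```

The recomputed throughput differs slightly from the printed MPixels/sec because the posted times are rounded. On these numbers the GTX 460 is roughly 9% slower in the shared-memory version and roughly 1.7x slower in the texture version.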
Textures?
–edit–
Oh yeah, textures on Fermi are slow… or can be slow, depending on the circumstances. The experts will be able to advise you on that.
Can you elaborate on that? I’m aware that textures are somewhat less important on Fermi because of the new L1 and L2 caches. I also know that GF100 has a pretty low texture fill rate for the GFLOPs available but GF104 is a somewhat different beast. The GTX 460 has a texture fill rate somewhere between a GTX 470 and a GTX 480 and I have test code that actually hits that figure. However, I’ve also observed that small changes to my test code cause the performance to halve for no particular reason. I’d be very interested if anyone has an explanation. Personally I’m suspicious that the GTX 460 is (for the time being at least) being hobbled in CUDA.
egg
August 4, 2010, 6:41am
4
Could you give me more information about the texture characteristics of Fermi?
According to the wiki, however, the GTX 460 has a higher peak fill rate than the 9800 GT:
9800GT: 33.6G texel/s
GTX460: 37.8G texel/s
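As a quick sanity check of why this is surprising, the sketch below compares the speedup the peak fill rates would suggest with what the row pass actually measured. It assumes, without any proof, that the texture kernel is limited by texture fill rate; the fill-rate figures are the wiki numbers quoted above and the timings are the ones from the first post.

```python
# Peak texture fill rates (GTexel/s) as quoted from the wiki above.
peak_fill = {"9800 GT": 33.6, "GTX 460": 37.8}

# convolutionRowsGPU times in milliseconds from the first post.
rows_ms = {"9800 GT": 2.430496, "GTX 460": 4.105941}

# Speedup of the GTX 460 over the 9800 GT if the kernel were purely
# fill-rate bound (an assumption, not something measured here).
expected_speedup = peak_fill["GTX 460"] / peak_fill["9800 GT"]

# Speedup actually observed: values below 1.0 mean the GTX 460 is slower.
measured_speedup = rows_ms["9800 GT"] / rows_ms["GTX 460"]

print(f"expected from fill rate: {expected_speedup:.3f}x")
print(f"measured (row pass):     {measured_speedup:.3f}x")
```

So the spec sheet would suggest the GTX 460 should be a little faster, while the measured row pass runs at well under the 9800 GT's speed.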
egg
August 4, 2010, 6:44am
5
Can you elaborate on that? I’m aware that textures are somewhat less important on Fermi because of the new L1 and L2 caches. I also know that GF100 has a pretty low texture fill rate for the GFLOPs available but GF104 is a somewhat different beast. The GTX 460 has a texture fill rate somewhere between a GTX 470 and a GTX 480 and I have test code that actually hits that figure. However, I’ve also observed that small changes to my test code cause the performance to halve for no particular reason. I’d be very interested if anyone has an explanation. Personally I’m suspicious that the GTX 460 is (for the time being at least) being hobbled in CUDA.
According to the wiki, however, the GTX 460 has a higher peak fill rate than the 9800 GT:
9800GT: 33.6G texel/s
GTX460: 37.8G texel/s
I am interested in textures on Fermi; if you have any information about them, please tell me.
My results are in this thread: http://forums.nvidia.com/index.php?showtopic=174825
I’ve tried to get some kind of a response from nVidia but since the GTX 460 isn’t a Tesla they aren’t interested.
iceberg
February 22, 2011, 8:30am
7
Is there any further information about the texturing performance issue? Could any NVIDIA staff give us an explanation?