CoalescedMemTest.zip (1.63 MB)
I am trying to measure the performance difference between coalesced and uncoalesced memory access patterns.
I have written a project for this purpose.
I believe I followed the guidelines for coalesced memory access from the CUDA programming reference guide.
However, I can't measure any meaningful difference between the two.
I have attached my project.
Please take a look at it and let me know where I made a mistake.
I would appreciate your help.
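
For readers without the archive, the kind of comparison described above can be sketched roughly as follows. This is only a minimal sketch and not the code from the attached project: the kernel `stridedCopy`, the helper `timeStridedCopy`, the buffer size, the block size, and the iteration count are all hypothetical choices made for illustration. Every element is copied exactly once for any stride, so only the access pattern changes between runs.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread copies one float.
// stride == 1: neighbouring threads touch neighbouring addresses (coalesced).
// stride  > 1: neighbouring threads touch addresses `stride` elements apart.
// The index mapping is a permutation of [0, n) when n is a multiple of stride.
__global__ void stridedCopy(const float* in, float* out, size_t n, size_t stride)
{
    size_t tid = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (tid < n) {
        size_t rows = n / stride;
        size_t idx = (tid % rows) * stride + tid / rows;
        out[idx] = in[idx];
    }
}

// Hypothetical helper: returns the average kernel time in milliseconds,
// or a negative value if the arguments are invalid or a CUDA call fails.
float timeStridedCopy(size_t stride, size_t n = 1 << 24, int iters = 20)
{
    if (stride == 0 || n % stride != 0) return -1.0f;

    float *in = nullptr, *out = nullptr;
    if (cudaMalloc(&in, n * sizeof(float)) != cudaSuccess) return -1.0f;
    if (cudaMalloc(&out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(in);
        return -1.0f;
    }
    cudaMemset(in, 0, n * sizeof(float));

    const int block = 256;  // assumed block size
    const int grid = (int)((n + block - 1) / block);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Warm-up launch so one-time startup cost is not included in the timing.
    stridedCopy<<<grid, block>>>(in, out, n, stride);

    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        stridedCopy<<<grid, block>>>(in, out, n, stride);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    bool ok = (cudaGetLastError() == cudaSuccess);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(in);
    cudaFree(out);
    return ok ? ms / iters : -1.0f;
}
```

Calling `timeStridedCopy(1)` and `timeStridedCopy(32)` on the same device and comparing the two values gives one data point for the coalesced versus strided case. How large the gap is depends on the GPU and its caches.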