I would like to ask:
- How can I change the granularity of cuFile reads (GDS) at runtime in CUDA code? And in your experience, does increasing the granularity significantly improve throughput?
- Could a core GDS developer help me reproduce the following image? I know I need to use gdsio, but I can’t find a parameter that changes the block size shown on the x-axis of the image.