GDS: changing cuFile read granularity and gdsio block size

I would like to ask:

  1. How can I change the granularity of cuFile reads (GDS) at runtime in CUDA code? And in your experience, does increasing the granularity significantly improve throughput?
  2. Could a core GDS developer help me reproduce the following image? I know I need to use gdsio, but I can’t find a parameter to change the block size shown on the x-axis of the image.