GDS tools gdsio: throughput is less than 500 MB/s

I am using gdsio to test GPU storage performance. The underlying SSD is a Samsung 970 PRO and the GPU is a Tesla10. I ran a sequential read test, transferring 4K blocks from a 4GB file to the GPU GDS buffer. Why is the throughput below 500 MB/s no matter how many threads I use?
How can I improve the performance?

The tests and their results are below:

1) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 8 -s 5G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 8 DataSetSize: 5242880/5242880(KiB) IOSize: 4(KiB) Throughput: 0.363297 GiB/sec, Avg_Latency: 84.000275 usecs ops: 163840 total_time 13.762840 secs

2) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 4 -s 5G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 4 DataSetSize: 5242880/5242880(KiB) IOSize: 4(KiB) Throughput: 0.241293 GiB/sec, Avg_Latency: 63.237042 usecs ops: 327680 total_time 20.721715 secs

3) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 16 -s 5G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 16 DataSetSize: 5242880/5242880(KiB) IOSize: 4(KiB) Throughput: 0.435547 GiB/sec, Avg_Latency: 140.130884 usecs ops: 81920 total_time 11.479828 secs

4) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 32 -s 5G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 32 DataSetSize: 5242880/5242880(KiB) IOSize: 4(KiB) Throughput: 0.473013 GiB/sec, Avg_Latency: 258.058984 usecs ops: 40960 total_time 10.570538 secs

5) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 64 -s 5G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 64 DataSetSize: 5242880/5242880(KiB) IOSize: 4(KiB) Throughput: 0.460453 GiB/sec, Avg_Latency: 530.181250 usecs ops: 20480 total_time 10.858860 secs

6) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 100 -s 10G -i 4k -I 1

INFO: Truncated down the data size to lower size which is a multiple of batch sizes * no of batches 10737254400

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 100 DataSetSize: 10485600/10485760(KiB) IOSize: 4(KiB) Throughput: 0.486352 GiB/sec, Avg_Latency: 784.306744 usecs ops: 26214 total_time 20.560912 secs

7) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 128 -s 10G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 128 DataSetSize: 10485760/10485760(KiB) IOSize: 4(KiB) Throughput: 0.487373 GiB/sec, Avg_Latency: 1001.797021 usecs ops: 20480 total_time 20.518156 secs

8) ./gdsio -x 6 -f /mnt/gdsio.001 -d 0 -w 128 -s 5G -i 4k -I 1

IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 128 DataSetSize: 5242880/5242880(KiB) IOSize: 4(KiB) Throughput: 0.477510 GiB/sec, Avg_Latency: 1022.418457 usecs ops: 10240 total_time 10.470994 secs
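
Since every run uses a 4 KiB I/O size, it may be easier to compare these results as I/O operations per second than as GiB/sec. Below is a minimal shell sketch of that conversion, assuming the result-line format shown above (an `IOSize:` field in KiB and a `Throughput:` field in GiB/sec). The name `to_iops` is a hypothetical helper of my own, not part of gdsio.

```shell
#!/bin/sh
# Convert gdsio result lines (read from stdin) into approximate IOPS:
#   IOPS ~= throughput (GiB/sec) * 1048576 (KiB per GiB) / IO size (KiB)
# to_iops is a hypothetical helper name; it only parses text and does not run gdsio.
to_iops() {
    awk '{
        tput = ""; iosz = ""
        # Find the values that follow the "Throughput:" and "IOSize:" labels.
        for (i = 1; i < NF; i++) {
            if ($i == "Throughput:") tput = $(i + 1)
            if ($i == "IOSize:")     iosz = $(i + 1)
        }
        sub(/\(KiB\)/, "", iosz)          # "4(KiB)" -> "4"
        if (tput != "" && iosz + 0 > 0)   # skip lines without both fields
            printf "%.0f\n", tput * 1048576 / iosz
    }'
}

# Result line copied from the 10G run with -w 128 above.
line='IoType: WRITE XferType: GPU_BATCH Threads: 1 IoDepth: 128 DataSetSize: 10485760/10485760(KiB) IOSize: 4(KiB) Throughput: 0.487373 GiB/sec, Avg_Latency: 1001.797021 usecs ops: 20480 total_time 20.518156 secs'
printf '%s\n' "$line" | to_iops
```

For that run this prints 127762, i.e. roughly 128K 4 KiB operations per second, which is the figure to compare against the drive's small-block I/O rate rather than its large-block bandwidth.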