Transformer: throughput reported incorrectly

Hi, I am running your Transformer model code in TensorFlow OpenSeq2Seq. The way throughput is calculated in https://github.com/NVIDIA/OpenSeq2Seq/blob/master/open_seq2seq/utils/utils.py seems to be misleading; correct me if I am wrong.

Line 97 says benchmarking starts at the 10th iteration, but when the average objects per second is computed in lines 226-227, the total object count (iterations 1 to the end) is divided by the time taken only after the 10th iteration.

This makes the numerator too large and the denominator too small, which inflates the reported throughput, especially at high batch sizes.

Do you agree with this? I am getting ~10000 objects/sec for batch size 256, which is unrealistic given the above.

Thanks

Hi,

Thanks a lot for the detailed report! This bug has been fixed in the latest master branch of OpenSeq2Seq on GitHub:
https://github.com/NVIDIA/OpenSeq2Seq. Please feel free to post any issues there.

Thanks,
Vitaly

Thanks Vitaly, I appreciate your response and the code fix.