Unit of expected training time is unclear in the NGC SpeechSynthesis (Tacotron2) example documentation

Hi,

I am following the NGC SpeechSynthesis (Tacotron2) example on GitHub at this link:

Here is the relevant text:

[i]Expected training time
This table shows the expected training time for convergence for Tacotron 2 (1500 epochs).

Number of GPUs | Expected training time with mixed precision | Expected training time with FP32 | Speed-up with mixed precision
1 | 208.00 | 288.03 | 1.38
4 | 67.53 | 84.20 | 1.25
8 | 33.14 | 44.00 | 1.33
This table shows the expected training time for convergence for WaveGlow (1000 epochs).

Number of GPUs | Expected training time with mixed precision | Expected training time with FP32 | Speed-up with mixed precision
1 | 437.03 | 814.30 | 1.86
4 | 108.26 | 223.04 | 2.06
8 | 54.83 | 109.96 | 2.01[/i]

I cannot tell from this documentation what unit the expected training times are given in.

Could someone please explain whether these numbers are in hours or minutes?
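
The speed-up column does not settle this, as far as I can see: it appears to be simply the FP32 time divided by the mixed-precision time, which is a unitless ratio and comes out the same in hours or minutes. Here is a small Python sketch of that check, using only the numbers quoted above. The names TACOTRON2, WAVEGLOW, speedup and check are my own, and the formula is my assumption about how the column was computed, not something the documentation states.

```python
# Rows copied from the quoted tables: (GPUs, mixed-precision time, FP32 time, published speed-up).
# The time unit is NOT stated in the documentation, so none is assumed here.
TACOTRON2 = [
    (1, 208.00, 288.03, 1.38),
    (4, 67.53, 84.20, 1.25),
    (8, 33.14, 44.00, 1.33),
]
WAVEGLOW = [
    (1, 437.03, 814.30, 1.86),
    (4, 108.26, 223.04, 2.06),
    (8, 54.83, 109.96, 2.01),
]


def speedup(mixed, fp32):
    """Assumed formula: FP32 time / mixed-precision time, rounded to 2 decimals."""
    return round(fp32 / mixed, 2)


def check(rows):
    """True if every published speed-up matches the assumed formula."""
    return all(speedup(mixed, fp32) == published for _, mixed, fp32, published in rows)


if __name__ == "__main__":
    print("Tacotron 2 rows consistent:", check(TACOTRON2))
    print("WaveGlow rows consistent:", check(WAVEGLOW))
```

So the tables look internally consistent, but the ratio alone gives no hint about the unit of the two time columns.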

Thanks in advance.