TLT - DGX

  • How does TLT estimate ETA? Does it use any regression model?
  • Does TLT use Tensor cores?
  • Better documentation for TLT converter

Moving to TLT forum so that TLT team can take a look.

Thanks

Hi Ravik,

  1. Do you mean the training ETA? Could you elaborate on what you mean by “regression model”?
  2. Training in the current TLT release does not need Tensor Cores.
  3. See the TLT user guide: https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#gen_eng_tlt_converter

@Morganh

ETA: “Estimated Time of Arrival” / “Estimated Completion Time”, i.e. the estimated time remaining for a computational process in general.
How is the ETA calculated in TLT?

For Tensor Cores, please enable them in the next release if you can, since our infrastructure includes DGX systems.

The “ETA” is calculated from the number of remaining epochs and the time the last epoch took.
For Tensor Cores: if a GPU supports them, they are available and can be enabled if needed. However, the current training jobs in the TLT2.0_dp docker do not need Tensor Cores. In the next release, some training jobs for new features will need them.
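
To make the ETA rule above concrete, here is a minimal Python sketch. It is not TLT's actual code: it assumes the ETA is simply the number of remaining epochs multiplied by the duration of the last epoch, and the function and parameter names are hypothetical, chosen only for illustration.

```python
from datetime import timedelta


def estimate_eta_seconds(total_epochs, completed_epochs, last_epoch_seconds):
    """Illustrative ETA: remaining epochs times the duration of the last epoch.

    Hypothetical helper, not part of TLT. No regression model is involved;
    the last epoch's duration is used as the per-epoch estimate.
    """
    if not 0 <= completed_epochs <= total_epochs:
        raise ValueError("completed_epochs must be between 0 and total_epochs")
    if last_epoch_seconds < 0:
        raise ValueError("last_epoch_seconds must be non-negative")
    remaining_epochs = total_epochs - completed_epochs
    return remaining_epochs * last_epoch_seconds


# Example: 80 epochs in total, 30 finished, last epoch took 120 seconds.
eta = estimate_eta_seconds(80, 30, 120.0)
print(timedelta(seconds=eta))  # 50 remaining epochs * 120 s
```

Because only the last epoch's duration is used, the estimate will shift from epoch to epoch if the epoch time varies, for example when validation runs only on some epochs.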