Tensorboard is not working while running Dino training

Please provide the following information when requesting support.

• Hardware (T4)
• Network Type (Dino/Yolov3)
• Training spec file (if you have one, please share it here)

• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)

When I run the tensorboard command below while my Dino training (started with the TAO launcher CLI) is in progress, I get the following errors:

(launcher) (venvs) rishi@rishi-workspace:~/tao_training/training_home/Dino/dino$ tensorboard --logdir results/ --host 0.0.0.0 --port 8080
TensorFlow installation not found - running with reduced feature set.

NOTE: Using experimental fast data loading logic. To disable, pass
    "--load_fast=false" and report issues on GitHub. More details:
    https://github.com/tensorflow/tensorboard/issues/4784

TensorBoard 2.14.0 at http://0.0.0.0:8080/ (Press CTRL+C to quit)
E0523 06:33:17.119219 139673577961216 _internal.py:97] Error on request:
Traceback (most recent call last):
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/werkzeug/serving.py", line 363, in run_wsgi
    execute(self.server.app)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/werkzeug/serving.py", line 324, in execute
    application_iter = app(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/application.py", line 528, in __call__
    return self._app(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/application.py", line 569, in wrapper
    return wsgi_app(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/security_validator.py", line 91, in __call__
    return self._application(environ, start_response_proxy)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/path_prefix.py", line 68, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/experiment_id.py", line 73, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/empty_path_redirect.py", line 43, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/client_feature_flags.py", line 55, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/auth_context_middleware.py", line 38, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/application.py", line 551, in _route_request
    return self.exact_routes[clean_path](environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/werkzeug/wrappers/request.py", line 190, in application
    resp = f(*args[:-2] + (request,))
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/plugins/hparams/hparams_plugin.py", line 122, in get_experiment_route
    json_format.MessageToJson(
TypeError: MessageToJson() got an unexpected keyword argument 'including_default_value_fields'

In the browser, I also see the message "Failed to fetch", which then disappears.

Can you advise why this is occurring and how to monitor Dino training with TensorBoard?

Please try copying the results folder to another location and rerun.

Hi

Even after moving the folder somewhere else and running the command as shown in the picture, I get the same error as before.

Hi, any advice on this? Do I need a TensorFlow installation, as the message suggests?

Could you try --logdir_spec?
Refer to TAO v3.22.05 Toolkit visualization tensorboard not running with 2 results directories - #2 by Morganh.
BTW, there is also a post about tensorboard in How to use Tensorboard with the Latest TAO Toolkit - #2 by Morganh.
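
For reference, --logdir_spec takes a comma-separated list of name:path pairs, which lets each results directory show up under its own run name. Below is a minimal Python sketch that builds such a value and prints the resulting command line. The helper names (build_logdir_spec, tensorboard_command) and the example run names and paths are hypothetical, chosen only for illustration; substitute your own results directories.

```python
import shlex


def build_logdir_spec(runs):
    """Join a {run_name: path} mapping into TensorBoard's name:path,name:path form."""
    for name, path in runs.items():
        # Commas separate entries and the first colon separates name from path,
        # so neither may appear where it would make the value ambiguous.
        if "," in name or ":" in name or "," in path:
            raise ValueError(f"unsupported character in run entry: {name!r} -> {path!r}")
    return ",".join(f"{name}:{path}" for name, path in runs.items())


def tensorboard_command(runs, host="0.0.0.0", port=8080):
    """Return the tensorboard invocation as an argument list."""
    return [
        "tensorboard",
        f"--logdir_spec={build_logdir_spec(runs)}",
        "--host", host,
        "--port", str(port),
    ]


# Hypothetical run names and placeholder paths, for illustration only.
example_runs = {
    "dino_train": "/path/to/dino/results/train",
    "dino_eval": "/path/to/dino/results/evaluate",
}
print(shlex.join(tensorboard_command(example_runs)))
```

The printed line can be pasted into the same shell environment where tensorboard is installed.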

Hi

Yes, that worked, and I can see the weight values in the graphs in the browser, so that seems fine.

But I still get this error in the terminal:

(launcher) (venvs) rishi@rishi-workspace:~/tao_training/training_home/Dino$ tensorboard --logdir_spec=dino_test:/home/rishi/tao_training/training_home/Dino/dino/results/train/ --host 0.0.0.0 --port 8080
TensorFlow installation not found - running with reduced feature set.
TensorBoard 2.14.0 at http://0.0.0.0:8080/ (Press CTRL+C to quit)
E0527 07:46:00.783579 140548493506304 _internal.py:97] Error on request:
Traceback (most recent call last):
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/werkzeug/serving.py", line 363, in run_wsgi
    execute(self.server.app)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/werkzeug/serving.py", line 324, in execute
    application_iter = app(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/application.py", line 528, in __call__
    return self._app(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/application.py", line 569, in wrapper
    return wsgi_app(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/security_validator.py", line 91, in __call__
    return self._application(environ, start_response_proxy)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/path_prefix.py", line 68, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/experiment_id.py", line 73, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/empty_path_redirect.py", line 43, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/client_feature_flags.py", line 55, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/auth_context_middleware.py", line 38, in __call__
    return self._application(environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/backend/application.py", line 551, in _route_request
    return self.exact_routes[clean_path](environ, start_response)
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/werkzeug/wrappers/request.py", line 190, in application
    resp = f(*args[:-2] + (request,))
  File "/home/rishi/miniconda3/envs/launcher/lib/python3.8/site-packages/tensorboard/plugins/hparams/hparams_plugin.py", line 122, in get_experiment_route
    json_format.MessageToJson(
TypeError: MessageToJson() got an unexpected keyword argument 'including_default_value_fields'
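
The traceback above fails inside TensorBoard's hparams plugin, where json_format.MessageToJson is called with the keyword including_default_value_fields. A TypeError of this kind usually means the installed protobuf package no longer accepts that keyword, i.e. a version mismatch between tensorboard and protobuf in the launcher environment. That cause is an assumption here and is not confirmed in this thread. A minimal sketch to check it from the same environment (the helper names accepts_keyword and check_protobuf are hypothetical):

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs parameter would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def check_protobuf():
    """Report the installed protobuf version and whether the old keyword is accepted."""
    try:
        import google.protobuf
        from google.protobuf import json_format
    except ImportError:
        return "protobuf is not installed in this environment"
    ok = accepts_keyword(json_format.MessageToJson, "including_default_value_fields")
    return (
        f"protobuf {google.protobuf.__version__}: "
        f"MessageToJson accepts including_default_value_fields = {ok}"
    )


print(check_protobuf())
```

If the check reports False, aligning the tensorboard and protobuf versions in that environment is one thing to try. The traceback shows only the hparams route failing, which would be consistent with the scalar graphs still rendering in the browser.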

So, do you mean the tensorboard visualization itself is working fine?

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks
