Issues Running Jupyter Notebook While setting up TAO

Please provide the following information when requesting support.

• Hardware (RTX 4060Ti)

Hi, I am having trouble while trying to get NVIDIA TAO working on my machine. I am currently following the steps posted here: https://docs.nvidia.com/tao/tao-toolkit/text/quick_start_guide/index.html

And then ended up at this link to get started as a beginner: https://docs.nvidia.com/tao/tao-toolkit/text/quick_start_guide/beginner.html#getting-started-as-a-beginner

I’ve been following these steps and am at the step to install the TAO launcher. I used bash to run quickstart_launcher.sh, both the install and upgrade.

I then got to the step "Running a Sample TAO Notebook" and tried to run the command:

jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root

But got this output:

$ jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root
Traceback (most recent call last):
  File "/home/porter/.local/bin/jupyter-notebook", line 5, in <module>
    from notebook.app import main
  File "/home/porter/.local/lib/python3.10/site-packages/notebook/app.py", line 12, in <module>
    from jupyter_server.base.handlers import JupyterHandler
ModuleNotFoundError: No module named 'jupyter_server'

I’ve tried restarting all my steps but ran into the exact same issue.
How should I resolve this? Has anyone else had this issue? I think I followed all steps exactly.

Thanks,
Andrew

Please try
$ sudo apt install jupyter-notebook

After running

sudo apt install jupyter-notebook

and then running the jupyter command again, I still got the same output:

$ jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root
Traceback (most recent call last):
  File "/home/porter/.local/bin/jupyter-notebook", line 5, in <module>
    from notebook.app import main
  File "/home/porter/.local/lib/python3.10/site-packages/notebook/app.py", line 12, in <module>
    from jupyter_server.base.handlers import JupyterHandler
ModuleNotFoundError: No module named 'jupyter_server'

Please try the following:

  1. Install jupyter_server:
pip install jupyter_server
  2. If the problem persists, try updating Jupyter Notebook:
pip install --upgrade jupyter notebook
  3. If you’re still experiencing issues, it might be due to a compatibility problem with the traitlets library. Downgrade traitlets to version 5.9.0:
pip uninstall traitlets
pip install traitlets==5.9.0
  4. If none of the above steps work, perform a clean reinstallation of Jupyter Notebook:
pip uninstall jupyter notebook
pip install jupyter notebook
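
Before and after each of these steps, it can help to confirm which modules the interpreter behind `jupyter` can actually import. The following is a minimal sketch, assuming it is run with the same Python that the failing script uses (the traceback points at python3.10 under ~/.local). `missing_modules` is a hypothetical helper name, not part of Jupyter:

```python
import importlib.util
import sys

def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    # Modules named in the traceback above.
    wanted = ["notebook", "jupyter_server"]
    print("interpreter:", sys.executable)
    print("missing:", missing_modules(wanted))
```

An empty "missing" list means the import error is probably coming from a different interpreter than the one used to run this check.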

Thanks Morgan. I ran:

pip install jupyter_server

outside my conda environment, then I was able to run jupyter notebook.
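
One way to see why installing outside the conda environment made a difference is to print where the active interpreter and the `jupyter` script live. The first traceback shows the script under /home/porter/.local/bin, which is the per-user install location rather than a conda environment. This is a minimal sketch using only the standard library; `environment_report` is a hypothetical helper name:

```python
import shutil
import site
import sys

def environment_report():
    """Collect the paths that decide which packages `jupyter` will see."""
    return {
        "interpreter": sys.executable,             # Python running this script
        "prefix": sys.prefix,                      # root of the active environment
        "user_site": site.getusersitepackages(),   # per-user site-packages (e.g. under ~/.local)
        "jupyter_on_path": shutil.which("jupyter"),  # None if no jupyter script is on PATH
    }

for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running it once inside the conda environment and once outside, then comparing the output, shows whether the two shells resolve `jupyter` to the same place.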

I did get this dependency error however, not sure if this will cause errors in the future.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyterlab-server 2.27.3 requires requests>=2.31, but you have requests 2.25.1 which is incompatible.

I found that I was able to use the Jupyter notebook until I got to the step to train the model. It looked like it was downloading something, but it then froze and the notebook crashed. I got this error:

[W 2025-03-05 09:49:03.824 ServerApp] WebSocket ping timeout after 116770 ms.
[W 2025-03-05 09:49:03.873 ServerApp] 
[W 2025-03-05 09:49:03.873 ServerApp] 
[W 2025-03-05 09:49:03.969 ServerApp] 
[W 2025-03-05 09:49:04.075 ServerApp] 
[W 2025-03-05 09:49:04.128 ServerApp] 
[W 2025-03-05 09:49:04.231 ServerApp] 
[W 2025-03-05 09:49:04.333 ServerApp] 
[W 2025-03-05 09:49:04.487 ServerApp] 
[W 2025-03-05 09:49:04.589 ServerApp] 

After that I shut down my Jupyter server and tried to re-open it. I got this output now:

jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root
Traceback (most recent call last):
  File "/home/porter/.local/bin/jupyter-notebook", line 5, in <module>
    from notebook.app import main
  File "/home/porter/.local/lib/python3.10/site-packages/notebook/app.py", line 17, in <module>
    from jupyter_server.serverapp import flags
  File "/home/porter/.local/lib/python3.10/site-packages/jupyter_server/serverapp.py", line 108, in <module>
    from jupyter_server.gateway.managers import (
  File "/home/porter/.local/lib/python3.10/site-packages/jupyter_server/gateway/managers.py", line 16, in <module>
    import websocket
ModuleNotFoundError: No module named 'websocket'

So I then tried:

pip install jupyter_server

But I then get a dependency error:

Installing collected packages: websocket-client
  Attempting uninstall: websocket-client
    Found existing installation: websocket-client 0.57.0
    Uninstalling websocket-client-0.57.0:
      Successfully uninstalled websocket-client-0.57.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
nvidia-tao 5.5.1 requires websocket-client==0.57.0, but you have websocket-client 1.8.0 which is incompatible.
Successfully installed websocket-client-1.8.0

So I then downgraded to websocket-client 0.57.0, but this message came up:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyter-server 2.15.0 requires websocket-client>=1.7, but you have websocket-client 0.57.0 which is incompatible.
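
The two messages above pull in opposite directions: one requirement pins websocket-client to exactly 0.57.0, the other needs 1.7 or newer. This is a minimal sketch that checks a few candidate versions against both requirements as quoted in the pip output. The helper names are hypothetical and the version parsing is deliberately simplistic (plain dotted integers only):

```python
def parse_version(text):
    """Turn '1.8.0' into (1, 8, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def satisfies_both(candidate):
    """Check a websocket-client version against both quoted requirements."""
    version = parse_version(candidate)
    tao_ok = version == parse_version("0.57.0")    # nvidia-tao 5.5.1: ==0.57.0
    jupyter_ok = version >= parse_version("1.7")   # jupyter-server 2.15.0: >=1.7
    return tao_ok and jupyter_ok

for candidate in ["0.57.0", "1.7.0", "1.8.0"]:
    print(candidate, satisfies_both(candidate))
```

None of the candidates passes, which suggests that installing these two exact package versions into the same environment will always leave one of them unsatisfied.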

I then installed jupyter-server==2.4.0 and no warning popped up. However when I went to start the jupyter notebook I got the same issue I was having before where there was no module named ‘jupyter_server’

So I exited my conda environment to pip install jupyter_server as I had done before. However, this time I got this message:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyterlab-server 2.27.3 requires requests>=2.31, but you have requests 2.25.1 which is incompatible.

So I then installed requests==2.31. I was then able to run the jupyter notebook again.

I am wondering why this seemed to happen… perhaps when I ran:

!pip3 install nvidia-tao

in the jupyter notebook…

But once I got back to the training step, it looked like it was downloading for a while, but then my Firefox tab crashed. I could see this in the output of the cell. Any idea why this popped up?

Traceback (most recent call last):
  File "/home/porter/miniconda3/envs/launcher/bin/tao", line 8, in <module>
    sys.exit(main())
  File "/home/porter/miniconda3/envs/launcher/lib/python3.10/site-packages/nvidia_tao_cli/entrypoint/tao_launcher.py", line 134, in main
    instance.launch_command(
  File "/home/porter/miniconda3/envs/launcher/lib/python3.10/site-packages/nvidia_tao_cli/components/instance_handler/local_instance.py", line 382, in launch_command
    docker_handler.run_container(command)
  File "/home/porter/miniconda3/envs/launcher/lib/python3.10/site-packages/nvidia_tao_cli/components/docker_handler/docker_handler.py", line 325, in run_container
    self.pull()
  File "/home/porter/miniconda3/envs/launcher/lib/python3.10/site-packages/nvidia_tao_cli/components/docker_handler/docker_handler.py", line 187, in pull
    docker_pull_progress(line, progress)
  File "/home/porter/miniconda3/envs/launcher/lib/python3.10/site-packages/nvidia_tao_cli/components/docker_handler/docker_handler.py", line 66, in docker_pull_progress
    TASKS[idx] = progress.add_task(f"{idx}", total=line['progressDetail']['total'])
KeyError: 'total'
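
The last frame of the traceback indexes `line['progressDetail']['total']`, so the KeyError means a pull-progress event arrived without a `total` field. This is a minimal sketch of a more defensive lookup, assuming the events are dicts like the ones Docker streams during a pull. `pull_total` and the sample events are hypothetical, and this is not the TAO launcher's actual code:

```python
def pull_total(line):
    """Return the layer size from a pull-progress event, or None if absent."""
    # Some events carry an empty progressDetail, and some carry none at all.
    detail = line.get("progressDetail") or {}
    return detail.get("total")

# Hypothetical events for illustration only.
events = [
    {"status": "Downloading", "id": "layer-a",
     "progressDetail": {"current": 10, "total": 100}},
    {"status": "Pulling fs layer", "id": "layer-b", "progressDetail": {}},
    {"status": "Pull complete", "id": "layer-c"},
]

for event in events:
    print(event["status"], pull_total(event))
```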

Could you try running the notebook inside a TAO Docker container instead?
$docker run --runtime=nvidia -it --rm --ipc=host --gpus all --name webinar -d -v /localhome/local-xxx:/localhome/local-xxx -p 8888:8888 -v /var/run/docker.sock:/var/run/docker.sock -m 1100G --oom-kill-disable --ulimit memlock=-1 nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt

apt-get install sudo

sudo apt-get update

sudo apt install docker.io

sudo ls -la /var/run/docker.sock

$docker login nvcr.io
Username: $oauthtoken
Password:

git clone https://github.com/NVIDIA/tao_tutorials

After modification, start the notebook:
root@851cd3d21645:/localhome/local-xxx/webinar# jupyter notebook --ip 0.0.0.0 --allow-root

Open a browser and go to:

http://127.0.0.1:8888

Then, enter the token.

Thanks for the reply. I’m sorry to admit, but I am a bit unfamiliar with exactly how docker works. I ran the exact command:

$docker run --runtime=nvidia -it --rm --ipc=host --gpus all --name webinar -d -v /localhome/local-xxx:/localhome/local-xxx -p 8888:8888 -v /var/run/docker.sock:/var/run/docker.sock -m 1100G --oom-kill-disable --ulimit memlock=-1 nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt

And got this output:

Digest: sha256:d0d24bc5608832246ed6f7f768b8dbbe429e0e41c580582a0b89606bb9e752a9
Status: Downloaded newer image for nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt
WARNING: Your kernel does not support OomKillDisable. OomKillDisable discarded.
13b0fd6df8e578f46d8ff8bbf7719cbc63f03331c1488812857e1f01fab68d6b

I then ran:

apt-get install sudo

And got this output:

E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?

I then ran these 2 commands:

sudo apt-get update
sudo apt install docker.io

And got the following output with installing docker.io:

Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help resolve the situation:

The following packages have unmet dependencies:
 containerd.io : Conflicts: containerd
E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.

I then ran the command and got this output:

sudo ls -la /var/run/docker.sock
srw-rw---- 1 root docker 0 Mar  6 09:57 /var/run/docker.sock

I was able to successfully log in to docker, and then clone the GitHub repository. But then how do I actually run the container? Really sorry for my lack of understanding of how to run this. I am still quite new to Linux.

Andrew

Hi, after running
$docker run --runtime=nvidia -it --rm --ipc=host --gpus all --name webinar -d -v /localhome/local-xxx:/localhome/local-xxx -p 8888:8888 -v /var/run/docker.sock:/var/run/docker.sock -m 1100G --oom-kill-disable --ulimit memlock=-1 nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt

run the command below to open a shell inside the running container. The remaining commands are meant to be run from that shell.
$ docker exec -it webinar /bin/bash

Thanks Morgan,

It looks like TAO is running properly now. Just for my own information, and in case anyone else is reading this thread, there were a few additional steps I did.

First, when running the docker run command, I mapped another volume using -v. The /home/porter/test location is where I saved my data and cloned the GitHub repository. I also renamed the container from webinar to taotest (this wasn’t necessary, I just did this for my own naming convention). So the command I ran was:

$ docker run --runtime=nvidia -it --rm --ipc=host --gpus all --name taotest -d -v /localhome/local-xxx:/localhome/local-xxx -p 8888:8888 -v /var/run/docker.sock:/var/run/docker.sock -v /home/porter/test:/home/porter/test -m 1100G --oom-kill-disable --ulimit memlock=-1 nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt

After that, I ran:

$ docker exec -it taotest /bin/bash
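
To make the pieces of those two commands easier to see, here is a minimal sketch that assembles them as argument lists and prints them without running anything. `docker_run_command` is a hypothetical helper, and it deliberately leaves out the memory-related options (-m, --oom-kill-disable, --ulimit) and the /localhome mount from the command above to stay short:

```python
import shlex

def docker_run_command(name, image, mounts, port=8888):
    """Build the argv for a detached, GPU-enabled container (does not run it)."""
    argv = ["docker", "run", "--runtime=nvidia", "-it", "--rm", "--ipc=host",
            "--gpus", "all", "--name", name, "-d", "-p", f"{port}:{port}"]
    for host_path, container_path in mounts:
        # Each -v makes a host directory visible at a path inside the container.
        argv += ["-v", f"{host_path}:{container_path}"]
    argv.append(image)
    return argv

cmd = docker_run_command(
    name="taotest",
    image="nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt",
    mounts=[("/var/run/docker.sock", "/var/run/docker.sock"),
            ("/home/porter/test", "/home/porter/test")],
)
print(shlex.join(cmd))
# The container name given to `run` is what `exec` uses to attach a shell.
print(shlex.join(["docker", "exec", "-it", "taotest", "/bin/bash"]))
```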

Once I was in the container, I ran the other commands Morgan specified:

apt-get install sudo

sudo apt-get update

sudo apt install docker.io

sudo ls -la /var/run/docker.sock

$docker login nvcr.io
Username: $oauthtoken
Password: (specifying your own password)

I then navigated to my mapped volume of /home/porter/test, and cloned the github repository:

cd
cd /home/porter/test
git clone https://github.com/NVIDIA/tao_tutorials

I then triggered the notebook by running:

jupyter notebook --ip 0.0.0.0 --allow-root

Then opening the browser at http://127.0.0.1:8888

Then enter the token generated in your terminal:

http://hostname:8888/?token=a6e78bada2a5i203e5b63xaad5554fe39bxe9382f23b329a
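
The URL printed in the terminal uses the container's hostname, while the browser on the host connects through the published port on 127.0.0.1. This is a minimal sketch of swapping the host while keeping the token, using only the standard library. The helper names are hypothetical and the token in the example is a placeholder:

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit

def local_url(printed_url, host="127.0.0.1"):
    """Replace the hostname in the printed URL, keeping port, path and token."""
    parts = urlsplit(printed_url)
    netloc = f"{host}:{parts.port}" if parts.port else host
    return urlunsplit(parts._replace(netloc=netloc))

def token_of(printed_url):
    """Extract the token query parameter, or None if there is none."""
    return parse_qs(urlsplit(printed_url).query).get("token", [None])[0]

printed = "http://hostname:8888/?token=abc123"  # placeholder token
print(local_url(printed))
print(token_of(printed))
```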

Hope this helps anyone else who’s running into this issue!
