I am trying to install TAO on a Windows machine in order to do some benchmarks.
We are a small NSF-funded company that has a way of accelerating transfer learning, and we want to demonstrate it on several network models supported by TAO.
We had some great success with TensorFlow but are struggling with the TAO Toolkit.
The Python side installed fine (e.g., tao info --verbose gives the right output), but we are not able to get the Docker container environment to work (tao detectnet_v2 gives errors: "Docker CLI hasn't been logged in to a registry.").
It seems all of the instructions are for Linux. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
Being able to install on a variety of machines will help grow the community to include those who may be great with algorithms, but less so with installations.
We are running two Windows 10 machines, each with multicore CPUs and GPUs (a Titan X and a 1660 Ti).
Thank you!
Thank you, Morganh.
Does that mean all of the previous Python installations, user environments, and everything else need to be reinstalled? This is a blank virtual machine, so transferring existing code, drives, and data becomes an installation issue.
Thank you for your help. I am not good at installation unless the steps are very specific, and from what I understood, Docker containers were meant to simplify things, but they seem to be doing the opposite.
I have Docker running on Windows, running Ubuntu. Then am I also supposed to install Docker inside Ubuntu?
Docker-CE on Ubuntu can be set up using Docker's official convenience script:
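For reference, the convenience script can be fetched and run like this (a minimal sketch; it assumes curl is available and that you have sudo rights inside the Ubuntu environment):

```shell
# Download Docker's official convenience script and run it.
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh ./get-docker.sh

# Sanity check: the daemon should now answer.
sudo docker info
```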
For your current environment, you mention that you already start an Ubuntu Docker container and run inside it, right? Can you run "$ docker info" inside this container?
In Windows, I still suggest you install WSL first, or something else such as a Linux virtual machine.
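If WSL is not set up yet, on recent Windows 10 builds it can be installed from an elevated PowerShell (a sketch; "Ubuntu" here is the default distro name, adjust as needed):

```shell
# Run in an administrator PowerShell on Windows.
wsl --install -d Ubuntu

# After a reboot, confirm the distro is registered and check its WSL version:
wsl -l -v
```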
WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support
From inside the container (which is based on WSL):
root@56634ca0b771:/# docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
compose: Docker Compose (Docker Inc., v2.10.2)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info
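The "Cannot connect to the Docker daemon" error means the client found no daemon listening on the default socket. A quick way to check, before retrying docker info (a minimal sketch, assuming a POSIX shell inside the WSL/Ubuntu environment):

```shell
# The docker client talks to /var/run/docker.sock by default; see whether
# anything is listening there before retrying 'docker info'.
if [ -S /var/run/docker.sock ]; then
    echo "daemon socket present"
else
    echo "no daemon socket; start one with: sudo service docker start"
fi
```

If you use Docker Desktop on Windows instead of a daemon inside WSL, enabling its WSL integration for this distro (Settings > Resources > WSL Integration) shares the Docker Desktop daemon's socket into WSL.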
It seems like there is a big hole for Windows users, and this is becoming a nightmare installation. I had two weeks to run the examples and demonstrate the benchmarks, and it looks like at least a week will be burned on installation issues.
In the WSL Ubuntu, trying to run as in the directions, I get the warning:
Please get Docker Desktop from Docker Desktop - Docker
It doesn't run in WSL, and I get errors:
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.4 (legacy): can't initialize iptables table `nat': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
(exit status 3)
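One common cause of the missing nat table is the distro running under WSL 1, whose kernel lacks the netfilter modules Docker needs; Docker Engine expects WSL 2. A check from the Windows side (a sketch; "Ubuntu" is assumed to be your distro name):

```shell
# Run in PowerShell or cmd on Windows.
wsl -l -v                       # shows each distro's WSL version
wsl --set-version Ubuntu 2      # convert the distro to WSL 2 if it shows 1
```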
Thanks
Thanks for your help. It is TAO and its associated installation instructions that are directing us to Windows + WSL + Docker, which doesn't seem to be a compatible combination.
Are there other configurations that may work? Does NVIDIA have any other Docker containers with TAO, or better yet, TAO Docker containers for Windows?
Do I have to use a virtual machine?
The instructions for each step became complicated on Windows; after trying your commands, searching, and bumbling around like a fly randomly bumping into things, I finally got it working, sort of.
I seem to be running tlt; it warns that I need to update to tao, and it says it updates, but it is still tlt. When I run an ipynb like bpnet, I get stuck in weird ways.
For example, running from the script
!tao bpnet dataset_convert -m 'test' -o $DATA_DIR/val --generate_masks --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json
works and creates val-fold-000-of-001
but running the very similar
!tao bpnet dataset_convert -m 'train' -o $DATA_DIR/train --generate_masks --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json
runs similarly and does not generate an error, but it does not create the train-fold-000-of-001 file that the notebook needs.