I have a problem where, when I try to train the model, it crashes and gets stuck. I tried with workers=1, batch-size=4, and epochs=1, but it still crashes and reboots the Jetson.
The next time I tried, I got an error saying the torch module could not be found.
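For reference, this is roughly the command I'm running (based on the jetson-inference cat/dog tutorial's train.py; exact paths may differ on my setup):

``` bash
# Re-training ResNet-18 on the cat_dog dataset with reduced settings
python3 train.py --model-dir=models/cat_dog \
                 --batch-size=4 --workers=1 --epochs=1 \
                 data/cat_dog
```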
Thank you!
Hi,
Do you have PyTorch installed?
You can find the instructions below:
Below are pre-built PyTorch pip wheel installers for Jetson Nano, TX1/TX2, Xavier, and Orin with JetPack 4.2 and newer.
Download one of the PyTorch binaries from below for your version of JetPack, and see the installation instructions to run on your Jetson. These pip wheels are built for ARM aarch64 architecture, so run these commands on your Jetson (not on a host PC). You can also use the containers from jetson-containers .
**PyTorch pip wheels**

- JetPack 6
  - PyTorch v2.2.0 JetPack 6.0 DP (L4T R3…
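Once a wheel is installed, you can sanity-check it with something like this (a generic check, not tied to any particular JetPack version):

``` bash
# Confirm that torch imports and that the GPU is visible to PyTorch
python3 -c "import torch; print(torch.__version__); print('CUDA available:', torch.cuda.is_available())"
```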
Thanks.
Hello, yes I do have PyTorch installed, but it still gives the same output.
May I know what the problem is?
I tried to reinstall it, yet it still cannot launch. May I know why?
Hi @n.syafiqahme, since the error is about missing setuptools, you can try `apt-get install python3-setuptools`.
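For example (assuming you have sudo access on the Jetson):

``` bash
# Install setuptools for Python 3 from the Ubuntu repositories
sudo apt-get update
sudo apt-get install python3-setuptools
```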
Also, if you are having a lot of trouble installing PyTorch, I recommend trying the jetson-inference Docker container, which already has PyTorch etc. pre-installed in it:
# Running the Docker Container
Pre-built Docker container images for this project are hosted on [DockerHub](https://hub.docker.com/r/dustynv/jetson-inference/tags). Alternatively, you can [Build the Project](building-repo-2.md) from source.
Below are the currently available container tags:
| Container Tag | L4T version | JetPack version |
|-----------------------------------------------------------------------------------------|:-----------:|:--------------------------------:|
| [`dustynv/jetson-inference:r35.3.1`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R35.3.1 | JetPack 5.1.1 |
| [`dustynv/jetson-inference:r35.2.1`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R35.2.1 | JetPack 5.1 |
| [`dustynv/jetson-inference:r35.1.0`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R35.1.0 | JetPack 5.0.2 |
| [`dustynv/jetson-inference:r34.1.1`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R34.1.1 | JetPack 5.0.1 |
| [`dustynv/jetson-inference:r32.7.1`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R32.7.1 | JetPack 4.6.1 |
| [`dustynv/jetson-inference:r32.6.1`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R32.6.1 | JetPack 4.6 |
| [`dustynv/jetson-inference:r32.5.0`](https://hub.docker.com/r/dustynv/jetson-inference/tags) | L4T R32.5.0 | JetPack 4.5 |
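As a rough sketch (not part of the quoted docs), launching the pre-built container usually looks something like this, assuming the repo's docker/run.sh helper, which picks a tag matching your JetPack/L4T version:

``` bash
# Clone the project and start the pre-built container
git clone --recursive https://github.com/dusty-nv/jetson-inference
cd jetson-inference
docker/run.sh
```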
It seems like the cat_dog training is not working and the Jetson suddenly reboots. May I know why that is happening, even though I created swap files and I still have 80GB of space left?
Thanks
Can you keep an eye on the memory usage in another terminal window by running `tegrastats`? My guess is that it is running low on memory. Training takes a lot of memory and is a stretch to get working in 4GB of memory, so close those Chrome tabs and everything else. Ideally you would disable the Jetson's desktop entirely for this step to save additional memory and processor utilization, and SSH into it from a PC.
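For example, something along these lines (the commands are standard JetPack/Ubuntu tools, but treat the exact steps as a sketch):

``` bash
# In a second terminal (or over SSH from a PC), watch memory usage while training runs
tegrastats

# Temporarily stop the desktop to free memory and CPU (sudo init 5 restores it)
sudo init 3
```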
# Transfer Learning with PyTorch
Transfer learning is a technique for re-training a DNN model on a new dataset, which takes less time than training a network from scratch. With transfer learning, the weights of a pre-trained model are fine-tuned to classify a customized dataset. In these examples, we'll be using the <a href="https://arxiv.org/abs/1512.03385">ResNet-18</a> and [SSD-Mobilenet](pytorch-ssd.md) networks, although you can experiment with other networks too.
<p align="center"><a href="https://arxiv.org/abs/1512.03385"><img src="https://github.com/dusty-nv/jetson-inference/raw/master/docs/images/pytorch-resnet-18.png" width="600"></a></p>
Although training is typically performed on a PC, server, or cloud instance with discrete GPU(s) due to the often large datasets used and the associated computational demands, by using transfer learning we're able to re-train various networks onboard Jetson to get started with training and deploying our own DNN models.
<a href="https://pytorch.org/">PyTorch</a> is the machine learning framework that we'll be using, and example datasets and training scripts are provided below, in addition to a camera-based tool for collecting and labeling your own training datasets.
## Installing PyTorch
If you are [Running the Docker Container](aux-docker.md) or optionally chose to install PyTorch back when you [Built the Project](building-repo-2.md#installing-pytorch), it should already be installed on your Jetson. Otherwise, if you aren't using the container and want to proceed with transfer learning, you can install it now:
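The quoted install commands are cut off in the original post; as a rough sketch, installing PyTorch through the project's installer tool looks something like this (assuming you built the repo from source, so the install-pytorch.sh helper exists in the build directory):

``` bash
# Re-run the PyTorch installer tool from the jetson-inference build tree
cd jetson-inference/build
./install-pytorch.sh
```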
system (April 24, 2024, 6:26am): This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.