Slow GPU processing and kernel crash when running segmentation code on Jetson Nano

I am using a Jetson Nano 2GB to run the segmentation example code in a Jupyter notebook. The code never finishes: it stops at out = net(inp)['out'] and this message pops up:

Kernel Restarting
The kernel for jetbot/test-seg.ipynb appears to have died. It will restart automatically.

I checked the memory usage; it is roughly 18xx.
I am wondering whether the memory is insufficient.
How can I extend the memory? (I am already using a swapfile.)

I am also wondering: since I put everything on the GPU, why is processing so slow?
Is the memory shared by the CPU and GPU?
How can I find out how much memory the GPU and CPU each have?

The code is from

https://colab.research.google.com/github/spmallick/learnopencv/blob/master/PyTorch-Segmentation-torchvision/intro-seg.ipynb#scrollTo=shnC_YQLeQ1v
import torch  # needed for torch.argmax in segment() below
from torchvision import models
fcn = models.segmentation.fcn_resnet101(pretrained=True).eval()
fcn.to("cuda")
import numpy as np
# Define the helper function
def decode_segmap(image, nc=21):
  
  label_colors = np.array([(0, 0, 0),  # 0=background
               # 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle
               (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
               # 6=bus, 7=car, 8=cat, 9=chair, 10=cow
               (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
               # 11=dining table, 12=dog, 13=horse, 14=motorbike, 15=person
               (192, 128, 0), (64, 0, 128), (192, 0, 128), (64, 128, 128), (192, 128, 128),
               # 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor
               (0, 64, 0), (128, 64, 0), (0, 192, 0), (128, 192, 0), (0, 64, 128)])

  r = np.zeros_like(image).astype(np.uint8)
  g = np.zeros_like(image).astype(np.uint8)
  b = np.zeros_like(image).astype(np.uint8)
  
  for l in range(0, nc):
    idx = image == l
    r[idx] = label_colors[l, 0]
    g[idx] = label_colors[l, 1]
    b[idx] = label_colors[l, 2]
    
  rgb = np.stack([r, g, b], axis=2)
  return rgb
from PIL import Image
import matplotlib.pyplot as plt
import torchvision.transforms as T
def segment(net, path, show_orig=True, dev='cuda'):
  img = Image.open(path)
  if show_orig: plt.imshow(img); plt.axis('off'); plt.show()
  # Comment the Resize and CenterCrop for better inference results
  trf = T.Compose([T.Resize(640), 
                   #T.CenterCrop(224), 
                   T.ToTensor(), 
                   T.Normalize(mean = [0.485, 0.456, 0.406], 
                               std = [0.229, 0.224, 0.225])])
  inp = trf(img).unsqueeze(0).to(dev)
  print(1) 
  net= net.to(dev)
  print(2)
  out = net(inp)['out']
  print(3)
  om = torch.argmax(out.squeeze(), dim=0).detach().cpu().numpy()
  print(4)
  rgb = decode_segmap(om)
  print(5)
  plt.imshow(rgb); plt.axis('off'); plt.show()

I cannot finish this cell (the horse one). Execution enters def segment(net, path, show_orig=True, dev='cuda'):
and gets stuck on out = net(inp)['out'].

!wget -nv https://www.learnopencv.com/wp-content/uploads/2021/01/horse-segmentation.jpeg -O horse.png
segment(fcn, './horse.png')

Thank you so much!!!

Hi,

Jetson’s memory is shared by the CPU and GPU.
You can add some swap, but swap is only accessible to the CPU.
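
To see how much of that shared memory and swap is in use, something like the following may help. It is only a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the helper names parse_meminfo and read_meminfo are made up for this example. Because the pool is shared, there is no separate "GPU memory" figure to read here.

```python
# Minimal sketch: read RAM and swap figures from /proc/meminfo (Linux).
# On Jetson the RAM reported here is the single pool shared by CPU and GPU.
# parse_meminfo / read_meminfo are hypothetical helper names for this example.

def parse_meminfo(text):
    """Return selected /proc/meminfo fields, converted from kB to MiB."""
    wanted = {"MemTotal", "MemFree", "MemAvailable", "SwapTotal", "SwapFree"}
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "MemTotal:        2027400 kB"
            info[key] = int(rest.split()[0]) / 1024.0
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live memory statistics of this machine."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling read_meminfo() from a notebook cell just before the failing line shows how little headroom is left. The tegrastats utility on Jetson, or free -h, report similar figures from the command line.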

Since segmentation is a relatively heavy task, it may be beyond the Nano 2GB’s capability.
Moreover, once the system starts using swap, performance decreases,
because swap is backed by storage, which has much lower bandwidth than RAM.
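
If you still want to try on the 2GB board, reducing the memory the forward pass needs is worth a shot. The sketch below is an assumption-laden variant of the segment() function from the question, not a guaranteed fix: it resizes to a smaller short side (320 instead of 640) and runs inference under torch.no_grad() so no autograd state is kept. The names segment_low_mem and resized_hw are hypothetical, and the model weights alone may still be too much for 2GB.

```python
def resized_hw(h, w, short_side):
    """Approximate size after resizing the shorter edge to `short_side`,
    keeping the aspect ratio (the behaviour of T.Resize with an int)."""
    if h <= w:
        return short_side, int(short_side * w / h)
    return int(short_side * h / w), short_side


def segment_low_mem(net, path, short_side=320, dev="cuda"):
    """Hypothetical variant of segment(): smaller input, no autograd state."""
    # Imported here so that resized_hw() can be used without PyTorch installed.
    import torch
    import torchvision.transforms as T
    from PIL import Image

    img = Image.open(path).convert("RGB")
    trf = T.Compose([T.Resize(short_side),
                     T.ToTensor(),
                     T.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])])
    inp = trf(img).unsqueeze(0).to(dev)
    net = net.to(dev).eval()
    with torch.no_grad():  # inference only: do not keep activations for backprop
        out = net(inp)['out']
    # Per-pixel class index as a NumPy array, like `om` in the question.
    return torch.argmax(out.squeeze(0), dim=0).cpu().numpy()
```

As an illustration with made-up dimensions, an 853x1280 image becomes 640x960 at short_side=640 but 320x480 at short_side=320, which is a quarter of the pixels. Activation memory in a convolutional network grows roughly with pixel count, so the saving can be significant, at the cost of a coarser segmentation.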

Thanks.

Hi,
Is there any way to extend the memory of the Jetson Nano other than swap?
Thanks.

Hi,

Unfortunately, the memory is fixed in hardware and cannot be extended.
But we do have a 4GB version of the Nano, as well as other devices with much more memory.

Thanks.