How to fix pods that are stuck in Error status

Hello. The pods nvidia-smi-admin-ops01 and gpu-operator-1680082681-node-feature-discovery-master-8dc9pt4qz went into Error status when an ephemeral-storage eviction occurred.

However, after I cleaned up the disk and freed space down to 49% usage, as shown in the picture below, the two pods are still in Error status.

What should I do to fix the two pods that are still in Error status?
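For reference, evicted pods that remain in the Failed phase can be listed across namespaces with a field selector, for example:

# Evicted pods stay in the Failed phase until they are deleted
kubectl get pods --all-namespaces --field-selector=status.phase=Failed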

The error pods

The description of the pod named nvidia-smi-admin-ops01

Name:         nvidia-smi-admin-ops01
Namespace:    default
Priority:     0
Node:         admin-ops01/192.168.101.8
Start Time:   Wed, 29 Mar 2023 08:24:37 +0000
Labels:       run=nvidia-smi-admin-ops01
Annotations:  cni.projectcalico.org/containerID: 1c5d35d135cda4f258b7a5fcf10976e8a3cc2a59cf4981882681c3aac3db526e
              cni.projectcalico.org/podIP: 
              cni.projectcalico.org/podIPs: 
Status:       Failed
Reason:       Evicted
Message:      The node was low on resource: ephemeral-storage. Container nvidia-smi-admin-ops01 was using 104Ki, which exceeds its request of 0. 
IP:           192.168.33.84
IPs:
  IP:  192.168.33.84
Containers:
  nvidia-smi-admin-ops01:
    Container ID:  containerd://996a7b896d91b2c55cf541d799bfc5f4b4656bd107498d99c23feec7d47bf086
    Image:         nvidia/cuda:11.0.3-base
    Image ID:      docker.io/nvidia/cuda@sha256:7258839ddbf814d0d6da6c730293bd4ba7b8d1455da84948bb7e4f10111a8b91
    Port:          <none>
    Host Port:     <none>
    Args:
      sleep
      infinity
    State:          Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Wed, 29 Mar 2023 09:56:10 +0000
      Finished:     Mon, 03 Apr 2023 05:59:38 +0000
    Ready:          False
    Restart Count:  4
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqf8d (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-pqf8d:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/hostname=admin-ops01
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
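(A side note on the eviction message above: after freeing the disk it is worth confirming that the node no longer reports DiskPressure. A jsonpath query like the one below prints the condition status; "False" means the pressure has cleared.)

# Check the DiskPressure condition on the node that evicted the pods
kubectl get node admin-ops01 -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'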

The description of the pod named gpu-operator-1680082681-node-feature-discovery-master-8dc9pt4qz

Name:         gpu-operator-1680082681-node-feature-discovery-master-8dc9pt4qz
Namespace:    gpu-operator
Priority:     0
Node:         admin-ops01/192.168.101.8
Start Time:   Wed, 29 Mar 2023 09:38:03 +0000
Labels:       app.kubernetes.io/instance=gpu-operator-1680082681
              app.kubernetes.io/name=node-feature-discovery
              pod-template-hash=8dc97d954
              role=master
Annotations:  cni.projectcalico.org/containerID: ab97d5111239f0a75c3a6d3441abf56c6afddfd8a16f0dc2c476de204f67cf3a
              cni.projectcalico.org/podIP: 
              cni.projectcalico.org/podIPs: 
Status:       Failed
Reason:       Evicted
Message:      The node was low on resource: ephemeral-storage. Container master was using 972Ki, which exceeds its request of 0. 
IP:           192.168.33.70
IPs:
  IP:           192.168.33.70
Controlled By:  ReplicaSet/gpu-operator-1680082681-node-feature-discovery-master-8dc97d954
Containers:
  master:
    Container ID:  containerd://679e950f4c695821b2af88f6924c086c406c55adb16e65faf17cec70f7167181
    Image:         k8s.gcr.io/nfd/node-feature-discovery:v0.10.1
    Image ID:      k8s.gcr.io/nfd/node-feature-discovery@sha256:4aebf17c8b72ee91cb468a6f21dd9f0312c1fcfdf8c86341f7aee0ec2d5991d7
    Port:          8080/TCP
    Host Port:     0/TCP
    Command:
      nfd-master
    Args:
      --extra-label-ns=nvidia.com
      -featurerules-controller=true
    State:          Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Wed, 29 Mar 2023 09:53:56 +0000
      Finished:     Mon, 03 Apr 2023 05:58:46 +0000
    Ready:          False
    Restart Count:  1
    Liveness:       exec [/usr/bin/grpc_health_probe -addr=:8080] delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:      exec [/usr/bin/grpc_health_probe -addr=:8080] delay=5s timeout=1s period=10s #success=1 #failure=10
    Environment:
      NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-7lbk7 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-7lbk7:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node-role.kubernetes.io/control-plane:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>

Excuse me, @generix. Is there any way to fix the pods that are still in Error status, even though I have already cleaned up the disk and freed up space?

IDK, reboot?
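For what it's worth, a less drastic option, as a sketch: evicted pods are not brought back once disk pressure clears; the Failed records stay until they are deleted. Deleting them lets the ReplicaSet shown above recreate the node-feature-discovery master pod, while the bare nvidia-smi pod has no controller and has to be recreated by hand. The image, args, labels, and node selector below are copied from the describe output; the ephemeral-storage request size is a placeholder.

# Delete the evicted pod records; they will not recover on their own
kubectl delete pod nvidia-smi-admin-ops01 -n default
kubectl delete pod gpu-operator-1680082681-node-feature-discovery-master-8dc9pt4qz -n gpu-operator

# The node-feature-discovery master pod is owned by a ReplicaSet and should be recreated automatically.
# The bare nvidia-smi pod is not, so recreate it manually, for example:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi-admin-ops01
  namespace: default
  labels:
    run: nvidia-smi-admin-ops01
spec:
  nodeSelector:
    kubernetes.io/hostname: admin-ops01
  containers:
  - name: nvidia-smi-admin-ops01
    image: nvidia/cuda:11.0.3-base
    args: ["sleep", "infinity"]
    resources:
      requests:
        ephemeral-storage: 100Mi   # placeholder size; a nonzero request makes the pod less likely to be evicted first under disk pressure
EOF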