Warning Unhealthy kubelet Startup probe failed: Get "v1/health/ready": dial tcp 10.1.124.81:8000: connect: connection refused

Hi, I'm sorry for not getting back to you until this evening.
The service has started now. Thank you very much for your help.
Here is how I solved the problem. I hope it helps others who run into it.
Note: I based my changes on the configuration for 4 GPUs.

  1. You need a proxy that can reach networks outside the country.
  2. After installing microk8s, configure the proxy in the containerd environment file and restart microk8s.
    vi /var/snap/microk8s/current/args/containerd-env
    Add your own proxy settings to this file.
  3. Complete the steps in the official tutorial.
  4. Update the configuration. This is based on the solution, with the proxy settings added to the configuration file.
    I'm sorry, I don't have permission to upload yaml files, so I added the suffix ".log". Please rename the file before using it.
    override-proxy.yaml.log (2.6 KB)
    sudo microk8s helm upgrade --install vss-blueprint nvidia-blueprint-vss-2.2.0.tgz --set global.ngcImagePullSecretName=ngc-docker-reg-secret -f override-proxy.yaml
  5. Once the pods have finished downloading the data they need, set all probe (detection) times in the configuration file, except those for vss-deployment, back to the default or to a shorter value.

    Then apply the updated configuration:
    sudo microk8s helm upgrade --install vss-blueprint nvidia-blueprint-vss-2.2.0.tgz --set global.ngcImagePullSecretName=ngc-docker-reg-secret -f override-proxy.yaml
  6. vss-vss-deployment starts only after the other pods are running. Once started, it downloads data, and the download may fail with errors.
    You can switch to a domestic image registry, or download the images manually and import them.
    The following links were provided by yuweiw, thanks again:
    containerd
    Warning Unhealthy kubelet Startup probe failed: Get "v1/health/ready": dial tcp 10.1.124.81:8000: connect: connection refused - #28 by yuweiw
    docker
    Warning Unhealthy kubelet Startup probe failed: Get "v1/health/ready": dial tcp 10.1.124.81:8000: connect: connection refused - #23 by yuweiw
    After downloading, you need to manually import the images into the containerd used by k8s.
  7. After the download is complete, remove the proxy settings from vss-engine and adjust the pod's probe times:
    sudo microk8s kubectl edit deployment vss-vss-deployment
  8. After you save the change, a new pod starts automatically. VSS may not delete the original pod. If that happens, just scale the deployment once.
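
Steps 2, 6 and 8 could be sketched as a few shell helpers. This is only a rough sketch under assumptions: `add_proxy_env`, `import_image_tar` and `rescale_vss` are hypothetical helper names, and the proxy URL, NO_PROXY list and tarball name in the usage comments are placeholders, not values from the setup described above. Restarting via `snap restart` and scaling to 0 and back to 1 are one possible reading of "restart microk8s" and "scale the pod once".

```shell
# Sketch only: helper names and all example values are assumptions.

# add_proxy_env FILE PROXY_URL NO_PROXY_LIST
# Writes proxy variables into a containerd-env style file,
# replacing earlier proxy lines so it can be re-run safely.
add_proxy_env() {
  if [ -f "$1" ]; then
    grep -v -E '^(HTTP_PROXY|HTTPS_PROXY|NO_PROXY)=' "$1" > "$1.tmp" || true
    cat "$1.tmp" > "$1"   # rewrite in place to keep the file's owner and mode
    rm -f "$1.tmp"
  fi
  printf 'HTTP_PROXY=%s\nHTTPS_PROXY=%s\nNO_PROXY=%s\n' "$2" "$2" "$3" >> "$1"
}

# import_image_tar TARBALL
# Imports an image archive (for example one made with `docker save`)
# into the containerd instance that microk8s uses.
import_image_tar() {
  sudo microk8s ctr image import "$1"
}

# rescale_vss
# Scales the deployment down and back up to get rid of a leftover pod.
rescale_vss() {
  sudo microk8s kubectl scale deployment vss-vss-deployment --replicas=0
  sudo microk8s kubectl scale deployment vss-vss-deployment --replicas=1
}

# Example usage (run as root; all values are placeholders):
#   add_proxy_env /var/snap/microk8s/current/args/containerd-env \
#       http://proxy.example.internal:3128 \
#       localhost,127.0.0.1,10.1.0.0/16,10.152.183.0/24
#   snap restart microk8s
#   import_image_tar ./image.tar
#   rescale_vss
```

The NO_PROXY list should cover the cluster's own pod and service address ranges so that in-cluster traffic does not go through the proxy. The ranges shown are only an example and may differ on your installation.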

Suggestion:
It is best to monitor the network traffic. The download can take a long time, and there is no log that shows its progress.
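
One way to watch the traffic on a Linux host is to read the interface counters in /proc/net/dev. The sketch below is a minimal example under assumptions: `rx_bytes` and `watch_rx` are hypothetical helper names, and the interface name `eth0` in the usage comment is a placeholder for whatever interface your node actually uses.

```shell
# rx_bytes IFACE [FILE]
# Prints the received-bytes counter for IFACE from /proc/net/dev
# (or from FILE, if given, in the same format).
rx_bytes() {
  awk -v ifc="$1" '{ sub(/:/, " ") } $1 == ifc { print $2 }' "${2:-/proc/net/dev}"
}

# watch_rx IFACE [SECONDS]
# Prints the average download rate in KiB/s every SECONDS (default 5)
# until interrupted with Ctrl-C.
watch_rx() {
  prev=$(rx_bytes "$1")
  [ -n "$prev" ] || { echo "interface $1 not found" >&2; return 1; }
  while sleep "${2:-5}"; do
    cur=$(rx_bytes "$1")
    echo "$1: $(( (cur - prev) / 1024 / ${2:-5} )) KiB/s"
    prev=$cur
  done
}

# Example usage (interface name is a placeholder):
#   watch_rx eth0 5
```

A rate that stays well above zero suggests the download is still running, while a rate near zero for a long time may mean it has stalled or finished.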

I don't have the resources to verify this process again, so the steps may not be fully rigorous. I just hope they give you some ideas for solving the problem.