BCM10.0 – Ubuntu 24.04 (Noble) repositories missing Release files; Kubernetes setup fails (etcd, containerd, etc.)

Hello NVIDIA Team,

I’m deploying Bright Cluster Manager 10.0 (BCM 10.0) on Ubuntu 24.04 (Noble) as part of a BasePOD installation.
During the Kubernetes setup (cm-kubernetes-setup), multiple stages fail due to missing or invalid package repositories.

Error details

E: The repository 'http://updates-us-east.brightcomputing.com/deb/cm/10.0/amd64/ubuntu noble Release' does not have a Release file.
E: The repository 'http://updates-us-east.brightcomputing.com/deb/ml/10.0/amd64/ubuntu noble Release' does not have a Release file.
E: The repository 'https://pkgs.k8s.io/core:/stable:/v1.30/deb  Release' does not have a Release file.
E: The repository 'http://archive.ubuntu.com/ubuntu noble Release' does not have a Release file.
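For anyone reproducing this, the Release files can also be checked directly, outside of apt. The sketch below is only an illustration: the helper names release_url and check_release are hypothetical, and it assumes the usual suite-style layout where the file sits under dists/<suite>/Release. The pkgs.k8s.io entry appears to use a flat layout and is not covered by it.

```shell
# Hypothetical helpers (not part of BCM): build and probe a Release URL.
release_url() {
    # $1 = repository base URL, $2 = suite (for example "noble").
    # A trailing slash on the base URL is stripped before joining.
    printf '%s/dists/%s/Release\n' "${1%/}" "$2"
}

check_release() {
    # Prints the first HTTP status line for the Release file.
    # Requires curl and network access from wherever it is run.
    curl -sSI "$(release_url "$1" "$2")" | head -n 1
}
```

Example usage, run on the node (or in the environment) where apt fails:

    check_release http://archive.ubuntu.com/ubuntu noble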

As a result, the Kubernetes setup fails to complete correctly:

  • The etcd module (etcd/kube-default) is never created.

  • etcd-server is not installed on the control-plane image.

  • The containerd package is missing and the service remains DOWN or “socket not ready.”

  • Each following stage (API server bootstrap, GPU operator, etc.) depends on these missing components and fails as well.

🧰 Temporary workarounds attempted

I tried several manual fixes to move the installation forward:

  • Installed etcd-server manually inside /cm/images/k8s-control-plane-image.

  • Created /cm/shared/modulefiles/etcd/kube-default and /cm/shared/apps/etcd/bin manually.

  • Installed containerd manually and recreated /etc/containerd/config.toml.

  • Added module use /cm/shared/modulefiles to /etc/profile.d/modules.sh in the control-plane image.

However, even with these manual steps, each stage of the setup continued to encounter errors related to missing packages or uninitialized services.
The installation never reached a stable or usable Kubernetes cluster state.

🧠 Questions

  • Are updated Bright 10.0 repositories for Ubuntu 24.04 (Noble) available or planned soon?

  • If not, is there an official workaround (alternate repo endpoint or HTTPS mirror) that provides required packages such as etcd, containerd, and nvidia-gpu-operator dependencies?

  • Or is Ubuntu 22.04 (Jammy) currently the only supported base OS for Bright 10.0?

Environment

  • BCM 10.0 (NVIDIA BasePOD reference deployment)

  • Ubuntu 24.04 (Noble)

  • Nodes: bcm10-headnode, master2, knode-01

  • Bright repositories: updates-us-east.brightcomputing.com

  • Internet access: working, but repo endpoints return “Release file missing” errors.

Thanks in advance for confirming the current repository support status for Ubuntu 24.04 and for any guidance on how to proceed with a clean Kubernetes installation.

Best regards,

🔄 Update / Additional Information

After further testing, it seems that the issue may not be directly related to missing Release files, but rather to how cm-kubernetes-setup handles networking during package installation.

Here’s what we’ve confirmed:

  • Both control-plane nodes (master2, knode-01) can successfully reach the internet.

    ping archive.ubuntu.com
    curl -I http://archive.ubuntu.com/ubuntu

    Both commands succeed, so there are no DNS or routing issues on the nodes.

  • However, when running the Kubernetes setup (cm-kubernetes-setup), the process fails with:

    connect (101: Network is unreachable)
    E: The repository 'http://archive.ubuntu.com/ubuntu noble Release' does not have a Release file.

    or sometimes:

    Cannot initiate the connection to archive.ubuntu.com:80

    even though network connectivity from the nodes is fully functional.

Our investigation suggests that Bright executes apt-get update inside the chrooted image environment (/cm/images/k8s-control-plane-image) during setup.
Inside that chroot environment:

  • IPv6 is enabled but there is no IPv6 route, so apt tries IPv6 first and fails with “Network is unreachable”.

  • IPv4 works fine when tested manually.

  • /etc/resolv.conf inside the image is valid (contains 8.8.8.8), so DNS resolution is not the problem.
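One way to test the IPv6 hypothesis is to pin apt to IPv4 inside the image. The sketch below is an assumption, not an official BCM procedure: the function name write_force_ipv4 and the file name 99force-ipv4 are made up for illustration, while Acquire::ForceIPv4 itself is a standard apt option. Whether cm-kubernetes-setup preserves a file like this in the image has not been verified.

```shell
# Hypothetical helper: drop an apt config snippet into a software image
# so that apt inside that image only uses IPv4.
write_force_ipv4() {
    # $1 = image root; defaults to the image path mentioned in this thread.
    image_root="${1:-/cm/images/k8s-control-plane-image}"
    conf_dir="$image_root/etc/apt/apt.conf.d"
    mkdir -p "$conf_dir" || return 1
    # The file name is arbitrary; apt reads every file in apt.conf.d.
    printf 'Acquire::ForceIPv4 "true";\n' > "$conf_dir/99force-ipv4"
}
```

Example usage, followed by a manual check from the same place the setup runs the chroot:

    write_force_ipv4 /cm/images/k8s-control-plane-image
    chroot /cm/images/k8s-control-plane-image apt-get update

For a one-off test without writing any file, the same option can be passed on the command line:

    chroot /cm/images/k8s-control-plane-image apt-get -o Acquire::ForceIPv4=true update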

To summarize:

Networking works correctly on both nodes, but during cm-kubernetes-setup, the apt-get update executed within the Bright-managed image context fails with connection errors.

Thanks for reporting and for finding the root cause. We’ll see if we can disable IPv6 in the chrooted environment somehow.

Thanks for confirming — that’s great to hear, and I really appreciate your quick follow-up.

This issue has become urgent for us: we have a customer deployment scheduled soon that depends on a clean Kubernetes setup. At the moment, the problem inside the chroot environment completely blocks the installation. cm-kubernetes-setup cannot proceed past the apt-get update stage because of repeated connect (101: Network is unreachable) errors.

Is there any recommended temporary workaround we can apply on our side until an official fix is released?
Even a short-term method (for example, which file to modify or how to hook into the setup scripts) would be extremely helpful.

Thank you again for looking into this. Any guidance or temporary fix would be highly appreciated, as this currently blocks an upcoming customer deployment.

Best regards