Release Notes for NVIDIA Bright Cluster Manager 9.2-12

== General ==
=New Features=

  • Updated cm-nvhpc to version 23.3

=Fixed Issues=

  • An issue with whitelisting user groups in the /etc/security/pam_bright.d/pam_whitelist_group.conf file
  • An issue where Ubuntu 22.04 head node installations can fail when CUDA packages are selected for installation
  • An issue with starting HPL in cmburn on the SLES 15 base distribution
  • An issue where the head node installation may fail if a new custom network has been created using the graphical head node installer interface
  • An issue where the graphical head node installer interface does not validate that the network interface names specified by the user are unique
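
The uniqueness check on interface names mentioned in the last item can be illustrated with a short sketch. This is not the installer's code; the function name `duplicate_interface_names` and the sample interface names are hypothetical and only show the kind of validation involved.

```python
from collections import Counter


def duplicate_interface_names(names):
    """Return the interface names that occur more than once, sorted.

    Hypothetical helper for illustration; not part of the head node installer.
    """
    # Count each name after trimming surrounding whitespace.
    counts = Counter(name.strip() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # "eth0" is listed twice, so it is reported as a duplicate.
    print(duplicate_interface_names(["eth0", "eth1", "eth0"]))
```

An empty result means that all specified names are unique.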

== CMDaemon ==
=Improvements=

  • Added a new tool, cm-drilldown-query.py, for managing drilldown queries
  • Added a new '--all' option to the cmsh sysinfo command to show extra information that has been collected by CMDaemon

=Fixed Issues=

  • An issue with calculating the job_gpu_wasted metric when the node has multiple GPUs
  • An issue where CMDaemon does not account for the Shorewall frozen files directives when writing the Shorewall staging files
  • An issue where CMDaemon adds 127.127.1.0 to /etc/chrony.conf, which is not valid for chrony
  • An issue where, in some cases, CMDaemon crashes in the provisioning status code after a provisioning request is canceled
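
For background on the chrony item above: addresses of the form 127.127.t.u are ntpd reference-clock pseudo-addresses (127.127.1.0 is ntpd's local clock driver), and chrony does not accept them as time sources. In chrony, the comparable behavior is configured with the `local` directive instead. The following is a minimal sketch for spotting such leftover entries in a chrony configuration. The function name `find_ntpd_refclock_lines` and the sample text are hypothetical and are not part of CMDaemon.

```python
import re

# ntpd reference-clock pseudo-addresses have the form 127.127.t.u.
# Lines starting with a comment character are not matched.
NTPD_REFCLOCK = re.compile(
    r"^\s*(?:server|peer|pool|fudge)\s+(127\.127\.\d{1,3}\.\d{1,3})\b"
)


def find_ntpd_refclock_lines(conf_text):
    """Return (line_number, address) pairs for ntpd-style refclock entries.

    Hypothetical helper for illustration only.
    """
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = NTPD_REFCLOCK.match(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = (
        "pool 2.pool.ntp.org iburst\n"
        "server 127.127.1.0\n"
        "# server 127.127.1.0\n"
        "local stratum 10\n"
    )
    for lineno, address in find_ntpd_refclock_lines(sample):
        print(f"line {lineno}: {address} is an ntpd pseudo-address")
```

To check a real file, pass the contents of /etc/chrony.conf to the function.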

== Machine Learning ==
=New Features=

  • Introduced ML packages cm-cudnn8.9-cuda12.1 and cm-cudnn8.9-cuda12.0

== cm-kubernetes-setup ==
=Improvements=

  • Improved detection of the base distribution of Ubuntu 22.04 DGX software images
  • Added options to set up Kubernetes v1.25 and v1.26

== cm-scale ==
=New Features=

  • Automatically detect memory and GPUs for cloud nodes

=Improvements=

  • Added an option to use AWS nodes from a different availability zone when AWS nodes in a given availability zone cannot be started due to a lack of capacity
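
The availability zone fallback described above can be pictured roughly as follows. This is a sketch of the general idea only, not cm-scale's implementation: `CapacityError`, `start_with_zone_fallback`, the `start_node` callback, and the zone names are all hypothetical.

```python
class CapacityError(Exception):
    """Hypothetical error signalling that a zone has no capacity left."""


def start_with_zone_fallback(start_node, zones):
    """Call start_node(zone) for each zone in order until one succeeds.

    Returns the zone in which the node was started. Raises CapacityError
    if every zone reports a lack of capacity.
    """
    failed = []
    for zone in zones:
        try:
            start_node(zone)
            return zone
        except CapacityError:
            # Remember the zone and move on to the next candidate.
            failed.append(zone)
    raise CapacityError("no capacity in any of: " + ", ".join(failed))


if __name__ == "__main__":
    def fake_start(zone):
        # Simulate a capacity shortage in the first zone only.
        if zone == "us-east-1a":
            raise CapacityError(zone)

    print(start_with_zone_fallback(fake_start, ["us-east-1a", "us-east-1b"]))
```

Only capacity errors trigger the fallback in this sketch; any other exception propagates to the caller unchanged.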

=Fixed Issues=

  • Do not replace config.py on package updates

== cm-wlm-setup ==
=Fixed Issues=

  • An issue where cm-wlm-setup is unable to complete the Slurm WLM setup on the Ubuntu base distribution if the Slurm packages were previously removed using the package manager's purge option

== cmsh ==
=Fixed Issues=

  • An issue where cmsh crashes when cloning an entity without specifying a name in the genericresources submode

== jupyter ==
=Fixed Issues=

  • An issue with restarting the cmjk jupyter kernel

== slurm23.02 ==
=New Features=

  • Use pmix4 with Slurm 23.02