Choosing to eschew nvue (use Linux instead of the "nvidia user experience")

I have found the nv/NVUE system quite difficult and inaccessible compared to editing files manually, especially when adjusting many settings in parallel. I also want to manage configurations in the approved Configuration as Code system rather than through a new, secret, and as yet un-auditable method.

If I choose to edit Linux files directly, as in the warning here, and never use nv commands, is the recommended way to disable the NVUE services systemctl disable --now nvued nvue-startup?

The primary reason for this post is that this is not in the documentation (that I found, anyway) alongside the warning about using the NVUE CLI. I just had a system wipe itself clean on a reboot, despite the large and bold comments in the existing config files saying that editing them would stop NVUE from updating them (obviously, that was false).

The thought expressed here would have been correct in the very early days of NVUE, but it is much more integrated into the system now. I wouldn’t recommend disabling it in the way that was suggested. Instead, I would recommend preventing it from editing the Linux flat files that will be edited and controlled by other means (see: NVUE CLI | Cumulus Linux 5.12 ). NV commands against those files will then be no-ops. The NVUE show commands will still function with this method, which could be helpful in some cases.


OK, thank you for the input on this and the link.

Could you please update the comments in the flat files to make clear that the files will be overwritten even if they are edited? Either place this link in the file, or add a comment specifying that the file must be nv set system config apply ignore’d or something similar.

It was mildly infuriating (and actually caused me to have to take a trip to the datacenter) when the file that says

# Any local modifications will prevent NVUE from re-generating this file.

was literally re-generated/changed on a reboot.

Do you have any opinions on apt upgrade vs nv system update, or whatever the new command is?

Two more questions here for the experts :)…

  1. if a ztp bash script provisions the system, I assume it would also have to run the ignore commands as well before setting up the flat files, in order to preserve them?
  2. let’s assume that in the future I want to convert from using flat files to nvue – is there a way to pull in the flat file current/existing changes into nvue so I could start using nvue in the future?

Thank you!

5.12 introduces another method too. There are now multiple partitions, so the switch can hold two CL installations and reboot between them. NVUE commands are available to install into the “B” partition while running on the “A” partition. I always recommend a binary upgrade with Cumulus Linux; using apt to do an in-place upgrade is not a good technique at scale, for a number of reasons.

1. if a ztp bash script provisions the system, I assume it would also have to run the ignore commands as well before setting up the flat files, in order to preserve them?
Yes, that would be a piece of default configuration that would need to be applied to the system during initial provisioning.
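As a rough illustration of the ordering involved, here is a small Python sketch that only builds the list of commands a provisioning step might run before writing any flat files. It assumes the ignore command takes the form nv set system config apply ignore <file> followed by nv config apply, as mentioned earlier in this thread; check the linked NVUE CLI page for the exact syntax on your release. The file list and the provisioning_plan helper are hypothetical.

```python
from typing import List


def provisioning_plan(flat_files: List[str]) -> List[List[str]]:
    """Build the commands (as argv lists) that mark files as ignored by NVUE.

    ASSUMPTION: the command form below follows what is quoted in this thread;
    verify it against the NVUE CLI documentation before using it.
    Nothing is executed here; the function only returns the plan.
    """
    plan = [
        ["nv", "set", "system", "config", "apply", "ignore", path]
        for path in flat_files
    ]
    if plan:
        # A single apply at the end commits the ignore settings.
        plan.append(["nv", "config", "apply"])
    return plan


# Hypothetical set of flat files managed outside NVUE.
files = ["/etc/network/interfaces", "/etc/frr/frr.conf"]

for argv in provisioning_plan(files):
    print(" ".join(argv))
```

The point is the sequence: the ignore settings are applied first, and only afterwards does the provisioning script drop its own flat files in place.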

2. let’s assume that in the future I want to convert from using flat files to nvue – is there a way to pull in the flat file current/existing changes into nvue so I could start using nvue in the future?
They are two different configuration mediums and there is no conversion between them.
NVUE’s interaction with Linux configuration files is unidirectional: it generates content that goes into the file but never looks at what’s in the file. We concluded early on that it would not be possible to build software to parse every possible configuration setting for 70+ Linux services, so NVUE was designed to look only at the hash of the file to identify changes.

Moving between configuration methods (Linux → NVUE) is a big change, and we would always recommend trying changes like this in a virtual setting first with VX, then in a physical lab. For someone making this change, I would recommend they consider re-provisioning the switch with a fresh OS running NVUE config natively instead of doing some kind of in-production migration, as the latter would involve many untested scenarios and bring more risk to operations.
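To make the "look only at the hash" idea concrete, here is a minimal Python sketch of hash-based change detection. It is not NVUE's implementation: the choice of SHA-256, the function names, and the sample file content are all assumptions for illustration, and the thread does not say which hash NVUE uses or where it stores it.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's content (hash choice is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def externally_modified(current_content: bytes, recorded_digest: str) -> bool:
    """True when the content no longer matches the digest recorded at generation time.

    Only digests are compared; the content itself is never parsed.
    """
    return file_digest(current_content) != recorded_digest


# Hypothetical generated file content, for illustration only.
generated = b"# generated file\nsetting = 1\n"
recorded = file_digest(generated)

# Simulate a hand edit made outside the generator.
hand_edited = generated + b"extra = 2\n"

print(externally_modified(generated, recorded))    # False
print(externally_modified(hand_edited, recorded))  # True
```

A check like this can say that a file changed but not what changed, which fits the point above that there is no path for importing flat-file edits back into NVUE.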

Are the commands to kexec between kernels/partitions available outside of NVUE?

The dilemma here is that we have a system that talks directly to the Netlink API, so we want flat file management as much as possible to stay in line with the way the control software works.

I appreciate some fancy features, and would like to use them, but we can’t have the NVUE system overwriting our changes for switches that are controlled this way.

Doesn’t nv action upgrade system packages to latest just do apt upgrades? I’m a little confused about why that isn’t recommended if there are security & bugfix patches for a minor revision that should be applied.

thank you for the informative responses, they are helping shape our internal policies around this unknown.

“are the commands to kexec between kernels/partitions available outside of nvue?”
You’re simply changing the boot partition. It’s possible to do that with standard Linux commands, of course, but those are hairy commands, and the penalty if you get them wrong is an unusable system, which is not OK in a production environment. We wrapped the low-level Linux commands with some new CL utilities (“cl-image-install”) to support that, and then further wrapped those utilities with NVUE. We document both the CL utilities and the NVUE methods here (called “optimized image upgrade”): Upgrading Cumulus Linux | Cumulus Linux 5.12

the dilemma here is that we have a system that talks directly to the netlink api, so we want flat file management as much as possible to stay in line with the way the control software works.
Cumulus Linux uses a Linux-first approach to networking, meaning everything works with the kernel, typically via Netlink APIs. This has always been true, and I see no signs of that changing. Long live Netlink!

doesn’t nv action upgrade system packages to latest just do apt upgrades? i’m a little confused at why that isn’t recommended if there are security & bugfix patches for a minor revision that should be applied.
Yes, that’s what it does. I don’t recommend APT for a number of reasons.
1) APT is external. Most of the time folks are not using APT with a local mirror, which means that using APT introduces dependencies outside of your datacenter and your control. This may be OK for a small environment, but for a large environment it is not acceptable. Hosting a local APT mirror is another piece of infrastructure that requires care and feeding. External repos can and do go down, and if that happens in the middle of your upgrade it’s not great. Besides going down, APT repos can also be updated in the middle of your upgrade, which introduces other undesirable scenarios. Best to avoid all this complexity and use the binary upgrade.
2) APT upgrade is package based. If you upgrade 150 packages, which might involve 20,000 files, on 1000 switches, it becomes a statistics problem: eventually one of these files somewhere will be either corrupted or left in some other non-deterministic state that needs to be recovered. Better to use binary install methods, where the end state is more deterministic.
3) APT is different from a new switch install. If you use binary upgrades, your new switch install process looks identical to your RMA process, which is identical to your switch upgrade process. That means you have to QA one process for all these workflows instead of two, which is a win.

These are the biggest reasons.
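The statistical argument in point 2 can be put into numbers with a short Python sketch. The 20,000 files and 1000 switches come from the post; the per-file failure probabilities are made-up values chosen only to show the shape of the curve, and the model assumes every file write fails independently with the same probability, which is a simplification.

```python
import math


def p_any_failure(p_per_file: float, files_per_switch: int, switches: int) -> float:
    """Probability that at least one file ends up in a bad state across the fleet.

    Assumes independent failures with an identical per-file probability.
    Uses log1p/expm1 so very small probabilities do not lose precision.
    """
    total_writes = files_per_switch * switches
    return -math.expm1(total_writes * math.log1p(-p_per_file))


# 20,000 files per switch and 1000 switches are the figures from the post;
# the per-file probabilities below are illustrative assumptions, not measurements.
for p in (1e-9, 1e-8, 1e-7):
    print(f"per-file p={p:.0e}: fleet-wide chance {p_any_failure(p, 20_000, 1000):.3f}")
```

With 20 million file writes in total, even a one-in-a-hundred-million chance per file works out to roughly an 18% chance that at least one switch needs recovery, and a one-in-ten-million chance pushes that to about 86%.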
