Leaking about 1 GB/day in kernel memory

I have a GeForce RTX 2070 Super on a Linux Certified laptop running
CentOS Linux release 8.2.2004 (Core) 
Kernel 4.18.0-193.14.2.el8_2.x86_64
NVIDIA Driver Version: 450.57

The kernel leaks memory, maybe 1 GB a day.  After a few days,
I have to reboot.  I suspect the nvidia device driver
may be involved, although I am not doing any heavy graphics.
As far as I can tell, my graphics/display is working fine.

I've been looking for release notes or bug lists that might
discuss this, but haven't found anything.  Maybe Google is
letting me down.  Could this be a known problem, with a
fix in the works?  Or maybe I've unintentionally ended up
test piloting a new hardware/software combination.

nvidia-bug-report.log.gz (1.6 MB)

No known memory leaks that drastic, at least to me. How did you determine that it's kernel memory that is leaking? Do you have statistics available?

Memory use reported by top as 'used' keeps ratcheting upwards.
Ditto memory used as reported by free.
MemAvailable reported by /proc/meminfo keeps dropping to match.
Stopping all applications, or logging out, of course reduces used and
increases MemAvailable, but not quite back to where it was the day before.
I've been wondering about this for a couple of months and think I've
eliminated tmpfs, slab memory, shared memory, and memory reported for
individual user processes.  They all look reasonable.
The "Mem:" line from free, reported every hour, even at night when the
laptop is basically idle, looks like

              total        used        free      shared  buff/cache   available
Mem:          31828       22778         619         338        8431        8252
Mem:          31828       22947         443         338        8437        8082
Mem:          31828       23049         343         338        8435        7981
Mem:          31828       23000         408         338        8420        8030
Mem:          31828       23103         377         338        8348        7927
Mem:          31828       23300         308         338        8220        7730
Mem:          31828       23147         500         338        8180        7883
Mem:          31828       23285         400         338        8143        7745
Mem:          31828       23489         275         338        8063        7541
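
To put a number on the trend instead of eyeballing it, here is a minimal
sketch that fits a straight line through the "used" column of the samples
above.  It assumes the samples are evenly spaced one hour apart (which is how
I collected them); the function names are just made up for this post, and the
result is in whatever units free was printing.

```python
# Rough trend estimate from hourly "Mem:" lines of free output.
# Assumption: samples are evenly spaced, one per hour.

SAMPLES = """\
Mem:          31828       22778         619         338        8431        8252
Mem:          31828       22947         443         338        8437        8082
Mem:          31828       23049         343         338        8435        7981
Mem:          31828       23000         408         338        8420        8030
Mem:          31828       23103         377         338        8348        7927
Mem:          31828       23300         308         338        8220        7730
Mem:          31828       23147         500         338        8180        7883
Mem:          31828       23285         400         338        8143        7745
Mem:          31828       23489         275         338        8063        7541
"""


def used_values(text):
    """Return the 'used' column (second number) of every 'Mem:' line."""
    values = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            values.append(int(fields[2]))
    return values


def growth_per_day(values, samples_per_day=24):
    """Least-squares slope through the samples, scaled to one day."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den * samples_per_day


print("growth per day: %.0f" % growth_per_day(used_values(SAMPLES)))
```

Nine hourly samples is a short window, so I would treat the result as a rough
indication only; a few days of samples would be more convincing.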

Do you know of anybody who might have tried CentOS 8, Kernel
4.18.0-193.14.2.el8_2.x86_64, and nvidia driver 450.57?  Would you
expect a combination like this to work?  I'm worried I've accidentally
gotten out on the bleeding edge.  Is anybody else seeing similar
behavior?

If there are specific statistics that might shed light I can try to
collect them.
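
One statistic I can already collect is how much memory /proc/meminfo does not
attribute to anything.  The sketch below is only a heuristic: the list of
fields I subtract is my own assumption about what counts as "accounted for",
and the function names are invented for this post.  The idea is that pages a
driver or kernel subsystem allocates directly may not show up under slab,
page cache, or any process, so a steadily growing remainder would be
consistent with a kernel-side leak.

```python
import os
import time

# Fields treated as "accounted for".  This selection is an assumption,
# not an official definition; Cached already includes shared memory/tmpfs.
ACCOUNTED = ("MemFree", "Buffers", "Cached", "AnonPages",
             "Slab", "KernelStack", "PageTables")


def parse_meminfo(text):
    """Return {field: value} for lines such as 'MemTotal:  32592000 kB'."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def unaccounted_kb(info):
    """MemTotal minus the fields listed in ACCOUNTED (missing fields count as 0)."""
    return info["MemTotal"] - sum(info.get(key, 0) for key in ACCOUNTED)


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        remainder = unaccounted_kb(parse_meminfo(f.read()))
    print(time.strftime("%Y-%m-%d %H:%M"), remainder, "kB unaccounted")
```

Run hourly from cron and appended to a file, this would give a second series
to compare against the "used" numbers from free.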

RHEL and clones like CentOS/Alibaba Linux/Scientific Linux + nvidia is a very common setup in science and compute clouds, so no problems are to be expected from that.
Please check first whether a kernel update is available by running a system update.
To start kernel memory analysis, you should look into turning on kmem tracing and using the kmemleak module. Those should give you a hint about where to look.
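
One caveat on kmemleak: it is only usable if the running kernel was built with CONFIG_DEBUG_KMEMLEAK, in which case it shows up as /sys/kernel/debug/kmemleak once debugfs is mounted. A quick way to check is to look at the kernel build config. The sketch below assumes the config is installed as /boot/config-<kernel release>, which is the usual convention on RHEL-like systems but not guaranteed; the helper name is made up for illustration.

```python
import platform


def kconfig_value(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...), or None.

    Lines like '# CONFIG_FOO is not set' are treated as unset.
    """
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


if __name__ == "__main__":
    # Assumed location of the build config for the running kernel.
    path = "/boot/config-" + platform.release()
    try:
        with open(path) as f:
            value = kconfig_value(f.read(), "CONFIG_DEBUG_KMEMLEAK")
    except OSError as err:
        print("could not read", path, "-", err)
    else:
        print("CONFIG_DEBUG_KMEMLEAK:", value or "not set")
```

If the option is not set, kmemleak cannot be switched on at runtime and a kernel built with it enabled would be needed.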

Follow-up for anybody who finds this thread.  I was unsuccessful in turning
on kmemleak (I think it has to be compiled into the kernel), and ran into
problems updating the kernel as well.

With the input that I had a standard configuration and there were no
known large nvidia memory leaks, we looked further afield.  We
installed acpid.x86_64 to clean up an nvidia warning, noticed a
hyperactive kworker/kacpid process, and started looking for an ACPI
memory leak angle.  This led us to try

   echo "disable" > /sys/firmware/acpi/interrupts/gpe6F

as a workaround for a possible ACPI memory leak and, lo and behold,
the system continued to run fine and the leak seems to have stopped.
Likely no nvidia angle to this at all!  Will continue to monitor.
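
For anybody wanting to check whether they have a similarly noisy GPE (it may
well not be gpe6F on other hardware), here is a minimal sketch that lists the
busiest counters under /sys/firmware/acpi/interrupts.  It assumes each file
holds a count followed by status flags, which is how the files look on my
system; the function names are just for illustration.

```python
import os

GPE_DIR = "/sys/firmware/acpi/interrupts"


def parse_counter(text):
    """Split '  123456  EN STS enabled unmasked' into (123456, 'EN STS enabled unmasked')."""
    fields = text.split()
    return int(fields[0]), " ".join(fields[1:])


def busiest(directory=GPE_DIR, top=5):
    """Return the `top` entries with the highest counts as (count, name, flags)."""
    rows = []
    for name in os.listdir(directory):
        try:
            with open(os.path.join(directory, name)) as f:
                count, flags = parse_counter(f.read())
        except (OSError, ValueError, IndexError):
            continue  # unreadable or unexpected format: skip it
        rows.append((count, name, flags))
    rows.sort(reverse=True)
    return rows[:top]


if __name__ == "__main__" and os.path.isdir(GPE_DIR):
    for count, name, flags in busiest():
        print("%12d  %-12s %s" % (count, name, flags))
```

Running it twice a few seconds apart shows which counter is climbing.  Also
note that the echo above is a write to sysfs, so it has to be repeated after
every reboot.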