Hello,
Some time ago (a couple of weeks maybe?) I noticed that all my programs using graphics libraries (Vulkan, SFML, Allegro) suddenly started leaking memory on exit (as reported by ASan). I was hoping the problem would go away after a system upgrade, but so far it hasn’t.
Apparently, creating and immediately destroying a Vulkan instance is enough to reproduce the leak. Below is a minimal example in C (compiled with clang vkleak.c -o vkleak -Wall -lvulkan -fsanitize=address):
#include <vulkan/vulkan.h>
#include <stdio.h>

int main(void)
{
    VkInstance inst;
    VkApplicationInfo app_info = {
        .sType = VK_STRUCTURE_TYPE_APPLICATION_INFO,
        .pNext = NULL,
        .pApplicationName = "leak_test",
        .applicationVersion = VK_MAKE_VERSION(0, 0, 1),
        .pEngineName = "leak_test",
        .engineVersion = VK_MAKE_VERSION(0, 0, 1),
        .apiVersion = VK_API_VERSION_1_0,
    };
    /* No extensions and no layers: the bare minimum to create an instance. */
    VkInstanceCreateInfo create_info = {0};
    create_info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    create_info.pNext = NULL;
    create_info.pApplicationInfo = &app_info;
    create_info.enabledExtensionCount = 0;
    create_info.ppEnabledExtensionNames = NULL;
    create_info.enabledLayerCount = 0;
    create_info.ppEnabledLayerNames = NULL;

    printf("creating Vulkan instance...\n");
    if (vkCreateInstance(&create_info, NULL, &inst) != VK_SUCCESS) {
        /* inst is not valid on failure, so it must not be destroyed. */
        printf("failed to create Vulkan instance\n");
        return 1;
    }
    vkDestroyInstance(inst, NULL);
    printf("destroyed...\n");
    return 0;
}
The sanitizer reports a loss of 262524 bytes across 1188 allocations from <unknown module>. The leaks can be traced further by preloading the following code as a shared library (clang dlclose_hack.c -o libdlclose_hack.so -Wall -shared -g). It turns dlclose into a no-op, so the driver libraries stay mapped and the sanitizer can still resolve their frames at exit:
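/* No-op replacement for dlclose: libraries stay loaded until exit. */
int dlclose(void *ptr)
{
    (void)ptr; /* handle intentionally ignored */
    return 0;
}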
Leak report with the shared library:
$ LD_PRELOAD="./libdlclose_hack.so" ./vkleak
creating Vulkan instance...
destroyed...
=================================================================
==14459==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 6304 byte(s) in 4 object(s) allocated from:
#0 0x5581abd6cfa9 in __interceptor_calloc (/home/j/Desktop/vkleak/vkleak+0xcafa9)
#1 0x7f195d66626f (/usr/lib/libnvidia-glcore.so.510.60.02+0xe6626f)
Direct leak of 1024 byte(s) in 1 object(s) allocated from:
#0 0x5581abd6d199 in __interceptor_realloc (/home/j/Desktop/vkleak/vkleak+0xcb199)
#1 0x7f195d6654aa (/usr/lib/libnvidia-glcore.so.510.60.02+0xe654aa)
Indirect leak of 143712 byte(s) in 449 object(s) allocated from:
#0 0x5581abd6cfa9 in __interceptor_calloc (/home/j/Desktop/vkleak/vkleak+0xcafa9)
#1 0x7f195d66626f (/usr/lib/libnvidia-glcore.so.510.60.02+0xe6626f)
Indirect leak of 4921 byte(s) in 372 object(s) allocated from:
#0 0x5581abd6cde9 in __interceptor_malloc (/home/j/Desktop/vkleak/vkleak+0xcade9)
#1 0x7f195d665ddc (/usr/lib/libnvidia-glcore.so.510.60.02+0xe65ddc)
Indirect leak of 752 byte(s) in 6 object(s) allocated from:
#0 0x5581abd6d199 in __interceptor_realloc (/home/j/Desktop/vkleak/vkleak+0xcb199)
#1 0x7f195d6654aa (/usr/lib/libnvidia-glcore.so.510.60.02+0xe654aa)
SUMMARY: AddressSanitizer: 156713 byte(s) leaked in 832 allocation(s).
The leaks seem to come from libnvidia-glcore.so. The number of leaked bytes differs from the first run (156713 instead of 262524), so I’m not quite sure what to make of that. Nevertheless, Valgrind notices some leaks as well:
$ LD_PRELOAD="./libdlclose_hack.so" valgrind --leak-check=full ./vkleak
==15160== Memcheck, a memory error detector
==15160== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==15160== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==15160== Command: ./vkleak
==15160==
creating Vulkan instance...
destroyed...
==15160==
==15160== HEAP SUMMARY:
==15160== in use at exit: 556,868 bytes in 3,217 blocks
==15160== total heap usage: 14,712 allocs, 11,495 frees, 710,655,137 bytes allocated
==15160==
==15160== 0 bytes in 4 blocks are definitely lost in loss record 1 of 2,618
==15160== at 0x4845899: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15160== by 0x4005492: _dl_find_object_update (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x400D8F7: dl_open_worker_begin (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4A82E17: _dl_catch_exception (in /usr/lib/libc.so.6)
==15160== by 0x400CD7A: dl_open_worker (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4A82E17: _dl_catch_exception (in /usr/lib/libc.so.6)
==15160== by 0x400D15C: _dl_open (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x49B174B: dlopen_doit (in /usr/lib/libc.so.6)
==15160== by 0x4A82E17: _dl_catch_exception (in /usr/lib/libc.so.6)
==15160== by 0x4A82EE2: _dl_catch_error (in /usr/lib/libc.so.6)
==15160== by 0x49B124D: _dlerror_run (in /usr/lib/libc.so.6)
==15160== by 0x49B17D7: dlopen@@GLIBC_2.34 (in /usr/lib/libc.so.6)
==15160==
==15160== 48 (24 direct, 24 indirect) bytes in 1 blocks are definitely lost in loss record 2,036 of 2,618
==15160== at 0x484AA83: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15160== by 0x10C6626F: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C5A030: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C6B9E8: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0xF24BD68: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF2B20A5: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF24B2E2: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0x4E9768F: ???
==15160== by 0x4005E98: call_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4005FCB: _dl_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4A82E74: _dl_catch_exception (in /usr/lib/libc.so.6)
==15160== by 0x400CDDE: dl_open_worker (in /usr/lib/ld-linux-x86-64.so.2)
==15160==
==15160== 128 bytes in 1 blocks are definitely lost in loss record 2,485 of 2,618
==15160== at 0x484AA83: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15160== by 0x10C6626F: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C5B769: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C5A1EE: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C6A5A8: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0xF24BD68: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF2B20A5: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF24B2E2: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0x4E9768F: ???
==15160== by 0x4005E98: call_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4005FCB: _dl_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4A82E74: _dl_catch_exception (in /usr/lib/libc.so.6)
==15160==
==15160== 571 (128 direct, 443 indirect) bytes in 1 blocks are definitely lost in loss record 2,535 of 2,618
==15160== at 0x484AA83: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15160== by 0x10C6626F: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C5B769: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C58718: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C59D9C: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C6A572: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0xF24BD68: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF2B20A5: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF24B2E2: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0x4E9768F: ???
==15160== by 0x4005E98: call_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4005FCB: _dl_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160==
==15160== 28,133 (6,024 direct, 22,109 indirect) bytes in 1 blocks are definitely lost in loss record 2,612 of 2,618
==15160== at 0x484AA83: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15160== by 0x10C6626F: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C5D1A5: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C58700: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C59D9C: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C6A572: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0xF24BD68: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF2B20A5: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF24B2E2: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0x4E9768F: ???
==15160== by 0x4005E98: call_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4005FCB: _dl_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160==
==15160== 127,842 (1,024 direct, 126,818 indirect) bytes in 1 blocks are definitely lost in loss record 2,618 of 2,618
==15160== at 0x484ACD3: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15160== by 0x10C654AA: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C5B800: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C58E6E: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0x10C6B80F: ??? (in /usr/lib/libnvidia-glcore.so.510.60.02)
==15160== by 0xF24BD68: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF2B20A5: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0xF24B2E2: ??? (in /usr/lib/libGLX_nvidia.so.510.60.02)
==15160== by 0x4E9768F: ???
==15160== by 0x4005E98: call_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4005FCB: _dl_init (in /usr/lib/ld-linux-x86-64.so.2)
==15160== by 0x4A82E74: _dl_catch_exception (in /usr/lib/libc.so.6)
==15160==
==15160== LEAK SUMMARY:
==15160== definitely lost: 7,328 bytes in 9 blocks
==15160== indirectly lost: 149,394 bytes in 827 blocks
==15160== possibly lost: 0 bytes in 0 blocks
==15160== still reachable: 400,114 bytes in 2,380 blocks
==15160== suppressed: 32 bytes in 1 blocks
==15160== Reachable blocks (those to which a pointer was found) are not shown.
==15160== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==15160==
==15160== For lists of detected and suppressed errors, rerun with: -s
==15160== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)
As you can see, in both cases the leaks are associated with Nvidia libraries. My friend can reproduce the leaks on his computer with a GTX 970 and the same driver version. The issue is not present on my laptop with an AMD GPU, nor on another PC with an Intel iGPU.
I am attaching the source code to reproduce the leaks with Vulkan and SFML, as well as the output of nvidia-bug-report.sh. Can you please look into this?
Thank you in advance!
nvleaks.tar (10 KB)
nvidia-bug-report.log.gz (322.0 KB)
Basic system info:
$ inxi -SGC
System:
Host: jasus Kernel: 5.15.32-1-MANJARO arch: x86_64 bits: 64
Desktop: KDE Plasma v: 5.24.4 Distro: Manjaro Linux
CPU:
Info: 6-core model: Intel Core i7-8700K bits: 64 type: MT MCP cache:
L2: 1.5 MiB
Speed (MHz): avg: 2974 min/max: 800/4700 cores: 1: 800 2: 4364 3: 4400
4: 3603 5: 3501 6: 3485 7: 2988 8: 1877 9: 1022 10: 800 11: 4479 12: 4369
Graphics:
Device-1: NVIDIA TU104 [GeForce RTX 2070 SUPER] driver: nvidia v: 510.60.02
Display: x11 server: X.Org v: 1.21.1.3 with: Xwayland v: 22.1.1 driver:
X: loaded: nvidia gpu: nvidia,nvidia-nvswitch resolution:
1: 2560x1440~144Hz 2: 2560x1440~144Hz
OpenGL: renderer: NVIDIA GeForce RTX 2070 SUPER/PCIe/SSE2
v: 4.6.0 NVIDIA 510.60.02
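
In case it helps anyone else seeing the same reports: as a stopgap, LeakSanitizer lets a program supply default suppressions through the optional __lsan_default_suppressions hook. The sketch below is an assumption about how that could be applied here, not something tested against this driver. It only hides reports whose stack contains a frame from libnvidia-glcore.so and does not fix the leak itself. Matching by library name probably requires the library to still be mapped when the report is generated, so it may have to be combined with the dlclose override above.

```c
/*
 * Optional hook that LeakSanitizer looks up at startup.
 * Each line has the form "leak:<pattern>"; the pattern is matched
 * against module, function, and source file names in the leak's stack.
 * The library name is taken from the ASan report in this post.
 */
const char *__lsan_default_suppressions(void)
{
    return "leak:libnvidia-glcore.so\n";
}
```

Adding this function to vkleak.c and rebuilding with -fsanitize=address should be enough for the runtime to pick it up. The same line can also be placed in a text file and passed with LSAN_OPTIONS=suppressions=<file> instead.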