Debug segfault in libnvvm

Hello!

I am having a bit of an issue: libnvvm is segfaulting on a big bitcode file and I’m not quite sure how to go about debugging it. The libnvvm verifier reports nothing; the segfault happens when compiling. Verifying the module with LLVM says it is valid too, so I am not sure what is going on, and I cannot debug libnvvm itself because it has no debug symbols.

I’m going to assume that you have some other, perhaps shorter input file that your process can handle correctly. If not, you may be simply misusing/misapplying libnvvm.

This would be roughly analogous to the case where some aspect of the compile toolchain (e.g. nvcc or one of its sub-tools) was segfaulting during compilation. I can’t speak to your case specifically, but with respect to nvcc that is never expected behavior. So the usual advice there is to file a bug. If you choose to do that, you’ll likely be asked for a full test case (just setting expectations.)

I’m not able to offer further debug suggestions other than trial-and-error/divide-and-conquer/binary search. Others may have ideas.
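
The divide-and-conquer approach mentioned above can be automated. What follows is only a minimal sketch, not anything libnvvm provides: `reduce_crashing_input` and the `crashes` callback are hypothetical names, and `crashes` is assumed to be a function you write yourself that returns True when a given list of pieces (modules, functions, and so on) still triggers the segfault.

```python
def reduce_crashing_input(items, crashes):
    """Greedily shrink `items` while `crashes(items)` stays true.

    `crashes` is a caller-supplied predicate (hypothetical here) that
    reports whether the given pieces still reproduce the failure.
    """
    items = list(items)
    if not crashes(items):
        raise ValueError("the full input must reproduce the crash")

    chunk = len(items) // 2
    while chunk >= 1:
        i = 0
        while i < len(items):
            # Try dropping items[i : i + chunk] and see if it still crashes.
            candidate = items[:i] + items[i + chunk:]
            if candidate and crashes(candidate):
                items = candidate   # the chunk was not needed; leave it out
            else:
                i += chunk          # the chunk is needed; move past it
        chunk //= 2                 # retry with finer-grained chunks
    return items
```

Each call to `crashes` costs one compile, so starting with large chunks keeps the number of compiles manageable on a big module. The result is small but not guaranteed to be the smallest possible reproducer.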

@Robert_Crovella If you are familiar with Rust, I am writing a rustc backend that generates NVVM IR so that I can write GPU kernels in Rust. The issue is that when it gets to the final stage of calling libnvvm, it segfaults. I am giving it 5 modules in total: the program, compiler_builtins (which is basically just i128 emulation functions), core (a gigantic module, about a hundred kloc or so of LLVM IR), libdevice, and another custom module which wraps compiler_builtins and emulates the i8 overflowing intrinsics using i16 LLVM intrinsics.

The issue happens only with compiler_builtins and core; the program and the others compile just fine. compiler_builtins is about 38kloc of functions that call each other, so isolating specific parts would be a challenge.
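
Because the functions call each other, simply deleting some of them leaves dangling calls. One possible way around that is to turn the functions being removed into bare declarations. The sketch below is an illustration only: `stub_out` is a hypothetical helper, it assumes textual IR in which every definition starts with `define` on one line and ends with a line holding only `}`, and it does not handle linkage keywords or other details that are valid on `define` but not on `declare`. Whether libnvvm accepts the resulting unresolved declarations is also an assumption.

```python
def stub_out(ir_text, names):
    """Replace the bodies of the named functions with bare declarations."""
    out = []
    skipping = False
    for line in ir_text.splitlines():
        if skipping:
            # A function body is assumed to end at a line holding only "}".
            if line.strip() == "}":
                skipping = False
            continue
        stripped = line.strip()
        if stripped.startswith("define ") and stripped.endswith("{"):
            # The function name sits between the first "@" and the first "(".
            name = stripped.split("@", 1)[1].split("(", 1)[0]
            if name in names:
                header = stripped[len("define "):-1].rstrip()
                out.append("declare " + header)
                skipping = True
                continue
        out.append(line)
    return "\n".join(out) + "\n"
```

For example, `stub_out(ir_text, {"some_function"})` keeps every other definition intact and leaves only a `declare` line for `some_function`.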

The reason I believe this is a bug is that it is not caught by the verifier; the compiler just segfaults when compiling. And while it is almost definitely caused by invalid IR, I believe libnvvm should handle it more gracefully. I could submit a bug report with the compiler_builtins file. It should be very easy to debug with debug symbols, since according to lldb it is trying to dereference a null pointer.

I’m not familiar with Rust. I understand the concept you present, however.

Filing a bug seems reasonable/plausible to me based on your description. I’ve already tried to set expectations. Your suggestion for debug may work. However, if they ask for a full repro case, I can’t sort that out for you.

Yeah, I understand. I’ll file a report with the full compiler_builtins file for now, and I’ll try to make a minimal test case in the meantime. Hopefully this is a simple case that can be patched so users don’t experience this in the future.

🎉 @Robert_Crovella I seem to have found a minimal reproducible test case:

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
target triple = "nvptx64-nvidia-cuda"

define void @foo() {
start:
  %0 = call i128 @cuda_test_wrapping_add(i128 10, i128 65)
  ret void
}

define i128 @cuda_test_wrapping_add(i128, i128) {
start:
  %2 = add i128 %0, %1
  ret i128 %2
}

This makes me think that this is an issue with the PTX calling convention. I read over the NVVM IR docs for this and I don’t think there is an issue, but perhaps I missed something. Either way, this should be caught by the verifier instead of yielding a segfault.

Specifying alignment explicitly does not fix it either; that’s the only thing I thought the calling convention might require for this.

Also a couple of notes:

  • the operation doesn’t matter: add, sub, etc. all segfault
  • performing the add in foo instead of calling a separate function fixes it
  • not doing the add/sub/etc. fixes it (e.g. ret i128 %0)
  • specifying alignment doesn’t fix it
  • returning %0 from foo doesn’t fix it
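
To check the first note systematically, the test case can be generated once per operation. This is a small sketch: `make_repro` and `OPS` are hypothetical names, the IR is the test case posted above with only the operation swapped, and `mul` is my own addition to the list (the notes above only name add and sub explicitly).

```python
# The IR from the minimal test case, with the operation left as a placeholder.
REPRO_TEMPLATE = """\
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
target triple = "nvptx64-nvidia-cuda"

define void @foo() {
start:
  %0 = call i128 @cuda_test_wrapping_add(i128 10, i128 65)
  ret void
}

define i128 @cuda_test_wrapping_add(i128, i128) {
start:
  %2 = @OP@ i128 %0, %1
  ret i128 %2
}
"""

# "mul" is an assumption; the thread only mentions add and sub by name.
OPS = ("add", "sub", "mul")


def make_repro(op):
    """Return the test case with the i128 operation replaced by `op`."""
    return REPRO_TEMPLATE.replace("@OP@", op)
```

Writing `make_repro(op)` to a file for each entry in `OPS` gives one input per operation to feed to libnvvm.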

I suggest that, if you file a bug, you provide a complete test case. Soup to nuts. Since libnvvm is just a library and not a standalone compiler by itself: provide the application you are using in source form, that calls libnvvm to compile this code, and provide complete instructions for building that application. And of course provide your input files.

Just suggestions.

I am using my own Rust bindings to libnvvm, and since C/C++ is the primary language used in CUDA, having to build the Rust bindings and so on would just add extra noise. However, the issue can be reproduced by just copying and pasting the LLVM IR into the libnvvm “simple” example. I have already submitted a bug report (ID 3345111).

For some reason the bug description deleted all of the newlines so it looks really bad o.O
Anyway, this happens when compiling, so just copy the LLVM IR into simple-gpu64.ll, build the sample with cmake, and run it as you normally would.
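
When running the sample repeatedly (for example while shrinking the input), it helps to run it as a child process so the segfault can be detected instead of taking down the script. A minimal sketch, assuming a POSIX system, where Python reports a child killed by a signal as a negative return code; `died_from_segfault` is a hypothetical helper name.

```python
import signal
import subprocess


def died_from_segfault(cmd):
    """Run `cmd` and report whether the child was killed by SIGSEGV.

    POSIX only: subprocess reports death by signal N as return code -N.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == -signal.SIGSEGV
```

With the sample built, something like `died_from_segfault(["./simple"])` would then tell you whether a given simple-gpu64.ll still crashes; the `./simple` path is an assumption about where your build puts the executable.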

So it seems like the issue is worse than I thought: it appears that basically any call to a function that takes i128 and does something with it inside will cause libnvvm to segfault, which means i128 is for the most part unusable :/

this may possibly be of interest

That does not really apply to this case because I have no control over what NVVM is doing, and I cannot know what is going wrong because there are no debug symbols or source code.