It is possible for the CUDA compiler to run out of memory when compiling extremely large functions like yours. However, this should be caught, and a proper error message should be returned before the compiler terminates. It is unusual (as in: I have never seen it) for this to result in a signal 9, which is usually due to action taken by a user (e.g. someone who killed the running ptxas due to excessive runtime).
I would suggest filing a bug report with NVIDIA. As a workaround for now, try reducing the optimization level used by ptxas. The default is -Xptxas -O3, so try -Xptxas -O2, then -Xptxas -O1, etc.
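The workaround can be scripted as a loop that steps down through the ptxas optimization levels until one compiles. This is only a sketch: the file names kernel.cu and kernel.o are placeholders, and your real build will likely need additional flags (such as the target architecture) that are not shown here.

```shell
# Placeholders (assumptions): override NVCC and SRC to match your build.
NVCC="${NVCC:-nvcc}"
SRC="${SRC:-kernel.cu}"
built_with=""

# Start at the ptxas default (-O3) and step down until compilation succeeds.
for level in 3 2 1 0; do
    if "$NVCC" -Xptxas "-O$level" -c "$SRC" -o kernel.o; then
        built_with="$level"
        echo "compiled with -Xptxas -O$level"
        break
    fi
    echo "compilation failed with -Xptxas -O$level, trying a lower level" >&2
done

# An empty built_with means no level worked.
[ -n "$built_with" ] || echo "no ptxas optimization level succeeded" >&2
```

Lower ptxas optimization levels generally reduce compile time and memory use at the cost of slower generated code, so it makes sense to keep the highest level that still builds.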
A single function comprising 88K lines seems like an engineering nightmare. If there is a bug in the code, it would present a very tough debugging challenge.