I get the same mysterious link error while building two different PyTorch C++/CUDA extensions on Windows. One is Facebook's “SparseConvNet”; the other is “Pointgroup_ops”, from a separate research group.
For the first library, the error is:
Both errors look very similar. Any insights about the cause or fix? The code is the same code that builds on Linux, so it can't be that something is genuinely undefined. ‘QEBAPEAJXZ’ is not a string that occurs anywhere in the code.
I have not found a solution that works. Any help in resolving this would be appreciated, by me and probably by others who have run into the same problem.
Thanks.
I smell big trouble. Programmers should not use long anywhere in code that is intended to be portable across platforms. On 64-bit Linux platforms, long is a 64-bit type (the same width as long long), while on 64-bit Windows platforms, long is a 32-bit type (the same width as int). CUDA maintains type size compatibility between host and device code, so this difference also applies to device code on those two platforms.
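To see the difference on a given toolchain, a small compile-time probe like the following may help. It is only a sketch: the constant names (kLongBits and so on) are made up for illustration and do not come from either library.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Width of each type in bits on the platform this is compiled for.
constexpr int kLongBits     = static_cast<int>(sizeof(long) * CHAR_BIT);
constexpr int kLongLongBits = static_cast<int>(sizeof(long long) * CHAR_BIT);
constexpr int kInt64Bits    = static_cast<int>(sizeof(std::int64_t) * CHAR_BIT);

// Whether long and std::int64_t are literally the same type here.
// This can differ between toolchains even when both types are 64 bits wide.
constexpr bool kLongIsInt64 = std::is_same<long, std::int64_t>::value;

// These hold on every conforming platform that provides std::int64_t.
static_assert(kInt64Bits == 64, "std::int64_t is exactly 64 bits");
static_assert(kLongLongBits >= 64, "long long is at least 64 bits");
```

Printing kLongBits and kLongIsInt64 from a tiny program on both the Linux and the Windows toolchain should make any mismatch visible without having to decode mangled names.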
QEBAPEAJXZ is presumably just C++ name decoration resulting from data_ptr<long>(void)const. As C++ name mangling is toolchain specific, I don’t think you would want to chase that down.
I suspect the real reason this doesn’t resolve during linking is that the long here is matched up with either int or long long elsewhere, which makes the code link fine on one platform but not another. Can you make the code base “long clean”, i.e. replace all occurrences of long with more appropriate types? Guessing at the intentions, when the programmers used long they likely meant “signed integer type wider than int”, so as an initial attempt one could try replacing all instances of long with long long. You would still need to review whether those changes are in fact appropriate.
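Here is a minimal sketch of how such a mismatch can arise. It assumes the library provides data_ptr only for a fixed set of fixed-width types; FakeTensor and first_element are hypothetical stand-ins written for this illustration, not the actual PyTorch classes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a tensor class; not the PyTorch implementation.
struct FakeTensor {
    void* storage;

    // Declared for every T, but defined below only for selected types.
    template <typename T>
    T* data_ptr() const;
};

// Definitions exist only for the fixed-width types.
template <>
inline std::int32_t* FakeTensor::data_ptr<std::int32_t>() const {
    return static_cast<std::int32_t*>(storage);
}

template <>
inline std::int64_t* FakeTensor::data_ptr<std::int64_t>() const {
    return static_cast<std::int64_t*>(storage);
}

// Portable: std::int64_t names a type that has a definition on every platform.
// Writing t.data_ptr<long>() instead would compile everywhere, but it would
// link only where long happens to be the same type as one of the types above.
inline std::int64_t first_element(const FakeTensor& t) {
    assert(t.storage != nullptr);
    return *t.data_ptr<std::int64_t>();
}
```

With this layout, a call through long finds a definition only where long and std::int64_t are the same type, which would explain code that links on one platform and fails with an unresolved symbol on another.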
If the object being referred to is an actual pointer, a portable way of expressing that as an integer is to use uintptr_t, an unsigned integer type specified by the C and C++ language standards to be able to hold any object pointer converted via void* and convert back to the same pointer. The type is formally optional but available on all mainstream platforms, and it is useful e.g. for performing bit manipulations on pointers.
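As an example of that kind of bit manipulation, here is a small sketch of an alignment check. The helper name is_aligned is made up for this illustration, and the check assumes the usual flat address space in which the integer value of a pointer reflects its address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if p is aligned to `alignment` bytes.
// `alignment` must be a nonzero power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // uintptr_t has the right width on both Linux and Windows,
    // unlike long, which is too narrow for a pointer on 64-bit Windows.
    const auto bits = reinterpret_cast<std::uintptr_t>(p);
    return (bits & (alignment - 1)) == 0;
}
```

Casting the pointer to long here instead would truncate it on 64-bit Windows, which is exactly the kind of portability problem described above.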
Thank you! That was a very insightful diagnosis!
I replaced all occurrences of “long” in both libraries with “int64_t”. The strange link error disappeared and both libraries built fine. You made my day!