I’m back with an update. I basically thought, nvcc was not going to give me the pointer to those large functions, because of size limitations for non-kernels or something like that and basically gave up on that, leaving the code in it’s former state. Then I started implementing a proper raytracer (whatever I had was a basic test for octree and nothing more serious than that) and encountered something that I can only describe as a bug. The thing is, every time a kernel tried to touch an instance of a lense (that is implemented in the same manner as above), it just failed. OK… I first thought it wasn’t being uploaded properly and reviewed my code for SceneHandler (a class responsible for that and nothing else), but that test functioned properly, calling all the right functions at all the right places and giving me consistent results. Then I tried tinkering and encountered this:
when lense’s “use” function is edited like this:
/*
__dumb__ is my macro for __device__ __host__ inline
(don't ask why; being a personal project I don't really care)
*/
template<typename LenseType>
__dumb__ void LenseFunctionPack::use() {
getScreenPhotonFunction = getScreenPhotonGeneric<LenseType>;
toScreenSpaceFunction = toScreenSpaceGeneric<LenseType>;
//* "unnecessary" code I added starts here
printf("LENSE FUNCTION PACK(%p): ", this);
for (size_t i = 0; i < sizeof(LenseFunctionPack); i++)
printf(" %02X", ((char*)((void*)this))[i] & 0xFF);
printf("\n");
//*/ "unnecessary" code I added ends here
}
if I run the following sanityCheck() function:
__global__ void testSceneHandle(Scene<BakedTriFace> *scene) {
RaycastHit<Shaded<BakedTriFace> > hit;
if (scene->geometry.cast(Ray(Vertex::zero(), Vector3::one()), hit))
printf("Raycast hit something\n");
printf("Raycast hit nothing\n");
PhotonPack result;
bool noShadows;
printf("CALLING scene->lights[0].getPhoton()...\n");
scene->lights[0].getPhoton(Vertex::zero(), &noShadows, result);
printf("CALL FOR scene->lights[0].getPhoton() JUST ENDED\n");
printf("len(illuminationPhotons): %d\n", result.size());
result.clear();
scene->cameras[0].getPhoton(Vector2::zero(), result);
printf("len(screenPhotons): %d\n", result.size());
//*
LenseFunctionPack functions;
functions.use<DefaultPerspectiveLense>();
//*/
}
bool sanityCheck() {
if (cudaSetDevice(0) != cudaSuccess) {
std::cout << "SETTING DEVICE FAILED QUITE A BIT MISERABLY" << std::endl;
return false;
}
Scene<BakedTriFace> scene;
scene.lights.flush(1);
Vector3 direction = Vector3(0.2f, -0.4f, 0.7f).normalized();
scene.lights[0].use<SimpleDirectionalLight>(
Photon(Ray(-direction * 10000.0f, direction),
Color(1.0f, 1.0f, 1.0f)));
scene.cameras.flush(1);
scene.cameras[0].transform.setPosition(Vector3(0, 0, -128));
scene.cameras[0].lense.use<DefaultPerspectiveLense>(60.0f);
SceneHandler<BakedTriFace> sceneHandler(scene);
sceneHandler.uploadToEveryGPU();
testSceneHandle<<<1, 1>>>(sceneHandler.getHandleGPU(0));
bool rv;
if (cudaDeviceSynchronize() != cudaSuccess) {
std::cout << "DEVICE FAILED..." << std::endl;
rv = false;
}
else rv = true;
std::string line;
std::cout << "PRESS ENTER TO CONTINUE...";
std::getline(std::cin, line);
std::cout << std::endl << std::endl << std::endl;
return rv;
}
I get this (It’s not code; just copy-pasted whatever the console gave me):
LENSE FUNCTION PACK(000000914939F010): 50 0B FB AD F6 7F 00 00 60 12 FB AD F6 7F 00 00
LENSE FUNCTION PACK(0000000704000350): 28 00 00 00 00 00 00 00 30 00 00 00 00 00 00 00
Raycast hit nothing
CALLING scene->lights[0].getPhoton()...
LenseTest::Private::Garbage::getScreenPhoton() called on DEVICE
CALL FOR scene->lights[0].getPhoton() JUST ENDED
len(illuminationPhotons): 0
len(screenPhotons): 0
LENSE FUNCTION PACK(00000200F5FFFBB8): 08 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00
PRESS ENTER TO CONTINUE...
Here are several strange things:
- LenseTest::Private::Garbage::getScreenPhoton() is a function from the lense test and there’s no way the compiler would assign it’s address to a light’s function pack;
- LENSE FUNCTION PACK lines from the device(those are 2 and 9) don’t share the same function pointers, but that’s sort of understandable, as the compiler could compile those twice, “thanks” to the templates;
- The kernel does not fail for some reason, but if I return the LenseFunctionPack::use() to it’s original “non-printing” state, the last two lines in the kernel get the compiler optimization treatment and the kernel fails, despite the fact, that the other function pack has the same pointers, if you check them manually.
When the last two lines from the kernel are removed, the output looks like this:
LENSE FUNCTION PACK(000000328775F150): 50 0B C9 8E F7 7F 00 00 60 12 C9 8E F7 7F 00 00
LENSE FUNCTION PACK(0000000704000350): 28 00 00 00 00 00 00 00 30 00 00 00 00 00 00 00
Raycast hit nothing
CALLING scene->lights[0].getPhoton()...
CALL FOR scene->lights[0].getPhoton() JUST ENDED
len(illuminationPhotons): 0
DEVICE FAILED...
PRESS ENTER TO CONTINUE...
As you might see, the “calling another function bug thing” is gone, lense function pack contains the same pointers, but calling them crashes the kernel.
All of these happened with CUDA 9, I haven’t tried CUDA 8 yet and once I do, I’ll return with an update, if something changes.
Clearly, either there’s a bug, or just I do not understand, what’s happening here, so if anyone has any idea about what’s happening here, please give me any suggestions.