Please refer to this GitHub repo for more context on the error I keep getting: Accelerated ray tracing using CUDA
I know that just linking to a repo is not the right way to ask a question, so I will try to localize the error as much as I can here.
I have been extending this code and running it on my laptop (GTX 1660 Ti, CUDA 12). For the most part the code works fine, but for larger workloads, i.e. larger images or more samples per pixel, I get a sporadic failure with error code 716. Running under compute-sanitizer at that point produced this message:
========= COMPUTE-SANITIZER
========= Invalid __local__ write of size 4 bytes
=========     at 0x3e60 in D:/CudaProjects/raytracinginoneweekendincuda-ch12_where_next_cuda/material.h:52:render(vec3 *, int, int, int, camera **, hitable **, curandStateXORWOW *)
=========     by thread (4,1,0) in block (128,1,0)
=========     Address 0xfffb7a is misaligned
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:cuEventRecordWithFlags [0x7ffbc0ecc7b5]
=========                in C:\WINDOWS\system32\DriverStore\FileRepository\nvam.inf_amd64_4c9ded46d0fbe1f8\nvcuda64.dll
=========     Host Frame: [0x1d46]
=========                in D:\CudaProjects\raytracinginoneweekendincuda-ch12_where_next_cuda\a.exe
-- More --
The line it points to in the repo is just the initialization of a ray object:
scattered = ray(rec.p, target-rec.p);
At that time, after asking around a lot and getting nowhere near the root of the problem, I decided to move on and add more features to the project: more textures, materials, and new shapes. But the misalignment error never went away. It kept lurking in the background and popping up at higher iteration counts.
Now that I am reaching the end of my project, I looked into this error again and found some very weird patterns.
Here is a report that traces back to the ray.direction() function, called from the hit function of a triangle renderer that I wrote (this may be annoying, but I cannot share the whole code).
Here the grid size is 512x512 and the block size is 8x8 (for the render kernel).
========= COMPUTE-SANITIZER
Started creation and pre processing of data
Pre processing and creation of the world took 0.077
Rendering a 512x512 image
========= Invalid __local__ read of size 8 bytes
========= at 0x145a0 in C:/Users/sonas/Documents/Capstone 2022-23/raytracingcuda/ray.cuh:14:ray::direction() const
========= by thread (4,1,0) in block (135,0,0)
========= Address 0xfffae2 is misaligned
========= Device Frame:C:/Users/sonas/Documents/Capstone 2022-23/raytracingcuda/triangle.cuh:49:render(vec3 *, int, int, int, camera **, hittable **, curandStateXORWOW *) [0x145a0]
========= Device Frame:C:/Users/sonas/Documents/Capstone 2022-23/raytracingcuda/hittable_list.cuh:35:render(vec3 *, int, int, int, camera **, hittable **, curandStateXORWOW *) [0x86c0]
========= Device Frame:C:/Users/sonas/Documents/Capstone 2022-23/raytracingcuda/main.cu:33:color(const ray &, hittable **, curandStateXORWOW *, vec3 &) [0xf40]
========= Device Frame:C:/Users/sonas/Documents/Capstone 2022-23/raytracingcuda/main.cu:114:render(vec3 *, int, int, int, camera **, hittable **, curandStateXORWOW *) [0xa90]
This hit function is very similar to the hit function of the sphere renderer defined in the repo that I shared, and the line where the error was raised is simply:
vec3 r_d = r.direction();
It baffles me how a simple initialization can lead to a misalignment error, and a sporadic one at that. I tried multiple times with multiple shapes and found some interesting patterns:
- The error always happens in thread (4,1,0). The block varies every time, but since the report says "invalid local read", I guess it has something to do with the local memory allocated for each thread?
- The error always appears where I am either initializing a ray object, assigning a ray object, or accessing one of its members, e.g. r.direction().
- This one is very weird: I tried calling r.origin() first to see whether all of the members were acting up, but the error was still raised from the r.direction() function.
Here are some of the things that I tried in order to fix this error:
- Someone told me that because the vec3 class uses a float[3] array, and CUDA supports memory accesses of 1, 2, 4, 8, or 16 bytes, this mismatch may be causing the error. I tried changing the storage to float3 first and then to three separate float variables, but neither worked and the error persisted with similar messages.
- I tried putting the align(16) specifier before each class declaration as a desperate attempt to fix this sporadic error, and it still didn't work.
- I tried eliminating the member functions entirely and accessing the data members directly, but I still faced the same issue.
As you can see from these attempts, I really have no idea how to even approach fixing this problem. Every article or answer I have found deals with a very clear cause of misalignment, such as casting a pointer to a different type, and none of those cases are sporadic. I am not asking for a clear-cut answer, but if someone could point me to a resource or article that would help me learn more about this error, I would be forever grateful.
I know this was a long question. Thank you for reading it to the end, cheers!