Win10, GTX 1050 Ti, CUDA 12.1, VS 2017
I have posted part of the error messages below.
My questions are:
- Line 481 of codebook.cu is
loop<<<grid_size, REDUCE_ADD_WIDTH, REDUCE_ADD_WIDTH * sizeof(uint64_t)>>>, so is the invalid global write located inside the __global__ function “loop”?
- Does “Address 0x180080bd60 is out of bounds” mean that the program tried to write to 0x180080bd60? If so, how can I find out which code tried to write to that address?
- As for error 719 from cudaMemcpy and cudaFree, it only occurs when running under memcheck; otherwise no error is reported. So is this really an error, or is it something related to TdrDelay?
========= Invalid __global__ write of size 4 bytes
========= at 0x13c8 in G:/cuda/codebook.cu:481:solution(etc1s_optimizer_state_tag *, int, unsigned int, const rgba *, const unsigned int *, etc1s_optimizer_solution_coordinates_tag, etc1s_optimizer_potential_solution_tag *, etc1s_optimizer_potential_solution_tag *)
========= by thread (2,0,0) in block (36,0,0)
========= Address 0x180080bd60 is out of bounds
========= and is 47,969 bytes after the nearest allocation at 0x1800800000 of size 512 bytes
========= Device Frame:G:/cuda/codebook.cu:667:fit(unsigned int, etc1s_optimizer_state_tag *, encode_etc1s_param_struct_tag, unsigned int, const rgba *, const unsigned int *) [0xf90]
========= Device Frame:G:/cuda/codebook.cu:740:cluster(encode_etc1s_param_struct_tag, const pixel_cluster_tag *, const rgba *, const unsigned int *, etc_block_tag *, int) [0x978]
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:cuEventRecordWithFlags [0x7ff972c54db8]
========= in C:\Windows\system32\DriverStore\FileRepository\nv_dispui.inf_amd64_c3352d3df1cf4d8c\nvcuda64.dll
......
......
========= Program hit cudaErrorLaunchFailure (error 719) due to "unspecified launch failure" on CUDA API call to cudaMemcpy.
......
......
========= Program hit cudaErrorLaunchFailure (error 719) due to "unspecified launch failure" on CUDA API call to cudaFree.
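
To make the second question more concrete, here is a small host-side sketch of the address arithmetic in the report. The names (`Overrun`, `analyze`, the `k...` constants) are hypothetical and do not come from codebook.cu. Treating the faulting address as an index into the nearest allocation is only an assumption: the write may have been aimed at a different buffer entirely, in which case the index below means nothing.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the report; everything else here is illustrative.
constexpr std::uint64_t kFaultAddr = 0x180080bd60ULL;  // "Address ... is out of bounds"
constexpr std::uint64_t kAllocBase = 0x1800800000ULL;  // "nearest allocation at ..."
constexpr std::uint64_t kAllocSize = 512;              // "of size 512 bytes"
constexpr std::uint64_t kWriteSize = 4;                // "write of size 4 bytes"

struct Overrun {
    std::uint64_t offset;          // bytes from the start of the allocation
    std::uint64_t past_last_byte;  // bytes after the last valid byte
    std::uint64_t index;           // element index, IF the buffer holds elem-sized items
    std::uint64_t capacity;        // number of elem-sized items that fit in the buffer
};

// Hypothetical helper: relates an out-of-bounds address to an allocation that
// lies entirely below it. ASSUMPTION: the element size equals the write size.
inline Overrun analyze(std::uint64_t addr, std::uint64_t base,
                       std::uint64_t size, std::uint64_t elem) {
    assert(elem > 0 && size > 0 && addr >= base + size);
    return Overrun{addr - base,
                   addr - (base + size - 1),
                   (addr - base) / elem,
                   size / elem};
}
```

Calling `analyze(kFaultAddr, kAllocBase, kAllocSize, kWriteSize)` gives an offset of 48,480 bytes from the base, which is 47,969 bytes after the last valid byte, matching the report. Under the assumption above, that would correspond to element index 12120 of a buffer that only holds 128 four-byte elements.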