I am trying to share a PyTorch tensor with another process through Numba's CUDA IPC mechanism, like this, and it gives me strange results.
When I read the handle with Numba in the other process, I sometimes get an incorrect result from the network. I added a delay after converting the tensor and it started working correctly, and I don't understand why that helps.
```python
import pickle

import numpy as np
import torch
from numba import cuda
# Numba-internal helpers; their exact module paths can differ between versions.
from numba.cuda.api import _prepare_shape_strides_dtype
from numba.cuda.cudadrv import devices, driver

backbone_features = torch.cat(backbone_features_list, dim=0)

# Describe the tensor's GPU memory via the CUDA array interface.
desc = backbone_features.__cuda_array_interface__
shape = desc["shape"]
strides = desc.get("strides")
dtype = np.dtype(desc["typestr"])
shape, strides, dtype = _prepare_shape_strides_dtype(
    shape, strides, dtype, order="C")
size = driver.memory_size_from_info(shape, strides, dtype.itemsize)

# desc["data"] is a (pointer, read_only) pair; pass only the pointer.
devptr = driver.get_devptr_for_active_ctx(desc["data"][0])
data = driver.MemoryPointer(
    cuda.current_context(), devptr, size=size, owner=backbone_features)

# Create an IPC handle for the allocation and serialize it for the consumer.
ipch = devices.get_context().get_ipc_handle(data)
desc = dict(shape=shape, strides=strides, dtype=dtype)
handle = pickle.dumps([ipch, desc])
```
Can this happen because new data is written into the buffer before the current data has been read?
I also tried the basic example using numba.cuda.api:
```python
arr = cuda.to_device(tensor)     # copies the tensor into a new Numba device array
handle = arr.get_ipc_handle()    # IpcArrayHandle for that allocation
handle = pickle.dumps(handle)    # the handle is picklable by design
```
```python
# After pickle.loads, the IpcArrayHandle is itself a context manager that maps
# the shared allocation as a device array. In my first attempt I also called
# cuda.open_ipc_array inside the with-block and read .args off the returned
# context manager, which is wrong; the handle alone is enough:
handle = pickle.loads(buf)
with handle as ipc_array:
    hary = ipc_array.copy_to_host(stream=cuda.stream())
```
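One more difference worth noting between the two attempts: `cuda.to_device(tensor)` copies the tensor into a fresh Numba allocation, so later writes to the original PyTorch tensor are not visible through that handle, while the `__cuda_array_interface__` route shares the tensor's own storage, which is exactly where a read-before-write race can show up. A CPU-side NumPy analogy of view-versus-copy semantics (illustrative only, no GPU involved):

```python
import numpy as np

src = np.arange(4)

view = src[:]        # shares storage, like the __cuda_array_interface__ route
copy = src.copy()    # independent storage, like cuda.to_device

src[0] = 99
assert view[0] == 99  # the view observes the later write
assert copy[0] == 0   # the copy keeps the old value
```

So the `to_device` example may simply never exhibit the race, because the consumer reads a snapshot rather than the live buffer.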