I want to transfer some image data to the GPU, process it, and get it back. I’m using the GPU VSIPL library and getting this error:
wrap_cudaMalloc() failed to allocate block of 99532800 bytes on device
It happens in this code, after the copy to the GPU with vsip_mcopyfrom_user_f:
vsip_mview_f *imgc;
float *imgm;
imgc = vsip_mcreate_f(height, width, VSIP_ROW, VSIP_MEM_NONE);
vsip_mcopyfrom_user_f(imgm,VSIP_ROW,imgc);
vsip_mcopyto_user_f(imgc,VSIP_ROW,imgm);
vsip_malldestroy_f(imgc);
The image data holds 3 * height * width floats: 3 (RGB channels) * 2160 (height) * 3840 (width) * 4 (sizeof float) = 99532800 bytes.
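To double-check the byte count above, here is a minimal C sketch of the same arithmetic. The helper name image_bytes is made up for illustration and is not part of VSIPL or CUDA; it assumes one float per channel per pixel.

```c
#include <stddef.h>

/* Hypothetical helper (not a VSIPL or CUDA call): number of bytes needed
   for an image stored as one float per channel per pixel. */
static size_t image_bytes(size_t channels, size_t height, size_t width)
{
    return channels * height * width * sizeof(float);
}
```

With a 4-byte float, image_bytes(3, 2160, 3840) matches the 99532800 bytes in the error message, which is about 95 MiB and well below the 512 MB on the card.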
Smaller sizes are transferred correctly; I think the limit is about 3 000 000 bytes. The same error also occurred when I used plain CUDA functions.
Also, sometimes I get an error such as “no CUDA-capable device is available” and need to reboot my OS.
KUbuntu 9.04, CUDA 2.3, GeForce 9600 GT (512 MB) - version 190.18