Fifo Reset Issue in GPU

Hi Everyone,
First of all I am new to CUDA Programming. Currently, I am trying to manipulate pixels parallely using CUDA. With my logic, I am able to process images with same width and height. But I face problems when the images have different width and height(especially 1920x1080). I dont have clues regarding this problem.

My Logic,

unsigned int *d_buffer = 0;
cudaMalloc((void**)&d_buffer, dataSize);
cudaMemcpy(d_buffer, bitmapData, dataSize,cudaMemcpyHostToDevice);

dim3 threadsPerBlock(16,16);
dim3 blocksPerGrid((width + threadsPerBlock.x - 1) / threadsPerBlock.x,
    (height + threadsPerBlock.y - 1) / threadsPerBlock.y);

convertRGBtoGRAY<<<blocksPerGrid, threadsPerBlock>>>(d_buffer, size, width);
cudaDeviceSynchronize();
cudaMemcpy(outData, d_buffer, dataSize, cudaMemcpyDeviceToHost);
cudaFree(d_buffer);

===========================================================================================

My Kernel,

global void convertRGBtoGRAY(unsigned int *bitmapData, unsigned long size, unsigned int width){

           ........................................................
	unsigned long dataIndex = (xIndex + (yIndex) * width);
           ..........................................................

}

===============================================================================================

Error Log,
[18825.698583] ---- mlocks ----
[18825.704252]
[18825.707366] ---- syncpts ----
[18825.710348] id 1 (disp0_a) min 477966 max 477967
[18825.714975] id 2 (disp0_b) min 477967 max 477967
[18825.719590] id 3 (disp0_c) min 477967 max 477968
[18825.724212] id 4 (disp0_d) min 477967 max 477968
[18825.728827] id 5 (dsi) min 3 max 0
[18825.732231] id 6 (disp1_a) min 477896 max 477898
[18825.736841] id 7 (disp1_b) min 477896 max 477898
[18825.741450] id 8 (disp1_c) min 477896 max 477898
[18825.746067] id 9 (vic03.0_0) min 3969 max 3969
[18825.750501] id 11 (gk20a.0_1) min 3149342 max 3149354
[18825.755554] id 12 (gk20a.0_4) min 700849 max 700852
[18825.760432] id 13 (gk20a.0_6) min 20211 max 20211
[18825.765128] id 14 (gk20a.0_11) min 1491956 max 1491956
[18825.770307] id 15 (gk20a.0_9) min 7 max 7
[18825.774312] id 16 (gk20a.0_13) min 152362 max 152363
[18825.779268] id 17 (gk20a.0_14) min 6666 max 6666
[18825.783879] id 26 (vblank0) min 1127734 max 0
[18825.788228] id 27 (vblank1) min 1128734 max 0
[18825.792722]
[18825.794208] ---- channels ----
[18825.797254] 1 channel of tsec unmapped
[18825.800994] 1 channel of isp.0 unmapped
[18825.804820] 1 channel of isp.1 unmapped
[18825.808657] 0-vic03.0 (0): inactive

[18825.812331] 1 channel of msenc unmapped
[18825.816406] gk20a.0 pbdma 0:
[18825.819217] id: 4 (channel), next_id: 4 (channel) status: invalid
[18825.825492] PUT: 000000040160a720 GET: 0000000401609ca8 FETCH: 00000d2a HEADER: 20000df8
[18825.833571]
[18825.835059] gk20a.0 eng 0:
[18825.837678] id: 13 (channel), next_id: 4 (channel), ctx: ctxsw_switch faulted
[18825.844895] busy [18825.846818]
[18825.848303] gk20a.0 eng 1:
[18825.850916] id: 13 (channel), next_id: 13 (channel), ctx: valid [18825.856912]
[18825.858396]
[18825.859882] 0-gk20a.0, pid 140:
[18825.862935] in use idle not busy
[18825.866422] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18825.866422] HEADER: 60400000 COUNT: 00000000
[18825.866422] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18825.887674]
[18825.889160] 1-gk20a.0, pid 140:
[18825.892205] in use pending busy
[18825.895604] TOP: 80000000081012b0 PUT: 00000000081012b0 GET: 00000000081012b0 FETCH: 00000000081012b0
[18825.895604] HEADER: 60400000 COUNT: 80000000
[18825.895604] SYNCPOINT 00000000 00000b01 SEMAPHORE 00000004 000e0600 00000000 00000002
[18825.916858]
[18825.918345] 2-gk20a.0, pid 1451:
[18825.921477] in use idle not busy
[18825.924964] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18825.924964] HEADER: 60400000 COUNT: 00000000
[18825.924964] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18825.946222]
[18825.947710] 3-gk20a.0, pid 4124:
[18825.950842] in use idle not busy
[18825.954328] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18825.954328] HEADER: 60400000 COUNT: 00000000
[18825.954328] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18825.975578]
[18825.977065] 4-gk20a.0, pid 15237:
[18825.980289] in use on_eng_pending busy
[18825.984299] TOP: 8000000401609ca8 PUT: 000000040160a720 GET: 0000000401609ca8 FETCH: 000740040160a380
[18825.984299] HEADER: 20000df8 COUNT: 01110002
[18825.984299] SYNCPOINT 00074b0e 00000210 SEMAPHORE 00000004 00960600 00000000 00000002
[18826.005552] Waited on syncpt 2 (disp0_b) val 477966
[18826.010418]
[18826.011903] 5-gk20a.0, pid 930:
[18826.014952] in use idle not busy
[18826.018439] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18826.018439] HEADER: 60400000 COUNT: 00000000
[18826.018439] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.039693]
[18826.041179] 6-gk20a.0, pid 930:
[18826.044223] in use idle not busy
[18826.047711] TOP: 8000000008106ce0 PUT: 0000000008106ce0 GET: 0000000008106ce0 FETCH: 0000000008106ce0
[18826.047711] HEADER: 60400000 COUNT: 80000000
[18826.047711] SYNCPOINT 00000000 00000d01 SEMAPHORE 00000004 00120600 00000000 00000002
[18826.068966]
[18826.070452] 7-gk20a.0, pid 10165:
[18826.073674] in use idle not busy
[18826.077160] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18826.077160] HEADER: 60400000 COUNT: 00000000
[18826.077160] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.098412]
[18826.099898] 8-gk20a.0, pid 11526:
[18826.103117] in use idle not busy
[18826.106606] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18826.106606] HEADER: 60400000 COUNT: 00000000
[18826.106606] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.127853]
[18826.129339] 9-gk20a.0, pid 627:
[18826.132388] in use idle not busy
[18826.135874] TOP: 80000000081da070 PUT: 00000000081da070 GET: 00000000081da070 FETCH: 00000000081da070
[18826.135874] HEADER: 60400000 COUNT: 80000000
[18826.135874] SYNCPOINT 00000000 00000f01 SEMAPHORE 00000004 00860600 00000000 00000002
[18826.157124]
[18826.158610] 10-gk20a.0, pid 14088:
[18826.161914] in use idle not busy
[18826.165404] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18826.165404] HEADER: 60400000 COUNT: 00000000
[18826.165404] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.186651]
[18826.188136] 11-gk20a.0, pid 14088:
[18826.191444] in use idle not busy
[18826.194935] TOP: 80000000080f7190 PUT: 00000000080f7190 GET: 00000000080f7190 FETCH: 00000000080f7190
[18826.194935] HEADER: 60400000 COUNT: 80000000
[18826.194935] SYNCPOINT 00000000 00000e01 SEMAPHORE 00000004 011c0600 000006af 00001004
[18826.196985] fence timeout on [dd6ac2c0] after 500ms
[18826.196990] name=[11:gk20a.0_1], current value=3149342 waiting value=3149347
[18826.215043] fence timeout on [da605440] after 500ms
[18826.215046] name=[12:gk20a.0_4], current value=700849 waiting value=700852
[18826.239819]
[18826.241319] 12-gk20a.0, pid 15237:
[18826.244625] in use idle not busy
[18826.248112] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18826.248112] HEADER: 60400000 COUNT: 00000000
[18826.248112] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.269359]
[18826.270852] 13-gk20a.0, pid 15237:
[18826.274156] in use on_eng_pending busy
[18826.278163] TOP: 8000000100009f50 PUT: 0000000100009f50 GET: 0000000100009f50 FETCH: 0000000100009f50
[18826.278163] HEADER: 60400000 COUNT: 80000000
[18826.278163] SYNCPOINT 00000000 00001001 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.299409]
[18826.300894] 14-gk20a.0, pid 15237:
[18826.304199] in use idle not busy
[18826.307692] TOP: 80000001000f1018 PUT: 00000001000f1018 GET: 00000001000f1018 FETCH: 00000001000f1018
[18826.307692] HEADER: 60400000 COUNT: 80000000
[18826.307692] SYNCPOINT 00000000 00001101 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.328939]
[18826.330425] 15-gk20a.0, pid 15237:
[18826.333730] in use idle not busy
[18826.337216] TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
[18826.337216] HEADER: 60400000 COUNT: 00000000
[18826.337216] SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000
[18826.358468]
[18826.359969] gk20a gk20a.0: gk20a_fifo_handle_mmu_fault: mmu fault on engine 0, engine subid 0 (gpc), client 0 (l1 0), addr 0x00000000:0x9d810
[18826.359969]
[18826.378799] gk20a gk20a.0: gk20a_fifo_set_ctx_mmu_error: channel 13 generated a mmu fault
[18826.386963] gk20a gk20a.0: gk20a_set_error_notifier: error notifier set to 31
[18826.386963]
[18826.397535] gk20a gk20a.0: fifo_error_isr: channel reset initated from fifo_error_isr
[18826.408123] gk20a gk20a.0: gk20a_fifo_handle_sched_error: fifo sched error : 0x0000000a, engine=2, ch=-1
[18826.417610] gk20a gk20a.0: fifo_error_isr: channel reset initated from fifo_error_isr

Is this running on a Jetson TK1 platform?
What happens if you run your code with cuda-memcheck ?

Also, you should add proper cuda error checking to your code. If you’re not sure what that is, google “proper cuda error checking” and take the first hit.

Hi Bob,
Yes I am trying CUDA in Jetson TK1. After adding error check messages, I got the following log,

D/CUDA_CORE( 7118): [CUDA-ERROR] cudaDeviceSynchronize() returned error code 4 after launching kernel!
D/CUDA_CORE( 7118): [CUDA-ERROR] Copying memory failed!

What does this specify?

Hi Bob,
Yes I am trying CUDA in Jetson TK1. After adding error check messages, I got the following log,

D/CUDA_CORE( 7118): [CUDA-ERROR] cudaDeviceSynchronize() returned error code 4 after launching kernel!
D/CUDA_CORE( 7118): [CUDA-ERROR] Copying memory failed!

What does this specify?

Well if you actually add something like what I suggested, the error code 4 would be decoded into a nice human-readable error message.

Certainly I think it indicates that there is a problem with your code, most of which you haven’t shown. I don’t feel like looking up what error code 4 is. If you actually do “proper cuda error checking” then I won’t have to.

You can also get info on what error code 4 means by looking it up in the CUDA runtime API documentation:

http://docs.nvidia.com/cuda/index.html#axzz3hfHMbqrd

or looking it up in driver_types.h, which is in /usr/local/cuda/include on a standard linux install.

Once you discover what error code 4 means, you’ll need to start debugging your kernel code.

The method described in the answer here:

http://stackoverflow.com/questions/27277365/unspecified-launch-failure-on-memcpy

may help.