Kernel doesn't run

Recently I have been writing a CUDA program on a Jetson TX2 running Ubuntu 16.04. The main program is a “.cpp” file that calls a function defined in a “.cu” file. For example, “Test.cpp” calls “” through the declaration extern "C" int simple_printf();
A shortened version of my program follows:

__global__ void simple_printf()
{
    printf("Get into kernel successful");
}

extern "C" int test()
{
    int *d_image = NULL;
    CHECK(cudaMalloc((void**)&d_image, data_size));
    CHECK(cudaMemcpy(d_image, data, data_size, cudaMemcpyHostToDevice));
    // ... (kernel launch and the rest of the function are not shown)
}


The problem: if I run with root privileges (“sudo ./Test”), the program runs, but the kernel does not appear to execute (I cannot see its output). If I run “./Test” without sudo, the error “err code 30, Unknown error” occurs on the following line:

CHECK(cudaMalloc((void**)&d_image, data_size));

The key problem is that the kernel (simple_printf) does not seem to launch…

This has troubled me for a long time. Does anyone know how to solve this problem?

Add cudaDeviceSynchronize() after the kernel call.
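
A minimal sketch of what that looks like, in case it helps. Kernel launches are asynchronous, and the output of a device-side printf is only flushed to the host at a synchronization point, so a program that returns right after the launch may print nothing. The CHECK macro below is a hypothetical stand-in (the original post uses CHECK without showing its definition), and the <<<1, 1>>> launch configuration and the main() wrapper are assumptions made only so the example builds on its own with nvcc.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-checking macro; the original CHECK is not shown in the post.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            fprintf(stderr, "CUDA error %d (%s) at %s:%d\n",     \
                    (int)err_, cudaGetErrorString(err_),         \
                    __FILE__, __LINE__);                         \
            exit(EXIT_FAILURE);                                  \
        }                                                        \
    } while (0)

__global__ void simple_printf()
{
    printf("Get into kernel successful\n");
}

extern "C" int test()
{
    // Assumed launch configuration: one block with one thread.
    simple_printf<<<1, 1>>>();

    // The launch returns immediately; this catches launch-time errors.
    CHECK(cudaGetLastError());

    // Block until the kernel has finished. This also flushes the
    // device-side printf buffer so the message reaches the terminal.
    CHECK(cudaDeviceSynchronize());
    return 0;
}

// Stand-in for Test.cpp so the sketch builds as a single file;
// drop this main() when linking against the real Test.cpp.
int main()
{
    return test();
}
```

Checking the return value of cudaDeviceSynchronize() is also useful because errors that happen while the kernel is running are reported there rather than at the launch itself.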

To saulocpp:
Thanks a lot, it works well now. I really need to learn how to use cudaDeviceSynchronize().