For those who might be interested,
There seems to be a small bug in the bilateralFilter sample that comes with the SDK. In filter_kernel.cu, in the function bilateralFilterRGBA(), the gridSize should be ((width + 16 - 1) / 16, (height + 16 - 1) / 16) instead of ((width + 16 - 1) / 16, (width + 16 - 1) / 16). The second component should be computed from height, not width.
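To show the difference, here is a rough host-side sketch of the two grid calculations. It is not the SDK code: Dim2 is a stand-in for CUDA's dim3 so it compiles without the toolkit, and divUp, makeGridSize and makeGridSizeBuggy are names I made up for illustration. The block size of 16 is taken from the expressions above.

```cpp
#include <cassert>

// Hypothetical stand-in for CUDA's dim3 (only x and y are needed here).
struct Dim2 {
    unsigned int x;
    unsigned int y;
};

// Ceiling division: number of blocks of size `block` needed to cover n elements.
inline unsigned int divUp(unsigned int n, unsigned int block) {
    assert(block > 0);
    return (n + block - 1) / block;
}

// Suggested version: x is derived from width, y from height.
inline Dim2 makeGridSize(unsigned int width, unsigned int height,
                         unsigned int blockSize = 16) {
    return Dim2{divUp(width, blockSize), divUp(height, blockSize)};
}

// Version as described in the report: width is used for both components.
// height is deliberately ignored to mirror the reported expression.
inline Dim2 makeGridSizeBuggy(unsigned int width, unsigned int /*height*/,
                              unsigned int blockSize = 16) {
    return Dim2{divUp(width, blockSize), divUp(width, blockSize)};
}
```

For a 640 x 480 image the suggested version gives a 40 x 30 grid and the reported one gives 40 x 40, so extra blocks are launched (harmless only if the kernel checks its bounds). For a 480 x 640 image the suggested version gives 30 x 40 and the reported one gives 30 x 30, so the rows from 480 onward would not be covered at all.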
Thanks