curand_init() within OptiX

RT_PROGRAM void pinpole_camera(){
  curandState localstate;
  curand_init(1234,launch_index,0,&localstate);
  float3 origin = ...;
  float3 direction = ...;
  optix::Ray ray = ....;
  printf(".....");
  rtTrace(top_object,ray,prd);
...
}

The weird thing is that if I comment out curand_init(), the program compiles and runs fine. With this line, it still compiles and runs normally up to printf("....."), then reports an error:

OptiX Error: Unknown error (Details: Function "RTresult _rtContextLaunch1D(RTcontext, unsigned int, RTsize)" caught exception: Encountered a CUDA error: result returned (700): Unknown, [6619204])
(sample2.cxx:68)

sample2.cxx:68 is

RT_CHECK_ERROR( rtContextLaunch1D( context, 0, width ) );

Any idea?
Thanks!

One thing to note is that curand_init might be using up a lot of your stack space. Try passing launch_index as the seed with a sequence number of 0, and see if it works any better:

curand_init(launch_index,0,0,&localstate);
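
For illustration, here is a minimal standalone CUDA sketch of the same seeding pattern outside OptiX. The kernel name sample_kernel, the buffer out, and the launch sizes are made up for this example, and idx stands in for launch_index. With a sequence number of 0, curand_init does not need to skip ahead in the generator's sequence, which is the expensive part of state setup.

```cuda
#include <cstdio>
#include <curand_kernel.h>

// Hypothetical standalone kernel (plain CUDA, not an OptiX program).
// Each thread owns one generator state, as in the camera program above.
__global__ void sample_kernel(float *out, unsigned int n)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;

    curandState state;
    // Per-thread seed, sequence 0, offset 0: no skip-ahead is required.
    // The original call had the form curand_init(1234, idx, 0, &state).
    curand_init(idx, 0, 0, &state);

    out[idx] = curand_uniform(&state);  // uniform float in (0, 1]
}

int main()
{
    const unsigned int n = 256;  // assumed launch width for the example
    float *d_out = nullptr;
    cudaMalloc(&d_out, n * sizeof(float));

    sample_kernel<<<(n + 127) / 128, 128>>>(d_out, n);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy back a few samples to confirm the kernel ran.
    float h_out[4];
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("%f %f %f %f\n", h_out[0], h_out[1], h_out[2], h_out[3]);

    cudaFree(d_out);
    return 0;
}
```

The trade-off, as far as I understand the cuRAND documentation, is that streams started from different seeds do not come with the same statistical guarantees as different sequence numbers under one seed, so this is a setup-cost versus quality choice.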

Thanks, it works now.