Difference in error handling between driver API and runtime API

So the code in "How to clear cuda errors? - #3 by njuffa" is actually problematic: although it can allocate memory successfully after the failed call, the runtime API's global error state is still polluted by the earlier failure. We need to call cudaGetLastError to clear that error before the error state is useful again.