CUDA Error Reporting: Why are we stuck with such bad error messages?

The OpenGL API has been around for many years now. So, after many a frustrating error message like “invalid argument”, why do we still see these same “informative” messages from CUDA?

It’s logical to ask which argument was “invalid”, and why. We know that deep in the API code there’s some “if” statement checking an argument against its legal values. Is it too much to ask that the argument name be reported?

Since the error system is asynchronous, why not report the name of the function called? (See macros below.) Every little hint helps, dear API writers.

In general, if you are going to report an error condition, report the predicate and its arguments (who, what, where, when).

After gaining the advantages of exception handling in C++, why do we have to devolve back to some error-prone API with every new hardware technology? Having to use SAFE_CALL macros really looks pretty, doesn’t it? Yes, I understand “C” as the lowest common denominator, but it’s costing me time and frustration. OpenCL looks even worse.
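For anyone who wants exceptions on top of a status-code API, here is a minimal sketch of a checked-call wrapper. Everything in it (`api_status`, `fake_copy`, `check_call`, `CHECK_CALL`, `try_copy`) is a made-up name standing in for the real runtime so that the snippet compiles without a GPU toolkit. With CUDA you would presumably compare a `cudaError_t` against `cudaSuccess` and append `cudaGetErrorString` to the message.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an API status code such as cudaError_t.
enum api_status { api_success = 0, api_invalid_value = 1 };

// Hypothetical API entry point that fails on a null pointer.
inline api_status fake_copy(void* dst, const void* src)
{
    return (dst == 0 || src == 0) ? api_invalid_value : api_success;
}

// Turn a non-success status into an exception that names the call site.
inline void check_call(api_status status, const char* expr,
                       const char* file, int line)
{
    if (status == api_success) return;
    std::ostringstream msg;
    msg << expr << " failed with status " << static_cast<int>(status)
        << " at " << file << ":" << line;
    throw std::runtime_error(msg.str());
}

// #expr stringifies the call so the message shows what was attempted.
#define CHECK_CALL(expr) check_call((expr), #expr, __FILE__, __LINE__)

// Returns the exception text, or an empty string if the call succeeded.
inline std::string try_copy(void* dst, const void* src)
{
    try {
        CHECK_CALL(fake_copy(dst, src));
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return std::string();
}
```

This only tells you which call failed and where, not which argument was at fault. That part still has to come from inside the API.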

[codebox]

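// GCC and compatible compilers: full signature of the enclosing function.
#ifdef __GNUC__
#define EXCEPTION_SCOPE __PRETTY_FUNCTION__
#endif

// Visual C++ 7.0 (_MSC_VER 1300) and later provide __FUNCSIG__.
#if _WIN32
#if _MSC_VER >= 1300
#define EXCEPTION_SCOPE __FUNCSIG__
#endif
#endif

// Fallback when the compiler offers neither.
#if !defined(EXCEPTION_SCOPE)
#define EXCEPTION_SCOPE "unknown"
#endif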

[/codebox]
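A rough sketch of how the macro might be used when throwing. `scoped_message` and `upload_buffer` are hypothetical names for illustration only, and the macro selection is repeated so the snippet compiles on its own. Note that `__PRETTY_FUNCTION__` is not a string literal under GCC, so it has to be passed as a `const char*` rather than pasted next to other literals.

```cpp
#include <stdexcept>
#include <string>

// Same selection logic as the macro above, repeated so this compiles alone.
#if defined(__GNUC__)
#define EXCEPTION_SCOPE __PRETTY_FUNCTION__
#elif defined(_MSC_VER) && _MSC_VER >= 1300
#define EXCEPTION_SCOPE __FUNCSIG__
#else
#define EXCEPTION_SCOPE "unknown"
#endif

// Hypothetical helper: builds "<scope>: <what>" for an exception message.
inline std::string scoped_message(const char* scope, const char* what)
{
    return std::string(scope) + ": " + what;
}

// Hypothetical function that rejects a null buffer and names itself.
inline void upload_buffer(const float* data)
{
    if (data == 0)
        throw std::invalid_argument(
            scoped_message(EXCEPTION_SCOPE, "data must not be null"));
}
```

The exact text of the scope string depends on the compiler, so it is better suited to logs than to anything that parses the message.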

I agree some of the error reporting in CUDA is a bit vague.

If you have specific suggestions to improve this, please file bugs.

Wouldn’t it be better to apply a set of standards for improving error reporting across the code base than to wait for customers to report issues one function at a time?

So, one rule would be that every statement of the form (roughly)

    if( predicate(arg) )
    {
        return failure_enum;
    }

should become:

    if( predicate(arg) )
    {
        msg += "predicate";
        msg += "arg";
        report_error( msg );
        return failure_enum;
    }

e.g.

    if( 0 == ptr )
    {
        std::string msg = "nil pointer for arg: ptr in function: "; // replace with 'C' string stuff
        msg += EXCEPTION_SCOPE; // the scope macro defined earlier in the thread
        report_error( msg );
        return cudaErrorInvalidValue; // the runtime's "invalid argument" code
    }

Best I can tell, the bug reporting is done via a forum.

I agree this would be useful, although I guess all those checks would have some performance overhead.

If you become a registered developer you can file bug reports:

http://developer.nvidia.com/page/registere…er_program.html

That’s what a debug flag is for I suppose :)

Yeah, I’d gladly link to a ‘DEBUG’ build of the CUDA library until I got the code working.
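A sketch of what such a debug switch could look like inside a library, assuming a build flag I have called `API_VERBOSE_ERRORS`. The flag, the `API_REQUIRE` macro, `set_block_size` and the 1024 limit are all hypothetical. The argument check itself stays in both builds, since the API has to return a failure code either way. Only the message formatting is compiled out.

```cpp
#include <cstdio>

// Hypothetical build switch; it defaults to on here so the sketch prints.
#ifndef API_VERBOSE_ERRORS
#define API_VERBOSE_ERRORS 1
#endif

#if API_VERBOSE_ERRORS
// Debug build: print the failed predicate and the enclosing function.
#define API_REQUIRE(cond, failure) \
    do { \
        if (!(cond)) { \
            std::fprintf(stderr, "%s: requirement failed: %s\n", \
                         __func__, #cond); \
            return (failure); \
        } \
    } while (0)
#else
// Release build: same control flow, no message formatting.
#define API_REQUIRE(cond, failure) \
    do { \
        if (!(cond)) return (failure); \
    } while (0)
#endif

// Hypothetical status codes for the example.
enum status_code { status_ok = 0, status_bad_argument = 1 };

// Hypothetical API function; the upper limit is made up for illustration.
inline status_code set_block_size(int threads)
{
    API_REQUIRE(threads > 0, status_bad_argument);
    API_REQUIRE(threads <= 1024, status_bad_argument);
    return status_ok;
}
```

Calling `set_block_size(0)` in the verbose build returns `status_bad_argument` and writes the failed predicate `threads > 0` to stderr, which is already more than “invalid argument”.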