Checking for type of thrust::system_error

Like most people, I have a wrapper function for checking for CUDA errors:

#include <sstream>
#include <thrust/system/cuda/error.h>
#include <thrust/system_error.h>

#define cudaCheck(code) throw_on_cuda_error((code), __FILE__, __LINE__)

inline void
throw_on_cuda_error(cudaError_t code, const char* file, int line)
{
    if ( code != cudaSuccess )
    {
        std::stringstream ss;
        ss << file << "(" << line << ")";
        throw thrust::system_error(code, thrust::cuda_category(), ss.str());
    }
}

I have a gtest unit test for some code that should error:

    EXPECT_THROW(
            {
                auto constexpr capacity = 0U;
                auto buf                = PinnedHostBuffer<char>(capacity);
            },
            thrust::system_error);

This test fails. The exception actually thrown is thrust::THRUST_200500___CUDA_ARCH_LIST___NS::system::system_error.

It doesn’t bother me that the types are different (I’ve dug into the headers and roughly follow the namespace versioning scheme), but I don’t understand why, if I throw thrust::system_error, I can’t catch thrust::system_error.

Am I doing something foolish? Is there a better way?

The general mechanism seems to work fine for me, without google test framework (and without whatever PinnedHostBuffer<> is). Here is an example on CUDA 12.2:

# cat t310.cu
#include <thrust/system_error.h>
#include <thrust/system/cuda/error.h>
#include <sstream>

void throw_on_cuda_error(cudaError_t code, const char *file, int line)
{
  if(code != cudaSuccess)
  {
    std::stringstream ss;
    ss << file << "(" << line << ")";
    // str() keeps the whole message; operator>> would stop at the first space in a path
    std::string file_and_line = ss.str();
    throw thrust::system_error(code, thrust::cuda_category(), file_and_line);
  }
}

#include <iostream>

int main()
{
  try
  {
    // do something crazy
    throw_on_cuda_error(cudaSetDevice(-1), __FILE__, __LINE__);
  }
  catch(thrust::system_error &e)
  {
    std::cerr << "CUDA error after cudaSetDevice: " << e.what() << std::endl;

    // oops, recover
    cudaSetDevice(0);
  }

  return 0;
}
# nvcc -o t310 t310.cu
# ./t310
CUDA error after cudaSetDevice: t310.cu(24): cudaErrorInvalidDevice: invalid device ordinal
#

I wouldn’t be able to help with gtest, if that is a factor in this.

You mentioned 12.2. This test passed with earlier versions of CUDA; it started failing after I recently upgraded my toolchain to 12.6. I should have said so in the original message.

Taking gtest out of the equation, I tried:

try {
    // something that throws `thrust::system_error`
}
catch (thrust::system_error const& e)
{
    std::cout << "caught thrust exception: " << e.what() << std::endl;
}
catch(...)
{
    std::cout << "caught something else" << std::endl;
}

The “something else” message printed.
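When a handler falls through to catch (...), it can help to find out what was really thrown. The following is a minimal sketch, assuming the exception derives from std::exception (thrust::system_error does, via std::runtime_error). The type demo::v2::system_error and the helper names are hypothetical stand-ins, and the demangling step relies on abi::__cxa_demangle, which is available with GCC and Clang.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeinfo>
#if defined(__GNUG__)
#include <cxxabi.h>
#endif

// Returns a readable name for the dynamic type of `e`.
inline std::string dynamic_type_name(const std::exception& e)
{
    const char* mangled = typeid(e).name();
#if defined(__GNUG__)
    int status = 0;
    std::unique_ptr<char, void (*)(void*)> demangled(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    if (status == 0 && demangled) return demangled.get();
#endif
    return mangled;  // fall back to the implementation-defined name
}

// Hypothetical stand-in for an exception whose fully qualified name
// is not the one the catch site expects.
namespace demo { namespace v2 {
struct system_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};
} }

// Catch by the common base class and report the real type.
inline std::string identify_thrown_type()
{
    try { throw demo::v2::system_error("boom"); }
    catch (const std::exception& e) { return dynamic_type_name(e); }
    catch (...) { return "not derived from std::exception"; }
}

// Quick self-check; call it from main().
inline void check_identify()
{
    assert(identify_thrown_type().find("system_error") != std::string::npos);
}
```

Placing a handler like this ahead of catch (...) shows the fully qualified type, including any versioned namespace, without needing gtest's failure message.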

Well, bother. This test:

TEST(thrust, system_error)
{
    try
    {
        throw thrust::system_error(1, thrust::cuda_category(), "a thrust error");

    } catch ( thrust::system_error const& e )
    {
        std::cout << "caught thrust exception: " << e.what() << std::endl;
    } catch ( ... )
    {
        std::cout << "caught something else" << std::endl;
    }
}

actually catches the thrust::system_error. But inside the EXPECT_THROW macro, the same logic falls through to the catch (...) case.

Thanks for looking.

Playing with this further, I find that the issue is (somehow) related to compilation unit scope.

THIS fails:

    try
    {
        cudaCheck(cudaError_t(1));
    } catch ( thrust::system_error const& e )
    {
        std::cout << "caught thrust exception: " << e.what() << std::endl;
    } catch ( ... )
    {
        std::cout << "caught something else" << std::endl;
    }

If the cudaCheck() macro and function are defined in the current compilation unit, it works as desired. However, the header that defines them is used in multiple places across my library.
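One plausible reading of this, not confirmed anywhere in the thread, is that the throwing and catching translation units see different names for Thrust's versioned inline namespace. The literal __CUDA_ARCH_LIST__ in the reported type name suggests the macro was not defined where the exception was thrown, as could happen in a file not compiled by nvcc. The sketch below models that situation in plain C++ with a hypothetical library lib; the namespaces v_arch_a and v_arch_b and all function names are made up for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical library standing in for Thrust. `lib::system_error`
// resolves to whichever versioned namespace is inline.
namespace lib {
inline namespace v_arch_a {   // the version the catching unit sees
struct system_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};
}
namespace v_arch_b {          // the version the throwing unit saw
struct system_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};
}
}

// Models a helper compiled elsewhere: it throws the "other" system_error.
inline void throw_from_other_unit()
{
    throw lib::v_arch_b::system_error("file.cu(42)");
}

// Same handler shape as in the thread: exact type first, then catch-all.
inline std::string catch_exact_type()
{
    try { throw_from_other_unit(); }
    catch (const lib::system_error&) { return "caught lib::system_error"; }
    catch (...) { return "caught something else"; }
    return "nothing thrown";
}

// Catching the shared standard base class matches both versions.
inline std::string catch_common_base()
{
    try { throw_from_other_unit(); }
    catch (const std::runtime_error& e) { return std::string("caught base: ") + e.what(); }
    catch (...) { return "caught something else"; }
    return "nothing thrown";
}

// Quick self-check; call it from main().
inline void check_namespace_model()
{
    assert(catch_exact_type() == "caught something else");
    assert(catch_common_base() == "caught base: file.cu(42)");
}
```

Both structs are spelled system_error, yet they are unrelated types, so the exact-type handler is skipped. Because thrust::system_error derives from std::runtime_error, catching the base class is one possible fallback, at the cost of losing direct access to the error code.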

Possibly related to NVIDIA/cccl GitHub issue #2737, “[BUG]: CUDA Thrust object shared between .cu and .cpp files”.
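If the mismatch does come from a library type whose name varies between translation units, one possible workaround is to throw an exception type owned by the project instead. The sketch below is only an illustration under that assumption: the namespace myproj, the class cuda_error, and the macro MYPROJ_CHECK are hypothetical, and a plain int stands in for cudaError_t (with 0 standing in for cudaSuccess) so the example compiles without CUDA headers.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

namespace myproj {   // hypothetical project namespace

// Exception type owned by the project, so its name does not depend on
// any library's versioned namespace.
class cuda_error : public std::runtime_error {
public:
    cuda_error(int code, const std::string& what_arg)
        : std::runtime_error(what_arg), code_(code) {}
    int code() const noexcept { return code_; }
private:
    int code_;
};

// In real code `code` would be a cudaError_t compared against cudaSuccess.
inline void throw_on_error(int code, const char* file, int line)
{
    if (code != 0) {
        std::ostringstream ss;
        ss << file << "(" << line << "): error code " << code;
        throw cuda_error(code, ss.str());
    }
}

}  // namespace myproj

#define MYPROJ_CHECK(code) myproj::throw_on_error((code), __FILE__, __LINE__)

// Returns the code carried by the exception, or 0 if nothing was thrown.
inline int caught_code(int code)
{
    try { MYPROJ_CHECK(code); }
    catch (const myproj::cuda_error& e) { return e.code(); }
    return 0;
}

// Quick self-check; call it from main().
inline void check_project_exception()
{
    assert(caught_code(1) == 1);
    assert(caught_code(0) == 0);
}
```

A test could then use EXPECT_THROW with myproj::cuda_error, since every translation unit that includes the header sees the same type name.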
