Strange Compilation Problems

SchnellCoder · January 26, 2019, 3:02am

I recently put together a computer at home and installed Kubuntu. I followed the instructions and installed the CUDA Toolkit. I was able to compile the samples and run the one called out in the install and all seemed well.

Previously, my CUDA programming experience was limited to Microsoft Visual Studio on Windows. Having switched to Linux at home, I began using Nsight Eclipse Edition version 10.0, and I’ve noticed some peculiarities: I cannot seem to use cudaError_t as a return type in template class methods, and I couldn’t use cudaEvent_t as an input argument in a class method. If I write a free function in a .cu file that returns cudaError_t, that seems to be fine. The problem only seems to appear when these types are used in classes. I’ll provide examples below:

CudaTimer.h

#ifndef CUDA_TIMER_H
	#define CUDA_TIMER_H

	#include <cuda_runtime.h>

	class CudaTimer
	{
		//------------------------------------------------------------------------------------------
		//  Attributes
		//------------------------------------------------------------------------------------------
		private:
			bool m_timeSampled;
			bool m_timerStarted;
			cudaError_t m_lastError;
			cudaEvent_t * m_pStartEvent;
			cudaEvent_t * m_pStopEvent;
			float m_timeInMilliseconds;

		//------------------------------------------------------------------------------------------
		//  Constructor
		//------------------------------------------------------------------------------------------
		public:
			CudaTimer();

		//------------------------------------------------------------------------------------------
		//  Destructor
		//------------------------------------------------------------------------------------------
		public:
			~CudaTimer();

		//------------------------------------------------------------------------------------------
		//  Operations
		//------------------------------------------------------------------------------------------
		private:
			bool CreateEvent(cudaEvent_t ** ppCudaEvent);

		public:
			float GetTime(float & timeInMilliseconds);
			bool StartTimer();
			bool StopTimer();
	};

#endif

CudaTimer.cpp

#include "CudaTimer.h"

bool CudaTimer::CreateEvent(cudaEvent_t ** ppCudaEvent)
{
	if (nullptr == ppCudaEvent)
		return false;

	return true;
}

The result is “Member declaration not found” regarding CudaTimer::CreateEvent. As a test, if I change the input argument to void or some C primitive type, there is no error.
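
For comparison, here is a minimal sketch of the same declaration/definition pattern that needs no CUDA headers. FakeEvent_st, FakeEvent_t and TimerSketch are hypothetical stand-ins, not part of the original project; the pointer-to-opaque-struct typedef is meant to mirror how cudaEvent_t is declared in driver_types.h. If the editor accepts this but still flags the real class, that would suggest the editor is failing to resolve the CUDA types rather than the signatures being mismatched.

```cpp
// Hypothetical stand-in for cudaEvent_t: a pointer to an opaque struct.
struct FakeEvent_st;
typedef FakeEvent_st * FakeEvent_t;

class TimerSketch
{
	public:
		// Declaration, as it would appear in the header.
		// (Public here only so the sketch can be called directly.)
		bool CreateEvent(FakeEvent_t ** ppEvent);
};

// Out-of-class definition, as it would appear in the .cpp file.
// The parameter type must match the declaration exactly.
bool TimerSketch::CreateEvent(FakeEvent_t ** ppEvent)
{
	if (nullptr == ppEvent)
		return false;

	return true;
}
```

This compiles with a plain C++11 compiler, so the pattern itself is valid C++.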

CudaAdd.cuh

#ifndef CUDA_ADD_H_
	#define CUDA_ADD_H_

	#include <cuda_runtime.h>  // cudaDeviceSynchronize, cudaGetLastError
	#include <driver_types.h>

	#include "CudaBuffer.h"

	//----------------------------------------------------------------------------------------------
	//  The following declarations are for the CUDA host wrappers to the device calls.
	//----------------------------------------------------------------------------------------------
	template <typename OPERAND1, typename OPERAND2, typename RESULT>
	bool CPU_AddCast(OPERAND1 * pOperand1,
	                 OPERAND2 * pOperand2,
	                 RESULT * pResult,
	                 int elements,
	                 int threads);
	template <typename OPERAND1, typename OPERAND2, typename RESULT>
	bool CPU_CastAdd(OPERAND1 * pOperand1,
	                 OPERAND2 * pOperand2,
	                 RESULT * pResult,
	                 int elements,
	                 int threads);

	template <typename OPERAND1, typename OPERAND2, typename RESULT>
	class CudaAdd
	{
		//------------------------------------------------------------------------------------------
		//  Operations
		//------------------------------------------------------------------------------------------
		public:
			static cudaError_t AddCast(OPERAND1 * pOperand1,
			                           OPERAND2 * pOperand2,
			                           RESULT * pResult,
			                           size_t elements,
			                           int threads);
			static cudaError_t CastAdd(OPERAND1 * pOperand1,
			                           OPERAND2 * pOperand2,
			                           RESULT * pResult,
			                           size_t elements,
			                           int threads);
	};

	inline cudaError_t DoSomething()  // inline: defined in a header
	{
		return cudaSuccess;
	}

	//----------------------------------------------------------------------------------------------
	//  Purpose:  This method adds the first n elements (specified by the value of elements) of
	//            pOperand1 and pOperand2 and stores the results into pResult.  Each element of
	//            pOperand1 is added to the corresponding element of pOperand2; the sum is then
	//            cast to type RESULT and stored into pResult.
	//
	//  Inputs:   pOperand1 - specifies values to use for the first operand of the addition
	//                        operation.
	//            pOperand2 - specifies values to use for the second operand of the addition
	//                        operation.
	//            elements - specifies the first n elements to process in pOperand1, pOperand2 and
	//                       pResult.
	//            threads - specifies the number of threads per block to request when performing the
	//                      addition operation.
	//
	//  Outputs:  pResult - is updated with the results of adding each element of pOperand1 and
	//                      pOperand2.
	//
	//  Returns:  cudaError_t - specifies the success of the operation.
	//                  cudaSuccess - if the operation completed successfully.
	//                  !cudaSuccess - if the operation failed.
	//
	//  Author:   Mr. X
	//
	//  Created:  January 24, 2019
	//----------------------------------------------------------------------------------------------
	template <typename OPERAND1, typename OPERAND2, typename RESULT>
	cudaError_t CudaAdd<OPERAND1, OPERAND2, RESULT>::AddCast(OPERAND1 * pOperand1,
	                                                         OPERAND2 * pOperand2,
	                                                         RESULT * pResult,
	                                                         size_t elements,
	                                                         int threads)
	{
		//------------------------------------------------------------------------------------------
		//  Perform the addition and cast operation on the given buffers.
		//------------------------------------------------------------------------------------------
		CPU_AddCast<OPERAND1, OPERAND2, RESULT>(pOperand1,
		                                        pOperand2,
		                                        pResult,
		                                        elements,
		                                        threads);

		//------------------------------------------------------------------------------------------
		//  Wait for the addition and cast operation to finish.
		//------------------------------------------------------------------------------------------
		cudaDeviceSynchronize();

		//------------------------------------------------------------------------------------------
		//  Check for an error and return the error; will be cudaSuccess if there were no errors.
		//------------------------------------------------------------------------------------------
		return cudaGetLastError();
	}

	//----------------------------------------------------------------------------------------------
	//  Purpose:  This method casts the first n elements (specified by the value of elements) of
	//            pOperand1 and pOperand2 to type RESULT, then adds their values together and stores
	//            the results into pResult.
	//
	//  Inputs:   pOperand1 - specifies values to use for the first operand of the addition
	//                        operation.
	//            pOperand2 - specifies values to use for the second operand of the addition
	//                        operation.
	//            elements - specifies the first n elements to process in pOperand1, pOperand2 and
	//                       pResult.
	//            threads - specifies the number of threads per block to request when performing the
	//                      addition operation.
	//
	//  Outputs:  pResult - is updated with the results of adding each element of pOperand1 and
	//                      pOperand2.
	//
	//  Returns:  cudaError_t - specifies the success of the operation.
	//                  cudaSuccess - if the operation completed successfully.
	//                  !cudaSuccess - if the operation failed.
	//
	//  Author:   Mr. X
	//
	//  Created:  January 24, 2019
	//----------------------------------------------------------------------------------------------
	template <typename OPERAND1, typename OPERAND2, typename RESULT>
	cudaError_t CudaAdd<OPERAND1, OPERAND2, RESULT>::CastAdd(OPERAND1 * pOperand1,
	                                                         OPERAND2 * pOperand2,
	                                                         RESULT * pResult,
	                                                         size_t elements,
	                                                         int threads)
	{
		//------------------------------------------------------------------------------------------
		//  Perform the cast and addition operation on the given buffers.
		//------------------------------------------------------------------------------------------
		CPU_CastAdd<OPERAND1, OPERAND2, RESULT>(pOperand1,
					                            pOperand2,
					                            pResult,
					                            elements,
					                            threads);

		//------------------------------------------------------------------------------------------
		//  Wait for the cast and addition operation to finish.
		//------------------------------------------------------------------------------------------
		cudaDeviceSynchronize();

		//------------------------------------------------------------------------------------------
		//  Check for an error and return the error; will be cudaSuccess if there were no errors.
		//------------------------------------------------------------------------------------------
		return cudaGetLastError();
	}

#endif

This also results in “Member declaration not found.” If I substitute int for the cudaError_t return type, the problem goes away. It isn’t that I can’t use CUDA primitives at all; I just can’t seem to declare a method that uses them in a .h or .cuh file and define it in the accompanying .cpp file.
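
The template case can be reduced the same way. The sketch below is a hypothetical, CUDA-free reduction: FakeError_t stands in for cudaError_t (which is an enum in the CUDA headers), AddSketch stands in for CudaAdd, and a host-side loop replaces the kernel wrapper. It shows a static template member declared in the class and defined out of class in the same header, returning the enum type.

```cpp
#include <cstddef>

// Hypothetical stand-in for cudaError_t.
enum FakeError_t { fakeSuccess = 0, fakeErrorInvalidValue = 1 };

template <typename OPERAND1, typename OPERAND2, typename RESULT>
class AddSketch
{
	public:
		static FakeError_t AddCast(const OPERAND1 * pOperand1,
		                           const OPERAND2 * pOperand2,
		                           RESULT * pResult,
		                           std::size_t elements);
};

// Out-of-class definition of a template member. It stays in the header so that
// every translation unit that instantiates it can see the body.
template <typename OPERAND1, typename OPERAND2, typename RESULT>
FakeError_t AddSketch<OPERAND1, OPERAND2, RESULT>::AddCast(const OPERAND1 * pOperand1,
                                                           const OPERAND2 * pOperand2,
                                                           RESULT * pResult,
                                                           std::size_t elements)
{
	if (nullptr == pOperand1 || nullptr == pOperand2 || nullptr == pResult)
		return fakeErrorInvalidValue;

	// Add first, then cast the sum to RESULT (host loop in place of a kernel launch).
	for (std::size_t i = 0; i < elements; ++i)
		pResult[i] = static_cast<RESULT>(pOperand1[i] + pOperand2[i]);

	return fakeSuccess;
}
```

If this reduced version is accepted while the cudaError_t version is flagged, the return type resolution is the likely difference, not the template syntax.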

I’ve tried .h, .cuh and .cpp files.

It is as though the compiler is only aware of the CUDA primitives in certain situations.
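
One possible way to tell the editor's code analysis apart from the real build: Eclipse CDT, which Nsight Eclipse Edition is built on, defines the macro __CDT_PARSER__ while its indexer parses a file, and real compilers do not. The sketch below relies on that assumption; kParsedByIndexer and ParsedByIndexer are hypothetical names used only for illustration.

```cpp
// Assumption: __CDT_PARSER__ is defined by the Eclipse CDT indexer and by no real compiler.
#ifdef __CDT_PARSER__
	// Only the IDE's indexer takes this branch.
	constexpr bool kParsedByIndexer = true;
#else
	// Real compilers (nvcc, g++, ...) take this branch.
	constexpr bool kParsedByIndexer = false;
#endif

// Hypothetical helper: reports which branch was taken when this file was processed.
constexpr bool ParsedByIndexer()
{
	return kParsedByIndexer;
}
```

In an actual build this returns false. An error that shows up only in the editor and not in the build console output would then point at the indexer rather than the compiler.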

Thanks for any help.

kjalaludeen · March 26, 2019, 10:39pm

Hi SchnellCoder,

Can you try changing the file types from .h/.cpp to .cuh/.cu types?
