PGCC-S-0094-Illegal type conversion required!

Hello,

I am trying to compile some code with the PGI OpenACC compiler, and I want to run the parallel loop on a Tesla GPU. I am getting the following errors. Any help is highly appreciated.

Thanks,

OS: RED HAT Enterprise Workstation
OpenACC Toolkit 2015

compilation command line:

pgc++ -I/usr/local/include/ -I/opt/Shark/include/  -acc -DNDEBUG -DBOOST_UBLAS_NDEBUG  -O3 -L/home/jooya/Shark/lib -lshark -lboost_serialization -lboost_system -lboost_filesystem -lboost_program_options NN_Shark.cpp -c

Errors:

PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0000-Internal compiler error. BAD sptr in var_refsym       0 (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0000-Internal compiler error. BAD sptr in func_refsym       0 (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0094-Illegal type conversion required (/opt/Shark/include/shark/ObjectiveFunctions/AbstractConstraintHandler.h: 74)
PGCC-S-0000-Internal compiler error. BAD sptr in var_refsym       0 (/opt/Shark/include/shark/Core/Flags.h: 137)
PGCC-S-0000-Internal compiler error. BAD sptr in var_refsym       0 (/opt/Shark/include/shark/Core/Flags.h: 137)
PGCC/x86 Linux 15.7-0: compilation completed with severe errors
make: *** [NN_Shark.o] Error 2

Here is AbstractConstraintHandler.h:

//===========================================================================
/*!
 * 
 *
 * \brief       Base class for constraints.
 * 
 *
 * \author      O.Krause
 * \date        2013
 *
 *
 * \par Copyright 1995-2015 Shark Development Team
 * 
 * <BR><HR>
 * This file is part of Shark.
 * <http://image.diku.dk/shark/>
 * 
 * Shark is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published 
 * by the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 * 
 * Shark is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU Lesser General Public License for more details.
 * 
 * You should have received a copy of the GNU Lesser General Public License
 * along with Shark.  If not, see <http://www.gnu.org/licenses/>.
 *
 */
//===========================================================================
#ifndef SHARK_OBJECTIVEFUNCTIONS_ABSTRACTCONSTRAINTHANDLER_H
#define SHARK_OBJECTIVEFUNCTIONS_ABSTRACTCONSTRAINTHANDLER_H

#include <shark/Core/Exception.h>
#include <shark/Core/Flags.h>

namespace shark{


/// \brief Implements the base class for constraint handling.
///
/// A constraint handler provides information about the feasible region of a constrained optimization problem.
/// In the minimum it checks whether a point is feasible, or what the next fasible point would be.
template<class SearchPointType>
class AbstractConstraintHandler{
public:
	enum Feature {
		CAN_PROVIDE_CLOSEST_FEASIBLE     = 1,	///< The constraint handler can provide a close feasible point to an infeasible one
		IS_BOX_CONSTRAINED = 2,  ///< The constraint handler is an instance of BoxConstraintHandler
		CAN_GENERATE_RANDOM_POINT = 4  ///< The ConstraintHandler can generate a random point inside the feasible region
	};
	SHARK_FEATURE_INTERFACE;
	
	virtual ~AbstractConstraintHandler(){}
		
	/// \brief Returns whether this function can calculate the closest feasible to an infeasible point.
	bool canProvideClosestFeasible()const{
		return m_features & CAN_PROVIDE_CLOSEST_FEASIBLE;
	}
	
	/// \brief Returns whether this function is an instance of BoxConstraintHandler
	bool isBoxConstrained()const{
		return m_features &IS_BOX_CONSTRAINED;
	}
	/// \brief Returns whether this function is an instance of BoxConstraintHandler
	bool canGenerateRandomPoint()const{
		return m_features & CAN_GENERATE_RANDOM_POINT;
	}
	
	/// \brief If supported, generates a random point inside the feasible region.
	virtual void generateRandomPoint( SearchPointType & startingPoint )const {
		SHARK_FEATURE_EXCEPTION(CAN_GENERATE_RANDOM_POINT);
	}
	
	/// \brief Returns true if the point is in the feasible Region.
	///
	/// This function must be implemented by a ConstraintHandler
	virtual bool isFeasible(SearchPointType const&)const = 0;
	virtual void closestFeasible(SearchPointType& )const{
		SHARK_FEATURE_EXCEPTION(CAN_PROVIDE_CLOSEST_FEASIBLE );
	}
	
};
}
#endif

and Flags.h:

//===========================================================================
/*!
 * 
 *
 * \brief       Flexible and extensible mechanisms for holding flags.
 * 
 * 
 *
 * \author      T.Voss
 * \date        2010-2011
 *
 *
 * \par Copyright 1995-2015 Shark Development Team
 * 
 * <BR><HR>
 * This file is part of Shark.
 * <http://image.diku.dk/shark/>
 * 
 * Shark is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published 
 * by the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 * 
 * Shark is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU Lesser General Public License for more details.
 * 
 * You should have received a copy of the GNU Lesser General Public License
 * along with Shark.  If not, see <http://www.gnu.org/licenses/>.
 *
 */
//===========================================================================
#ifndef SHARK_CORE_FLAGS_H
#define SHARK_CORE_FLAGS_H

#include <shark/Core/Exception.h>
#include <shark/Core/ISerializable.h>

namespace shark {


///
/// \brief Flexible and extensible mechanisms for holding flags.
///
/// \par
/// The world's most airbrushed integer ever...
///
/// \par
/// This class encapsulates a flexible mechanism for holding flags.
/// Its templatization makes it possible to base it on any base
/// type, while unsigned int will be the most common choice. This
/// mechanism makes it possible, in principle, to support as many
/// flags as needed. Furthermore, classes may extend the flags
/// defined by their superclass.
///
template<typename Flag>
class TypedFlags : public ISerializable {
public:
	TypedFlags() : m_flags( 0 ) { }
	TypedFlags(TypedFlags const& other) : m_flags(other.m_flags) { }

        virtual ~TypedFlags() {}

	inline TypedFlags<Flag> & operator = ( TypedFlags<Flag> const& rhs ) {
		m_flags = rhs.m_flags;
		return( *this );
	}

	inline void set( Flag f ) {
		m_flags |= f;
	}

	inline void setAll() {
		m_flags |= ~0;
	}

	inline void reset() {
		m_flags = 0;
	}

	inline void reset( Flag f ) {
		m_flags &= ~f;
	}

	inline bool test( Flag f ) const {
		return ( m_flags & f) == (unsigned int)f;
	}

	inline bool operator&( Flag f ) const {
		return ( m_flags & f) == (unsigned int)f;
	}

	inline TypedFlags<Flag> & operator|=( Flag f ) {
		m_flags |= f;
		return( *this );
	}
	
	inline TypedFlags<Flag> & operator|=(const TypedFlags<Flag>& flags ) {
		m_flags |= flags.m_flags;
		return( *this );
	}

	inline TypedFlags<Flag> operator|( Flag f ) const {
		TypedFlags<Flag> copy( *this );
		copy |= f;
		return( copy );
	}
	inline TypedFlags<Flag> operator|(const TypedFlags<Flag>& flags ) const {
		TypedFlags<Flag> copy( *this );
		copy |= flags;
		return copy;
	}

	virtual void read( InArchive & archive )
	{ archive & m_flags; }

	virtual void write( OutArchive & archive ) const
	{ archive & m_flags; }

protected:
	unsigned int m_flags;
};


///
/// \brief Exception indicating the attempt to use a feature which is not supported
///
template<class Feature>
class TypedFeatureNotAvailableException : public Exception {
public:
	TypedFeatureNotAvailableException( Feature feature, const std::string & file = std::string(), unsigned int line = 0 )
	: Exception( "Feature not available", file, line ),
	m_feature( feature ) {}
	TypedFeatureNotAvailableException( const std::string & message, Feature feature, const std::string & file = std::string(), unsigned int line = 0 )
	: Exception( message, file, line ),
	m_feature( feature ) {}

	Feature feature() const {
		return m_feature ;
	}
protected:
	Feature m_feature;
};

}

namespace boost {
namespace serialization {

template< typename T >
struct tracking_level< shark::TypedFlags<T> > {
    typedef mpl::integral_c_tag tag;
    BOOST_STATIC_CONSTANT( int, value = track_always );
};

}
}

#define SHARK_FEATURE_INTERFACE \
typedef TypedFlags<Feature> Features;\
protected:\
Features m_features;\
public:\
const Features & features() const {\
	return( m_features );\
}\
virtual void updateFeatures(){}\
typedef TypedFeatureNotAvailableException<Feature> FeatureNotAvailableException

/// Throws an Exception when called.
/// This macro should be used in default implementations of the interface.
/// This define also checks first whether the feature is set to true inside the class.
/// If this is the case then we have encountered a programming mistake - so we assert instead.
#define SHARK_FEATURE_EXCEPTION(FEATURE) \
{assert(!(this->features()&FEATURE));\
throw FeatureNotAvailableException("Class does not support Feature " #FEATURE, FEATURE,__FILE__, __LINE__);}
/// Same as SHARK_FEATURE_EXCEPTION, but used when called from a derived class.
/// Assumes that a typedef "base_type" for the Baseclass exists
#define SHARK_FEATURE_EXCEPTION_DERIVED(FEATURE) \
{assert(!(this->features()&base_type::FEATURE));\
throw typename base_type::FeatureNotAvailableException("Class does not support Feature " #FEATURE, base_type::FEATURE,__FILE__, __LINE__);}

/// Checks whether the feature is available, if not, it throws an exception.
#define SHARK_FEATURE_CHECK(FEATURE)\
if(!(this->features()&base_type::FEATURE)){SHARK_FEATURE_EXCEPTION_DERIVED(FEATURE);}
#endif // SHARK_CORE_FLAGS_H

Hi Ali,

Unfortunately there’s not enough information to tell what’s causing this error, though it’s most likely a compiler error. I’ve only seen one other similar error but that was fixed back in the 14.4 release so this is different.

I’ve downloaded Shark and am in the process of building it. Can you either post or send to PGI Customer Service (trs@pgroup.com) your “NN_Shark.cpp” file so I can try and recreate the error here?

Thanks,
Mat

I just emailed the file to the PGI customer service. Thank you for your time.

Thanks Ali.

The “Illegal type conversion” error is coming from the “throw”. Exception handling is not supported in device code, so the compiler gives this admittedly unhelpful error. I was able to work around it by setting these macros to nothing in Flags.h:

#ifdef _OPENACC
#define SHARK_FEATURE_EXCEPTION(FEATURE)
#define SHARK_FEATURE_EXCEPTION_DERIVED(FEATURE)
#define SHARK_FEATURE_CHECK(FEATURE)
#else
// ... the original definitions of the three macros stay here ...
#endif
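As a self-contained sketch of the same guard pattern: DEMO_FEATURE_EXCEPTION and DemoHandler below are hypothetical stand-ins, not Shark's code. When _OPENACC is defined the macro expands to nothing; otherwise it keeps throwing as before.

```cpp
#include <stdexcept>
#include <string>

// DEMO_FEATURE_EXCEPTION is a hypothetical stand-in for Shark's
// SHARK_FEATURE_EXCEPTION; it is not the library's macro.
#ifdef _OPENACC
// OpenACC build: expand to nothing so that no "throw" reaches device code.
#define DEMO_FEATURE_EXCEPTION(FEATURE)
#else
// Host-only build: keep the throwing behaviour.
#define DEMO_FEATURE_EXCEPTION(FEATURE) \
    { throw std::runtime_error(std::string("Class does not support Feature ") + #FEATURE); }
#endif

// Hypothetical class with a default implementation of an unsupported feature.
struct DemoHandler {
    void closestFeasible(double& /*point*/) const {
        DEMO_FEATURE_EXCEPTION(CAN_PROVIDE_CLOSEST_FEASIBLE);
    }
};

// Returns true if the call threw, false if the macro expanded to nothing.
bool demoCallThrows() {
    DemoHandler handler;
    double point = 0.0;
    try {
        handler.closestFeasible(point);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

In a host-only build demoCallThrows() returns true. Note that a guard on _OPENACC also removes the exception from the host code of any file compiled with OpenACC enabled, so the feature check is silently skipped there.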

Beyond that, we don’t support “long double” in device code, so I declared Max_Error to be a “double”. Also we have a known issue with unsigned int being used as a loop bounds variable, so I changed “iterations” from unsigned to int. Now to the port.
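The two type changes might look roughly like the sketch below. NN_Shark.cpp is not shown in this thread, so the function and the loop body are invented for illustration; only the names Max_Error and iterations come from the post.

```cpp
// Hypothetical loop; only the two declarations mirror the changes described
// above. Everything else is made up for the example.
double run_iterations(int iterations) {    // was: unsigned
    double Max_Error = 0.0;                // was: long double
    #pragma acc parallel loop reduction(max:Max_Error)
    for (int i = 0; i < iterations; ++i) { // signed loop bound and index
        double err = 1.0 / (i + 1.0);
        if (err > Max_Error) Max_Error = err;
    }
    return Max_Error;
}
```

Without OpenACC enabled the pragma is ignored and the loop runs on the host, which makes it easy to check that the narrower type does not change the result in a way that matters.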

First, the “data” and “test” variables are container classes that, as far as I can tell, eventually get turned into std::vector. The compiler can’t automatically copy aggregate data types with dynamic data members to the device. It can do a shallow copy (i.e. just the pointers), but without any information about the shape or size of the data structure, it can’t duplicate the structure on the device.
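To make the shallow-versus-deep distinction concrete, here is a sketch of a manual deep copy for a hypothetical struct (DemoVector is not a Shark type). It assumes the compiler accepts member array sections in data clauses and fixes up the device copy of the pointer when the parent struct is already present; how much of this is supported differs between compiler versions.

```cpp
// Hypothetical aggregate with one dynamically allocated member.
struct DemoVector {
    double* data;
    int     n;
};

double sum_on_device(DemoVector& v) {
    double total = 0.0;
    // Shallow copy: the struct itself (the pointer value and n).
    #pragma acc enter data copyin(v)
    // Deep copy: the array behind the pointer, with its shape spelled out.
    #pragma acc enter data copyin(v.data[0:v.n])
    #pragma acc parallel loop present(v) reduction(+:total)
    for (int i = 0; i < v.n; ++i) {
        total += v.data[i];
    }
    // Release in the reverse order: the member array first, then the struct.
    #pragma acc exit data delete(v.data[0:v.n])
    #pragma acc exit data delete(v)
    return total;
}
```

The second copyin is the part the compiler cannot work out by itself: nothing in the type says how many elements data points to. For a std::vector buried inside library classes there is no accessible pointer and length to name at all.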

The OpenACC committee is looking at ways to help solve this problem, but a solution isn’t likely before the OpenACC 3.0 standard. My talk at GTC2015 (https://www.youtube.com/watch?v=rWLmZt_u5u4) discusses OpenACC C++ Class Data Management, which hopefully helps illustrate the issue.

The current solution for this is to use CUDA Unified Memory (-ta=tesla:managed) so you don’t need to manage dynamic data yourself and instead let the CUDA runtime do it for you. The current UM is experimental and does have some significant caveats. Most notable is that it’s only available for dynamic data and your program is limited to the amount of memory on the device.
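A minimal sketch of what this looks like in source, assuming a build with “-acc -ta=tesla:managed” (scaled_sum is a made-up function). The array comes from “new”, so no data clause is given and the runtime is expected to move it.

```cpp
// Hypothetical example: a heap array used in a compute region with no
// data clauses, relying on managed memory to migrate it.
double scaled_sum(int n, double factor) {
    double* x = new double[n];             // dynamic data: eligible for UM
    for (int i = 0; i < n; ++i) x[i] = static_cast<double>(i);

    #pragma acc parallel loop              // no copy/copyin clause for x
    for (int i = 0; i < n; ++i) x[i] *= factor;

    double total = 0.0;                    // read the results back on the host
    for (int i = 0; i < n; ++i) total += x[i];
    delete[] x;
    return total;
}
```

Without the managed option, the same loop would typically need an explicit clause such as copy(x[0:n]), because the compiler cannot infer the size of the array from the pointer.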

I tried compiling your program with “-ta=tesla:managed” but hit a problem with Boost. Boost has a call to “std::free”, but in order to support UM, we rename “free” to “free_managed”. Since there isn’t a “std::free_managed”, you get a syntax error. The simple workaround is to change “std::free” to “free” so the C version is picked up.

The next step will be to go through all the Shark routines that are called from the device and add an OpenACC “routine seq” directive to them. The compiler can automatically create device routines for the ones whose definitions it can see (such as templates), but not for those in code compiled in a different module.

Hope this helps,
Mat

% pgc++ -I/local/home/colgrove/Shark/include -I/local/home/colgrove/Shark/build/include -I/home/colgrove/Boost/boost_1_58_0/ -L/home/colgrove/Boost/boost_1_58_0/ -L/local/home/colgrove/Shark/build/lib NN_Shark.cpp -lshark -DBOOST_NO_ALIGNMENT -w -c -acc -Minfo=accel -ta=tesla:managed -V15.7
operator new (unsigned long, void *):
      1, include "DataDistribution.h"
          40, include "Dataset.h"
               48, include "foreach.hpp"
                    73, include "end.hpp"
                         25, include "iterator.hpp"
                              20, include "mutable_iterator.hpp"
                                   23, include "iterator_traits.hpp"
                                         8, include "iterator.hpp"
                                             11, include "iterator"
                                                  65, include "ostream"
                                                       40, include "ios"
                                                            43, include "ios_base.h"
                                                                 43, include "locale_classes.h"
                                                                      42, include "string"
                                                                           43, include "allocator.h"
                                                                                48, include "c++allocator.h"
                                                                                     34, include "new_allocator.h"
                                                                                          33, include "new"
                                                                                              101, Generating implicit acc routine seq
operator delete (void *, void *):
      1, include "DataDistribution.h"
          40, include "Dataset.h"
               48, include "foreach.hpp"
                    73, include "end.hpp"
                         25, include "iterator.hpp"
                              20, include "mutable_iterator.hpp"
                                   23, include "iterator_traits.hpp"
                                         8, include "iterator.hpp"
                                             11, include "iterator"
                                                  65, include "ostream"
                                                       40, include "ios"
                                                            43, include "ios_base.h"
                                                                 43, include "locale_classes.h"
                                                                      42, include "string"
                                                                           43, include "allocator.h"
                                                                                48, include "c++allocator.h"
                                                                                     34, include "new_allocator.h"
                                                                                          33, include "new"
                                                                                              105, Generating implicit acc routine seq
PGCC-S-0155-Procedures called in a compute region must have acc routine information: shark::IRpropPlus::step(const shark::AbstractObjectiveFunction<shark::blas::vector<double>, double> &) (NN_Shark.cpp: 99)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (NN_Shark.cpp: 75)
main:
     75, Accelerator region ignored
     99, Accelerator restriction: call to 'shark::IRpropPlus::step(const shark::AbstractObjectiveFunction<shark::blas::vector<double>, double> &)' with no acc routine information
PGCC/x86 Linux 15.7-0: compilation completed with severe errors

Thanks Mat. A few more questions:

1- I am using NVIDIA OpenACC Toolkit. Does it support -ta=tesla:managed option?

2- I am trying to run only a small section of the code on GPU:

for(int step = 0; step != numberOfSteps; ++step)
	optimizer.step(errorFunction);

This single line of code involves a lot of function calls (Shark and Boost), objects, and large, complicated classes with many virtual and friend functions. I was wondering what the best way to deal with this is. I mean, does -ta=tesla:managed take care of copying the objects into device memory and back into main memory? Or do I need to modify the classes and add OpenACC pragmas as described in https://www.pgroup.com/lit/articles/insider/v6n2a1.htm?

I appreciate your help.

1- I am using NVIDIA OpenACC Toolkit. Does it support -ta=tesla:managed option?

It should. The toolkit compilers are the same as the full version with only a few limitations on the host side.

I mean, does -ta=tesla:managed take care of copying the objects into device memory and back into main memory?

Yes, for all dynamic data (i.e. data allocated via new or malloc). Static data still needs to be managed manually.

For example, if you declare a class object, “myclass A”, then you will need to put “A” in an OpenACC data region. However, any dynamically allocated data members in “A” will be managed. If you dynamically allocate “A”, “myclass * A=new myclass”, then all of “A” will be managed.
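The two cases can be sketched as follows, assuming a build with “-ta=tesla:managed”. The members and constructor of myclass are invented here, since the post only names the type.

```cpp
// Hypothetical class with one dynamically allocated member.
struct myclass {
    int     n;
    double* values;   // allocated with new, so it is managed under UM
    explicit myclass(int count) : n(count), values(new double[count]) {
        for (int i = 0; i < n; ++i) values[i] = 1.0;
    }
    ~myclass() { delete[] values; }
    myclass(const myclass&) = delete;
    myclass& operator=(const myclass&) = delete;
};

// Case 1: the object itself is not dynamically allocated, so it is placed
// in a data region; only its "values" member is managed.
double sum_static_object() {
    myclass A(8);
    double total = 0.0;
    #pragma acc data copyin(A)
    {
        #pragma acc parallel loop reduction(+:total)
        for (int i = 0; i < A.n; ++i) total += A.values[i];
    }
    return total;
}

// Case 2: the object is allocated with new, so the whole object is managed
// and no data region is written.
double sum_dynamic_object() {
    myclass* A = new myclass(8);
    double total = 0.0;
    #pragma acc parallel loop reduction(+:total)
    for (int i = 0; i < A->n; ++i) total += A->values[i];
    delete A;
    return total;
}
```

The copyin(A) in the first case is only a shallow copy of the object; it works here because the pointer it carries refers to managed memory.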

“Managed” only takes care of the data movement. You will still need to use the “routine” directive to tell the compiler to create device callable methods/routines. If the method’s definition is visible to the compiler (such as a template) when it’s used in an OpenACC compute region or another device routine, then the compiler will attempt to implicitly create a device routine. If the definition is not visible, then you will need to manually add “#pragma acc routine seq” to the method’s definition and prototype. This includes all routines and methods down the call tree.
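As a small illustration of the manual case, the sketch below puts the directive on both the prototype and the definition. demo_step and apply_steps are made-up names; in a real project the prototype would sit in a header and the definition in a separately compiled file.

```cpp
// Prototype, as it would appear in a header seen by the calling code.
#pragma acc routine seq
double demo_step(double x);

// Definition, possibly in another source file: same directive again.
#pragma acc routine seq
double demo_step(double x) {
    return 0.5 * x + 1.0;
}

// Caller: each loop iteration calls the sequential device routine.
void apply_steps(double* v, int n) {
    #pragma acc parallel loop copy(v[0:n])
    for (int i = 0; i < n; ++i) {
        v[i] = demo_step(v[i]);
    }
}
```

Anything demo_step called in turn would need the same treatment, which is why a call such as optimizer.step(errorFunction) that reaches deep into a library is a lot of work to port.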

- Mat