Calling C function in Device code.

Binayak · March 4, 2015, 9:58am

Is there any way to call a simple c function in global function?? If someone have some idea could please help me out. Otherwise I have change a lot …

little_jimmy · March 4, 2015, 2:06pm

“Is there any way to call a simple c function in global function??”

it really depends on what you mean by ‘simple c function’

you can call a device function from within a global function, but ‘terms and conditions’ apply
it would be a device function, within the context of the calling kernel, etc

what is it that you want/ need to do?

Binayak · March 16, 2015, 7:57am

Thank you…

I have a header file for Simplex. So Is there any way out to call the simplex constructor in device function directly or all I have to change all related functions in the header file itself to device function…???

My main aim is to run two or more simplex in parallel…!!

AlexIanevski · March 16, 2015, 8:41am

If you have no qualifiers (or host, which is the same) , it’s like a normal method and you can call it from the host side.
If you qualify your constructor with device, you are be able to call it from device side.
If you qualify your constructor with both qualifiers, you’ll be able to call it on both sides.

little_jimmy · March 16, 2015, 11:18am

“I have a header file for Simplex.”

"My main aim is to run two or more simplex in parallel…!! "

to be clear, by simplex you mean what exactly?

are you referring to the simplex algorithm/ method as part of LP?

Binayak · March 16, 2015, 11:55am

Yes I am referring Simplex algorithm as a whole. I want make an parallel LPP solver library using simplex method. So for that I have a simplex.h file.

Binayak · March 16, 2015, 11:58am

Thank you AlexIanevski

If you have no qualifiers (or host, which is the same) , it’s like a normal method and you can call it from the host side.
If you qualify your constructor with device, you are be able to call it from device side.
If you qualify your constructor with both qualifiers, you’ll be able to call it on both sides.

Yes …If I want to qualify the constructor with device, then for adding “device” this line to my header file, should I also be need to change the .h file to .cuh or what else I need to do??

little_jimmy · March 16, 2015, 12:30pm

i do not see how you would be able to simply take the simplex header file, and change a few qualifiers, in order to end up with a parallel platform implementation of what i take to be largely a serial platform implementation

you would lose out on a lot of performance gain that way too, i would think

noting 3 typical parallel classifications:
a) embarrassingly parallel; b) coarse grained; c) fine grained

you could first of all have thread blocks of a kernel jointly calculate and work on individual steps of (an iteration of) the simplex algorithm - in general, the next vertex to step to

thereafter, you could implement a next tier of parallelization, by implementing a number of (similar/ identical) kernels, each with a different starting point - in essence, both implementing a grain parallelization as well as an embarrassingly parallel implementation
the parallel kernels can signal via global memory when they have found the optimal solution, such that the kernels know when to terminate
this way, you may arrive at the global solution quicker
if the simplex algorithm executes in polynomial time, n parallel executions thereof may finish execution in (polynomial time / n) under ideal conditions, also provided that you have spare processing power (my take is that on a gpu workhorse, you would)

Binayak · March 16, 2015, 1:07pm

Ok…

Can you then in this regard help me how to use ublas library for matrices and vectors???

little_jimmy · March 16, 2015, 1:26pm

[b]Maybe i am missing what you are saying

you were referring to a header file, and mentioned making a parallel lp solver library, so i took it that you have a host simplex algorithm implementation as reference

i would also argue that the simplex algorithm includes matrices and vectors, but is not confined to mere matrices and vectors

adding additional layers of parallelization would of course require a deep(er) understanding of the algortihm as prerequisite
[/b]

little_jimmy · March 16, 2015, 3:16pm

could you affirm the current simplex ‘source’/ reference you are working from, such that it is easier to follow exactly where you are in terms of your implementation
i am currently under the impression that you are working from a host simplex implementation; could you confirm or refute this

if that is the case, representing the lp feasible region in matrix form is well documented
also, moving such a representation to standard form is also well documented
selecting the next pivot as part of the simplex algorithm is also documented, generally consists of matrix operations, and may be something you can utilize the said libraries for
the libraries are well documented

as a more advanced level, to move to multiple parallel implementations, you need duplicate the feasible region standard matrix representations, and need to select - or force - different starting pivots per implementation, to get different starting points
and you would likely need to use streams to implement such a design

Binayak · March 17, 2015, 4:40am

The following is my simplex.hpp

//#include “simplex.h”
#include
#include
#include
#include “math/matrix.h”

/*
bool maybe_equal(double x, double y){
double tol = 1e6; // precision op to 6 decimal places
double xt= 1= xtol;
double yt = ytol;
} */

template
simplex::simplex() {
max_iters=50;
dim = 0;
}
template
simplex::~simplex() {
//dtor
}
template
void simplex::pivot(int e, int lv) {
std::cout << “pivot called with entering var =” << e << "leaving var = "
<< lv << std::endl;
// Compute the coefficeints of the equation for new basic variables x_e
math::matrix Anew(As.size1(), As.size2());
std::vector bnew(bs.size(), 0);
//std::cout << "As(lv,e) = " << As(lv,e) << std::endl;

assert(As(lv, e) != 0);
bnew[e] = bs[lv] / As(lv, e);
std::cout << " bs[lv] = " << bs[lv] << " As[lv,e] = " << As(lv, e)
		<< "bnew[e] = " << bnew[e] << std::endl;

typename std::set<int>::iterator it, it1;
for (it = N.begin(); it != N.end(); it++) {
	if (*it == e)
		continue;
	int j = *it;
	Anew(e, j) = As(lv, j) / As(lv, e);
	//	std::cout<<" Anew(e, j) = " << Anew(e, j) <<std::endl;
}
Anew(e, lv) = 1 / As(lv, e);

// Compute the coefficients of the remaining constraints
for (it = B.begin(); it != B.end(); it++) {
	int i = *it;
	if (i == lv) {
		for (int entire_row = 0; entire_row <= B.size() + N.size();
				entire_row++)	// Amit
			Anew(lv, entire_row) = 0;//Amit entire row for the leaving variable coefficient set to 0
		continue;
	}
	bnew[i] = bs[i] - As(i, e) * bnew[e];
	for (it1 = N.begin(); it1 != N.end(); it1++) {
		int j = *it1;
		if (j == e) {
			Anew(i, e) = 0;	//AMIT for the same entering variable, coefficient is set to 0
			continue;
		}
		Anew(i, j) = As(i, j) - As(i, e) * Anew(e, j);
		std::cout << " Anew(i, j) = " << Anew(i, j) << std::endl;
	}
	Anew(i, lv) = -1 * As(i, e) * Anew(e, lv);
	Anew(i, i) = 0;	//AMIT for the same basic variable, coefficient is set to 0
	std::cout << " Anew(i, lv) AMIT = " << Anew(i, lv) << std::endl;
}

std::cout << "state of the constraints matrix Amit Check here\n";
for (int i = 1; i < Anew.size1(); i++) {
	for (int j = 1; j < Anew.size2(); j++)
		std::cout << Anew(i, j) << " ";
	std::cout << std::endl;
}

// Compute the objective function
obj_val = obj_val + cs[e] * bnew[e];
//std::vector<T> cnew(dim + 1, 0);
std::vector<T> cnew(cs.size(), 0);	//Code tried by Amit
for (it1 = N.begin(); it1 != N.end(); it1++) {
	int j = *it1;
	if (j == e)
		continue;
	/*std::cout << "cs at " << j << " = " << cs[j] << std::endl;
	 std::cout << "cs at e =" << cs[e] << std::endl;
	 std::cout << "As at e,j = " << As(e,j) << std::endl; */
	cnew[j] = cs[j] - cs[e] * Anew(e, j);
}
cnew[lv] = -1 * cs[e] * Anew(e, lv);
// Compute new sets of basic and nonbasic variables
cs = cnew;
As = Anew;
bs = bnew;
N.erase(e);
N.insert(lv);
B.erase(lv);
B.insert(e);

}
/**

Converts the LP problem from Standard form to Slack form, initialises the data structures
to be used by the simplex algorithm. The slack form is such that the initial basic solution
is feasible. The function stops if the lp is infeasible or unbounded with a proper message.
*/
template
void simplex::initialize() {
obj_val = 0;
int M = A.size1(); // Number of constrains in the LP.
int D = A.size2(); // Number of variables of the LP.
dim = M + D;
// set variable 1 to N as the non-basic variables
for (unsigned int i = 1; i <= D; i++)
N.insert(i);
//set variables N+1 to M as the basic variables (the slack variables)
for (unsigned int i = 1; i <= M; i++)
B.insert(D + i);
// A larger coefficient matrix, constants vector, objective vector defined
As = math::matrix(dim + 1, dim + 1);
bs = std::vector(dim + 1, 0);
cs = std::vector(dim + 1, 0);
// populate the members of the As matrix, bs vector
typename std::set::iterator it;
assert(B.size() == b.size());
// As,bs,cs are initialised as 1 index arrays from A,b,c
int i = 0;
for (it = B.begin(); it != B.end(); it++, i++) {
int m = *it;
for (unsigned int j = 0; j < D; j++) {
As(m, j + 1) = A(i, j);
}
bs[m] = b[i];
}
for (it = N.begin(); it != N.end(); it++) {
cs[*it] = c[*it - 1];
}
}
template
void simplex::display_state() {
std::cout << “state of the simplex :\n”;
std::set::iterator it1;
std::cout << “Non basic variables:\n”;
for (it1 = N.begin(); it1 != N.end(); it1++) {
std::cout << *it1 << std::endl;
}
std::cout << “Basic variables:\n”;
for (it1 = B.begin(); it1 != B.end(); it1++) {
std::cout << *it1 << std::endl;
}
std::cout << “\nSize of the Total Variable : “<<cs.size()<<”\n”;
std::cout << “objective function coefficients:\n”;
for (int i = 1; i < cs.size(); i++)
std::cout << cs[i] << std::endl;

std::cout << “constants:\n”;
for (int i = 1; i < bs.size(); i++)
std::cout << bs[i] << std::endl;

std::cout << “state of the constraints matrix\n”;
for (int i = 1; i < As.size1(); i++) {
for (int j = 1; j < As.size2(); j++)
std::cout << As(i, j) << " ";
std::cout << std::endl;
}
std::cout << "Objective function value at current iteration = " << obj_val
<< std::endl;
std::cout
<< “============================================================= \n”;
}

template
std::vector simplex::solve() {
int n = N.size() + B.size();
std::vector x(n, 0);
/* //Amit:: Debug
for (unsigned int i = 1; i <= n; i++) {
std::cout << "Debug Amit :: delta at Creation " << i << " = " << delta[i] << std::endl;
}*/
int e, l, iters = 0;
//typename std::set::iterator it, it1,it2;

// start solving with simplex algorithm
while (iters <= max_iters) {
	std::vector<T> delta(n + 1, INT_MAX);	//AMIT : this declaration should be inside the loop to see the INT_MAX changes reflect
	e = -1;
	for (std::set<int>::iterator it = N.begin(); it != N.end(); it++) {
		if (cs[*it] <= 0)
			continue;
		e = *it;
		//std::cout << "Adding entering var =" << e << ", as cs[e] = " <<  cs[e] << std::endl;
		break;// after choosing the entering variable, find the leaving variable
	}
	if (e == -1) {
		/*
		 for(IT = B.begin(); IT!=B.end();IT++){
		 x[*IT] = bs[*IT];
		 }*/
		//		std::cout << "Landed inside exit condition\n";
		break;// break from the while loop;
	} else {
		std::cout << "The entering variable is " << e << std::endl;
		for (std::set<int>::iterator it2 = B.begin(); it2 != B.end();
				it2++) {
			int i = *it2;
			if (As(i, e) > 0) {
				delta[i] = bs[i] / As(i, e);

				std::cout << "delta at var " << i << " = " << delta[i]
						<< std::endl;
				std::cout << "bs at " << i << " = " << bs[i] << std::endl;
				std::cout << "As at i,e" << "= " << As(i, e) << std::endl;
			}
			//std::cout << "Amit: Coefficient at " << i << " = " << As(i, e)<<"\n";
		}

		double min = INT_MAX;
		for (unsigned int i = 1; i <= n; i++) {
			//Amit:: Debug
			std::cout << "Debug Amit :: delta at var " << i << " = " << delta[i] << std::endl;

			if (delta[i] < min) {
				min = delta[i];
				l = i;
			}
		}
		std::cout << "min delta = " << min << std::endl;
		std::cout << "leaving variable = " << l << std::endl;
		if (min == INT_MAX) {
			std::cout << "Unbounded LP";
			exit(0);
		} else {
			std::cout << "simplex state BEFORE calling pivot from solve:\n";
			//display_state();binayak

			pivot(e, l);
			std::cout << "simplex state AFTER calling pivot from solve:\n";
			//display_state();binayak
		}
		/*
		 std::cout << "simplex state after pivoting\n"  << std::endl;
		 display_state();
		 */
		iters++; // one simplex iteration complete
	}
} // end of while
return x;

}

template
void simplex::process_lp() {
unsigned int k, m, n;
double min = INT_MAX;
for (unsigned int i = 0; i < b.size(); i++) {
if (b[i] < min) {
min = b[i];
k = i;
}
}
if (b[k] >= 0) {
initialize();
// std::cout << “LP processed\n”;
return;
}
k = k + 1; // since we start indexing arrays at 1 in our simplex implementation
m = A.size1(); // no. of constraints of the LP
n = A.size2(); // no. of variables in the LP
std::cout << “k= " << k << " m =” << m << ", n = " << n << std::endl;

std::vector<T> x(m + n, 0);
bool flag = false;
math::matrix<T> Anew(m, n + 1); // no. of vars is with the added extra variable of the auxiliary LP.
c.resize(n + 1);
for (unsigned int i = 0; i < m; i++) {
	for (unsigned int j = 0; j < n; j++) {
		Anew(i, j) = A(i, j);
		if (!flag) // if c not yet changed to the obj of L_aux, then change it
			c[j] = 0;
	}
	Anew(i, n) = -1;
	flag = true;
}
c[n] = -1;
A = Anew;
// * debug purpose
std::cout << "Anew matrix\n" << std::endl;
for (unsigned int i = 0; i < A.size1(); i++) {
	for (unsigned int j = 0; j < A.size2(); j++) {
		std::cout << A(i, j) << " ";
	}
	std::cout << std::endl;
}
std::cout << "elements of cnew\n" << std::endl;
for (unsigned int i = 0; i < c.size(); i++) {
	std::cout << c[i] << std::endl;
}
std::cout << "elements of bnew\n" << std::endl;
for (unsigned int i = 0; i < b.size(); i++) {
	std::cout << b[i] << std::endl;
}
// * /

initialize(); // get the slack form

// std::cout << “Printing After Initialize\n” << std::endl;

int lv = N.size() + k; // set the leaving variable
int e = N.size(); // the entering variable is the var added in the Auxiliary LP.
std::cout << "leaving  = " << lv << ", entering = " << e << std::endl;
std::cout << "Before pivot Call\n";
//display_state();binayak
pivot(e, lv);	//Data Mismatch for matrix A detected here
//display_state();
std::cout << "After first pivot successful\n";
//display_state();binayak

solve(); // run simplex on the previously obtained slack form. it is guaranteed that the initial basic solution is feasible.

std::cout << "process_lp: reached after call to solve auxiliary LP"
		<< std::endl;
//display_state();binayak

if (get_obj_val() == 0) { // To be changed to check the value of x[e], e being the index of the auxiliary variable
	//if(x[lv] == 0) // the aux variable of the lp_aux has value 0 in the optimal vector
	int en;
	//std::cout<<"Amit:: x0 is lv =  "<< lv <<" or e = "<< e <<std::endl;
	for (std::set<int>::iterator it = B.begin(); it != B.end(); it++) {
		if (*it == e) { // aux variable as basic variable
			for (std::set<int>::iterator it1 = N.begin(); it1 != N.end();
					it1++) {
				if (As(e, *it1) != 0) {
					en = *it1;
					break;
				}
			}
			pivot(en, e);
		}
	}

	std::cout << "STATE BEFORE RESTORING TO ORIGINAL OBJECTIVE FUNCTION\n";
	//display_state();binayak

	//Anew = math::matrix<T>(As.size1() - 1, As.size2() - 1);
	Anew.resize(As.size1() - 1, As.size2() - 1); //AMIT resizing the array

	std::vector<T> bnew(bs.size() - 1, 0);
	std::vector<T> cnew(cs.size() - 1, 0);
// remove the row and column for the auxiliary variable from As
	int row = 0;
	for (unsigned int i = 1; i < As.size1(); i++) {
		if (i == e) {
			continue;
		}
		row++;
		int k = 1;
		for (unsigned int j = 1; j < As.size2(); j++) {
			if (j != e) {
				Anew(row, k) = As(i, j);
				k++;
			}
		}
	}

	/*
	 //AMIT debug
	 std::cout<<"debugging \n";
	 for (int x=1;x<Anew.size1();x++){
	 for (int y=1;y<Anew.size2();y++)
	 std::cout<<"Anew(x,y) = "<<Anew(x,y)<<"\t";
	 std::cout<<std::endl;
	 }
	 */

	int k = 1;
	for (unsigned int j = 1; j < As.size2(); j++) {
		if (j != e) {
			bnew[k] = bs[j];
			k++;
		}
	}
	As = Anew;
	bs = bnew;
	T v = T(0);
	T alpha;

	for (unsigned int i = 0; i < c_orig.size(); i++) {
		cnew[i + 1] = c_orig[i];
	}

	std::cout << "STATE AFTER RESTORING THE ORIGINAL OBJECTIVE FUNCTION\n";
	//display_state();binayak

	//debug

// std::cout << “values at cnew\n”;
// for(unsigned int i=1;i<cnew.size();i++)
// {
// std::cout << cnew[i] << std::endl;
// }
//—
//std::cout << "Value of new objective functions = " << std::endl;
for (unsigned int i = 1; i < cnew.size(); i++) {
if (cnew[i] != 0) {
for (typename std::set::iterator iter = B.begin();
iter != B.end(); iter++) {
if (*iter == i) {
alpha = cnew[i];
cnew[i] = 0;
for (unsigned int j = 1; j < As.size2(); j++) {
cnew[j] += -As(i, j) * alpha;
std::cout << “Amit::” << cnew[j] << std::endl;
}
std::cout << “end of new obj function\n” << std::endl;
v += bs[i] * alpha;
}
iter = B.end();
break;
}
}
}

	//AMIT debug
	std::cout << "values at cnew\n";
	for (unsigned int i = 1; i < cnew.size(); i++) {
		std::cout << cnew[i] << std::endl;
	}
	cs = cnew;

/* Debug::
//AMIT: Now resize cs as size - 1 for removing auxiliary variable
std::cout<<“Amit Gurung :: cs Size BEFORE = “<<cs.size()<<”\n”;
cs.resize(cs.size() - 1);
std::cout<<“Amit Gurung :: cs Size AFTER = “<<cs.size()<<”\n”;
*/

	/*
	 //AMIT debug
	 std::cout << "values at cs after copying\n";
	 for(unsigned int i=1;i<cs.size();i++)
	 {
	 std::cout << cs[i] << std::endl;
	 }
	 */
	std::cout << "Value of new objective function = " << v << std::endl;
	obj_val = v;

	/*
	 N.clear();
	 B.clear();
	 int D = c_orig.size();
	 for (unsigned int i = 1; i <= D; i++)
	 N.insert(i);

	 for (unsigned int i = 1; i <= A_orig.size1(); i++)
	 B.insert(D + i);
	 */

	//Code tried by :: AMIT
	std::set<int> tempN, tempB; // .
	tempN = N;
	tempB = B;
	N.clear();
	B.clear();

	for (typename std::set<int>::iterator iter = tempN.begin();
			iter != tempN.end(); iter++) {
		if (*iter == e) {	//e is the x0 variable or the auxiliary variable
			continue;	//skiping or removing the auxiliary variable
		}
		//Inserting/Renaming the remaining variable but if the variables are greater than auxiliary variable name/index subtracted by 1
		if (*iter > e)
			N.insert(*iter - 1);		//requires variable renaming
		else
			N.insert(*iter);		//Does Not requires variable renaming
	}

	for (typename std::set<int>::iterator iter = tempB.begin();
			iter != tempB.end(); iter++) {
		//Inserting/Renaming the variable but if the variables are greater than auxiliary variable name/index subtracted by 1
		if (*iter > e)
			B.insert(*iter - 1);		//requires variable renaming
		else
			B.insert(*iter);		//Does Not requires variable renaming
	}

	std::cout << "LP processed\n";
	//display_state(); binayak
	std::cout<<"\n===============Initialize Simplex Call Over=====================\n";
} else {
	std::cout << "Simplex::process_lp: LP has infeasible solution\n";
	exit(0);
}

}

Binayak · March 17, 2015, 4:45am

If I be able to use ublas libray in cuda progarmming, then I may able to make the same LP Solver using simplex on my own in cuda.

Skybuck · March 18, 2015, 1:35pm

Some questions for Binaryak:

Do you understand the simplex algorithm ? Are you capable of writing one from scratch ?
I assume you do not yet know how to program CUDA or parallel solutions suited for cuda ?
Why do you want to run two simplex algorithms at the same time ? (most interesting question)
Have you tried to parallelize the simplex algorithm ?
Do you think the simplex algorithm can be parallelized ?

little_jimmy · March 20, 2015, 5:13am

“Why do you want to run two simplex algorithms at the same time ? (most interesting question)”

why wouldn’t you?

if you can cut the feasible region, similar to a cut and branch approach, you would be able to run multiple instances on the same overall feasible region

i was under the impression that the simplex algorithm’s performance - or simply calculation time - is dependent on the starting point
with multiple instances - the feasible region cut into smaller regions - you should have multiple starting points, and thus potentially better odds
particularly when you have a plethora of lp coefficients

not so?

Binayak · March 20, 2015, 10:38am

Some questions for Binaryak:

Do you understand the simplex algorithm ? Are you capable of writing one from scratch ?
I assume you do not yet know how to program CUDA or parallel solutions suited for cuda ?
Why do you want to run two simplex algorithms at the same time ? (most interesting question)
Have you tried to parallelize the simplex algorithm ?
Do you think the simplex algorithm can be parallelized ?

Answer to question 1. I went through the algorithm, I know it is working. I can write a Simplex program by using this same algorithm from book(Introduction to Algorithm by Thomas H. Cormen)

Answer to question 2. Yes I am new in cuda programming, Just came now to learn Cuda for my this work only.

Answer to question 3. I don’t know the answer right now. I just my expert has a tool and that tool needs a Parellel LPP solver

Answer to question 4. Why not? Yes It’s possible…And I’ll have to do the same. Its my work now on me…

Thank you :)

Topic		Replies	Views
Polite ask for help with paralelling program using CUDA. Task: Rewrite given program so it will use CUDA Programming and Performance	13	9421	April 20, 2011
An Easy Introduction to CUDA C and C++ Technical Blog	48	1840	July 19, 2018
Effective Parallelisation of CUDA C code CUDA Programming and Performance	38	2445	December 27, 2021
LU factorization code CUDA Programming and Performance	45	91283	June 23, 2015
LU, QR and Cholesky factorizations using GPU CUDA Programming and Performance	100	63999	June 23, 2015
How can I use __device__ function pointer in CUDA ? CUDA Programming and Performance	34	60874	June 3, 2020
Nvfortran error nvc, nvc++ and nvfortran	39	3903	January 17, 2024
Can't call a function with CUDA when i compile my code, the "add" function just won' CUDA Programming and Performance	23	5599	November 21, 2010
How can I improve this code which only reduces half time for the same code using MATLAB, thanks! CUDA Programming and Performance	5	1720	September 29, 2009
Class function returning error 700 Legacy PGI Compilers	5	610	June 24, 2020

Calling C function in Device code.

Related topics