printf in CUDA cuda, main function

I am translating a Fortran program into CUDA. In this subroutine there are lots of “printf” calls, but the device cannot call “printf”, so I want to move them into the main function. However, the values of the variables being printed are computed step by step. How can I print them from the main function?

Thanks :)

Your Fortran program contains printf? There could be some significant issues with the codebase…

To answer the generic question, you have to copy the values back from the GPU to the CPU whenever you want to print them out (use a cudaMemcpy between kernel invocations). However, that’s going to be a touch on the slow side, so you might like to ask why those variables are being printed out in the first place.
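As a minimal sketch of the copy-back approach described above: the kernels `step1` and `step2`, the helper `run_steps`, and the values 2 and 3 are all hypothetical and only stand in for two stages of the real subroutine. The device variable is copied to the host after each kernel, and the printing is done in main.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical first step: sum = a + b.
__global__ void step1(float a, float b, float *sum) {
    *sum = a + b;
}

// Hypothetical second step: the same variable is reset and reused.
__global__ void step2(float a, float b, float *sum) {
    *sum = 0.0f;
    *sum = a * b;
}

// Runs both steps and copies `sum` back to the host after each one.
// Returns 0 on success, 1 if a CUDA call failed.
int run_steps(float a, float b, float *first, float *second) {
    float *d_sum = NULL;
    if (cudaMalloc(&d_sum, sizeof(float)) != cudaSuccess) return 1;

    step1<<<1, 1>>>(a, b, d_sum);
    // cudaMemcpy waits for the preceding kernel to finish before copying.
    cudaError_t e1 = cudaMemcpy(first, d_sum, sizeof(float),
                                cudaMemcpyDeviceToHost);

    step2<<<1, 1>>>(a, b, d_sum);
    cudaError_t e2 = cudaMemcpy(second, d_sum, sizeof(float),
                                cudaMemcpyDeviceToHost);

    cudaFree(d_sum);
    return (e1 == cudaSuccess && e2 == cudaSuccess) ? 0 : 1;
}

int main() {
    float first = 0.0f, second = 0.0f;
    if (run_steps(2.0f, 3.0f, &first, &second) != 0) {
        fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    // The printing happens on the host, in main.
    printf("sum after step 1 = %f\n", first);
    printf("sum after step 2 = %f\n", second);
    return 0;
}
```

Each extra copy forces the host to wait for the GPU, which is why this gets slow if it is done for many values.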

I mean that the Fortran program outputs many values; “printf” is just the syntax I would use in CUDA.

I know that I have to copy these values back to the CPU, but many of the variables are reused several times, and their values change within the subroutine.

For example, I first initialize the variable sum=0, then set sum=a+b. Later I reset “sum=0” and assign it another value, and I want to print both values. How can I do that in the main function? Also, in the kernel, is it correct to simply delete all the “printf” statements from the Fortran program and write nothing in their place, or should I write something like “return” instead?

Thanks a lot!!!

Why are those printf (or rather, WRITE) statements there? They will already be killing performance on the CPU side (formatted I/O is very expensive). Can you show some actual code? Then you might get more useful advice.

You could create a return array that will contain all the information you need to output (allocate enough space for the worst case!!!), fill it with each value as it is computed (and, if the order of the values may change, also store a label with each one), and then copy it back at the end of kernel execution. You can then display the values with a simple loop in your CPU C code.
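A minimal sketch of such a return array, assuming a made-up record layout: `LogEntry`, `log_value`, `compute`, `run_logged`, and `MAX_LOG` are hypothetical names, and the tag plays the role of the label. Each former print statement is replaced by a store into the next free slot, so the two successive values of `sum` both survive. Nothing like “return” is needed in place of the deleted print.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define MAX_LOG 16  // assumed worst-case number of logged values

// Hypothetical record: `tag` says which print site produced `value`.
struct LogEntry { int tag; float value; };

// Appends one entry; atomicAdd hands every caller a distinct slot.
__device__ void log_value(LogEntry *log, int *count, int tag, float value) {
    int slot = atomicAdd(count, 1);
    if (slot < MAX_LOG) { log[slot].tag = tag; log[slot].value = value; }
}

__global__ void compute(float a, float b, LogEntry *log, int *count) {
    float sum = 0.0f;
    sum = a + b;
    log_value(log, count, 1, sum);  // where the first print used to be
    sum = 0.0f;
    sum = a * b;
    log_value(log, count, 2, sum);  // where the second print used to be
}

// Runs the kernel and copies the log to h_log (room for MAX_LOG entries).
// Returns the number of entries, or -1 if a CUDA call failed.
int run_logged(float a, float b, LogEntry *h_log) {
    LogEntry *d_log = NULL;
    int *d_count = NULL, h_count = 0;
    if (cudaMalloc(&d_log, MAX_LOG * sizeof(LogEntry)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_count, sizeof(int)) != cudaSuccess) {
        cudaFree(d_log);
        return -1;
    }
    cudaMemset(d_count, 0, sizeof(int));

    compute<<<1, 1>>>(a, b, d_log, d_count);

    cudaError_t e1 = cudaMemcpy(&h_count, d_count, sizeof(int),
                                cudaMemcpyDeviceToHost);
    if (h_count > MAX_LOG) h_count = MAX_LOG;  // entries past the end were dropped
    cudaError_t e2 = cudaMemcpy(h_log, d_log, h_count * sizeof(LogEntry),
                                cudaMemcpyDeviceToHost);
    cudaFree(d_log);
    cudaFree(d_count);
    return (e1 == cudaSuccess && e2 == cudaSuccess) ? h_count : -1;
}

int main() {
    LogEntry h_log[MAX_LOG];
    int n = run_logged(2.0f, 3.0f, h_log);
    if (n < 0) { fprintf(stderr, "CUDA call failed\n"); return 1; }
    for (int i = 0; i < n; ++i)  // the simple display loop on the CPU
        printf("print site %d: sum = %f\n", h_log[i].tag, h_log[i].value);
    return 0;
}
```

With a single thread the entries arrive in program order. With many threads the order is not deterministic, which is why each entry carries a tag.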

This is part of the Fortran program:

    sum=0.
    is=0
    sum=0.
    asum=0.
    do i=1,n
        if (poids(i) .ne. 0.d0 ) then
            sum=sum+pixel(i)*poids(i)
            asum=asum+poids(i)
            is=is+1
        endif
    enddo
    if ( is .le. 10 ) then
        write (*,*) ' omilieu flag pb', is,n,pixel(1),pixel(2),poids(1),poids(2)
        read (*,*) is
    endif
    cte=sum/asum
    sum=0.
    do i=1,n
        if (poids(i) .ne. 0.d0 ) then
            ecart(i)=pixel(i)-cte
            sum=sum+ecart(i)**2
        endif
    enddo
    sig=SQRT(sum/(is-1))
    if ( sig .lt. 1.d-30) then
        write (*,'(a,3i10,2f15.3)') ' signal const:',n,is,isg,cte,sig
        read (*,*) is
    endif

Thanks. I will give it a try.

What’s the value of ‘n’? Can’t you just make the individual loops kernels, or is the iteration count too small? In any event, a READ(*,*) will be killing performance, since it waits for keyboard input.
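As a rough sketch of turning the first loop into a kernel: this assumes `pixel` and `poids` are double-precision arrays, uses a single block with a shared-memory reduction, and uses 0-based indices. The names `weighted_sums`, `weighted_mean`, and `THREADS`, and the test data in main, are hypothetical. The `is <= 10` check and its message stay on the host.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define THREADS 256  // assumed block size; must be a power of two

// One block: each thread strides over the arrays, then a tree reduction
// in shared memory combines the per-thread partial results.
__global__ void weighted_sums(const double *pixel, const double *poids, int n,
                              double *sum_out, double *asum_out, int *is_out) {
    __shared__ double s_sum[THREADS];
    __shared__ double s_asum[THREADS];
    __shared__ int s_is[THREADS];
    int t = threadIdx.x;
    double sum = 0.0, asum = 0.0;
    int is = 0;
    for (int i = t; i < n; i += THREADS) {
        if (poids[i] != 0.0) {
            sum += pixel[i] * poids[i];
            asum += poids[i];
            is += 1;
        }
    }
    s_sum[t] = sum; s_asum[t] = asum; s_is[t] = is;
    __syncthreads();
    for (int stride = THREADS / 2; stride > 0; stride >>= 1) {
        if (t < stride) {
            s_sum[t] += s_sum[t + stride];
            s_asum[t] += s_asum[t + stride];
            s_is[t] += s_is[t + stride];
        }
        __syncthreads();
    }
    if (t == 0) { *sum_out = s_sum[0]; *asum_out = s_asum[0]; *is_out = s_is[0]; }
}

// Computes cte = sum/asum and the count `is`. Returns 0 on success.
int weighted_mean(const double *pixel, const double *poids, int n,
                  double *cte, int *is) {
    double *d_pixel = NULL, *d_poids = NULL, *d_out = NULL;
    int *d_is = NULL;
    double h_out[2] = {0.0, 0.0};
    size_t bytes = n * sizeof(double);
    if (cudaMalloc(&d_pixel, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&d_poids, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&d_out, 2 * sizeof(double)) != cudaSuccess) return 1;
    if (cudaMalloc(&d_is, sizeof(int)) != cudaSuccess) return 1;
    cudaMemcpy(d_pixel, pixel, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_poids, poids, bytes, cudaMemcpyHostToDevice);

    weighted_sums<<<1, THREADS>>>(d_pixel, d_poids, n, d_out, d_out + 1, d_is);

    cudaError_t e1 = cudaMemcpy(h_out, d_out, 2 * sizeof(double),
                                cudaMemcpyDeviceToHost);
    cudaError_t e2 = cudaMemcpy(is, d_is, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_pixel); cudaFree(d_poids); cudaFree(d_out); cudaFree(d_is);
    if (e1 != cudaSuccess || e2 != cudaSuccess) return 1;

    if (*is <= 10)  // the former WRITE, now done on the host
        printf(" omilieu flag pb %d %d\n", *is, n);
    *cte = h_out[0] / h_out[1];
    return 0;
}

int main() {
    static double pixel[1000], poids[1000];
    for (int i = 0; i < 1000; ++i) {  // made-up data: every second weight is zero
        pixel[i] = (double)i;
        poids[i] = (i % 2 == 0) ? 1.0 : 0.0;
    }
    double cte = 0.0;
    int is = 0;
    if (weighted_mean(pixel, poids, 1000, &cte, &is) != 0) return 1;
    printf("is = %d, cte = %f\n", is, cte);
    return 0;
}
```

A single block is only reasonable for modest n. For large arrays a multi-block reduction would be the usual choice, and for very small n the kernel launch overhead may outweigh any gain.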