How can I use the CUDA Visual Profiler with a JCuda application?

I have a JCuda program that runs concurrent kernel executions with four kinds of overlapped operations:

  • Overlapped host computation and device computation
  • Overlapped host computation and host-device data transfer
  • Overlapped host-device data transfer and device computation
  • Concurrent device computation

How can I use the visual profiler to show the timings of these operations?

Start the visual profiler using nvvp. Then launch the application (i.e. java plus your jar) using the standard method for launching applications from nvvp. The resulting timeline in nvvp will show all of this.

Thanks, Robert. I started the visual profiler, opened a new session, and entered the jar file in the File field of the New Session window. It gives the message “Unable to profile application. The application being profiled returned a non-zero code.” Why?

Because you can’t profile just the jar file. Note what I said: launch the application, i.e. java plus your jar.

How exactly do you run your application (without the profiler)? What is the command line that you use to run your application?

You mean:
I run it from the command prompt (cmd). On my machine, nvvp is located at:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin\nvvp
Then the command to run the visual profiler is:
nvpp java -cp ".;jcuda-0.10.0.jar;jcuda-natives-0.10.0-windows-x86_64.jar" JCudaRuntimeTest.java

where the java -cp ".;jcuda-0.10.0.jar;jcuda-natives-0.10.0-windows-x86_64.jar" JCudaRuntimeTest.java part comes from the tutorial on jcuda.org.

Do you mean this?

Or do you mean that, once nvpp is available in cmd through the path C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin\nvvp, I type:

nvpp java "D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist\OntologyThresholdSerial2023Test.jar"

There is no nvpp

The command to start the visual profiler is just:

nvvp

So just type that command:

nvvp

and nothing else.

Once the visual profiler starts, then follow the instructions here.

Specifically:

  • in the application to be run, put java

  • in the command line options, put:

    -cp ".;jcuda-0.10.0.jar;jcuda-natives-0.10.0-windows-x86_64.jar" JCudaRuntimeTest.java

or:

   "D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist\OntologyThresholdSerial2023Test.jar"

I don’t know which is correct, and you haven’t given me a clear answer to my question.

If you still need help, please answer the following question carefully:

Pretend that you weren’t going to be running the profiler. You just want to run your application, from the command line. No profiler. What command would you type?

I run the Java project inside NetBeans using Run Project. I opened the visual profiler and created a new session. It shows a window titled Create New Session, which contains:

File [Enter executable file] Browse
Working Directory [Enter working directory ] Browse
Arguments

For a JCuda project, what should I enter in these fields to get the timings on the GPU?

Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.

C:\Users\Computer Shop>cd\

C:\>cd C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin>nvprof --profile-child-processes java -cp .;D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist\OntologyThresholdSerial2023Test.jar D:\NetBeanProjects\OntologyThresholdSerial2023Test\src\ontologythresholdserial2023test\OntologyThresholdSerial2023Test.java

D:\NetBeanProjects\OntologyThresholdSerial2023Test\src\ontologythresholdserial2023test\OntologyThresholdSerial2023Test.java:433: error: cannot find symbol
int max = pL.GetMaxLength(cSC.getLstCsrcF(),cSC.getLstCdest());
^
symbol: method GetMaxLength(List<ArrayList>,List)
location: variable pL of type ParallelLevenstein
D:\NetBeanProjects\OntologyThresholdSerial2023Test\src\ontologythresholdserial2023test\OntologyThresholdSerial2023Test.java:435: error: method ExecuteParallelizationOnStructuresApproach2 in class ParallelLevenstein cannot be applied to given types;
pL.ExecuteParallelizationOnStructuresApproach2(cSC.getLstCsrcF(),cSC.getLstCdest(),max);
^
required: List<ArrayList>,List
found: List<ArrayList>,List,int
reason: actual and formal argument lists differ in length
2 errors
error: compilation failed
======== Warning: No CUDA application was profiled, exiting
======== Error: Application returned non-zero code 1

I typed:

nvprof java -cp .;[path of jar file] [path of main class]
It gives me the above message, even though the project runs successfully in NetBeans.

If you’re unable or unwilling to answer my question, I won’t be able to help you.

In order to profile your code with nvvp, you must be able to run your code from a command line, without netbeans or any other IDE.

If you’re not able to tell me how to do that, I can’t help you.

As I have already indicated:

File [Enter executable file] Browse ----> java
Working Directory [Enter working directory ] Browse -------> the directory where your jar is located
Arguments ------> your jar
If you can’t tell me exactly what belongs in “your jar”, I can’t tell you either.

Good luck!
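
A note on the nvprof log quoted earlier: the last argument given to java there was the path to a .java source file, and the output shows that file being compiled and failing ("error: compilation failed"), after which nvprof reported the non-zero exit code. When the jar is on the classpath, java expects a class name in that position instead. A minimal sketch of the command shape at a Windows command prompt follows. It assumes the main class lives in the package ontologythresholdserial2023test (inferred from the source path in the log, not confirmed in this thread) and that the JCuda jars are reachable from the application jar, for example through its manifest; otherwise they would need to be appended to -cp.

```bat
rem Assumption: package and class name are inferred from the src\ path in the log.
rem Step 1: run the application on its own, with no profiler involved.
java -cp "D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist\OntologyThresholdSerial2023Test.jar" ontologythresholdserial2023test.OntologyThresholdSerial2023Test

rem Step 2: once that works, prefix the same command with nvprof.
nvprof --profile-child-processes java -cp "D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist\OntologyThresholdSerial2023Test.jar" ontologythresholdserial2023test.OntologyThresholdSerial2023Test
```

If step 1 fails, the problem is in the java command itself rather than in the profiler.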

I will try to run the project from the command prompt and let you know. I am so sorry for disturbing you.

Dear Robert,

I successfully ran the program from the command line, and also under nvprof, using the following commands.

java -cp [jar path]\[jar name] [package name].[main class name]

For nvprof, which is located in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin, I changed to that directory and ran:
nvprof --profile-child-processes java -cp [jar path]\[jar name] [package name].[main class name]

It executed successfully.

But the visual profiler still does not run.
I entered what you said:

File [Enter executable file] Browse ----> java
Working Directory [Enter working directory ] Browse -------> the directory where your jar is located
Arguments ------> your jar
Can you help me?

File [Enter executable file] Browse ----> java
Working Directory [Enter working directory ] Browse -------> the directory where your jar is located
Arguments ------> -cp [jar path]\[jar name] [package name].[main class name]

You may also need to select “Profile child processes” in the dialog, as mentioned here.

The multi-process profiling options are:

  • Profile child processes - If selected, profile all processes launched by the specified application.
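
The three session fields map directly onto a plain command-prompt run, so one way to sanity-check the values before starting a session is to reproduce them by hand. A sketch, reusing the jar location from earlier in the thread; the main class name is an assumption inferred from the source path in the log:

```bat
rem "File" field: the executable must be resolvable; 'where' lists matches on PATH.
where java

rem "Working Directory" field: the directory that contains the jar (/d also switches drive).
cd /d "D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist"

rem "Arguments" field: everything that follows 'java' on the command line.
java -cp OntologyThresholdSerial2023Test.jar ontologythresholdserial2023test.OntologyThresholdSerial2023Test
```

If this runs cleanly, the same three values should be what the dialog needs. If where java finds nothing, browsing to java.exe in the File field is probably necessary.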

When I enter the values as you said in the visual profiler, it keeps running for a long time. Above the green progress bar it says “Running nvprof to profile all processes”. Below the bar it says “Run the application in a separate terminal outside visual profiler. To stop profiling, press the cancel button.” Is something still wrong?

Another question: if this cannot work, I see another option inside the visual profiler, File… Import, which then asks for CSV data generated by the command-line profiler. As you know, nvprof runs for the same project on the command line. How do I produce this CSV file? Sorry for bothering you.

The visual profiler collects a lot of data, which means it can take a long time. You can reduce the scope of what the visual profiler does; please read the documentation.

Yes, it’s possible to import data from nvprof into the visual profiler; please read the documentation.
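
As a rough sketch of the import route: nvprof can write its results to a file with --export-profile, and when --profile-child-processes is used the file name needs a %p placeholder, which nvprof replaces with the process ID. The output file name and the main class name below are illustrative assumptions; the jar path is the one from earlier in the thread.

```bat
rem Typed at an interactive prompt; inside a .bat file write %%p instead of %p.
rem Assumption: timeline_%p.nvvp is just an example name for the exported profile.
nvprof --profile-child-processes --export-profile timeline_%p.nvvp java -cp "D:\NetBeanProjects\OntologyThresholdSerial2023Test\dist\OntologyThresholdSerial2023Test.jar" ontologythresholdserial2023test.OntologyThresholdSerial2023Test
```

The resulting file can then be opened through File, Import in the visual profiler by choosing the nvprof option. As far as I know, that import works with exported profile files, not with the CSV text produced by nvprof's --csv option. Separately, the progress message quoted above ("Run the application in a separate terminal outside visual profiler") matches what the Visual Profiler documentation describes for the "Profile all processes" option, so it may be worth checking that "Profile child processes" is the one selected.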