DLProf's database file can't be opened with dlprofviewer

Hi
I’m trying to profile my model (written in PyTorch) with DLProf, since PyProf has been discontinued.
This is the command I used:

sudo env "PATH=$PATH" dlprof -f true --reports=summary,detail,iteration,kernel,tensor --profile_name main-1 train.py --num-epochs 2

It creates a .qdrep file and a .sqlite file.

But if I run dlprofviewer xx.sqlite, as instructed by the doc, I get this error:

dlprofviewer /tmp/nsys-report-e7ec-cffc-e4db-7208.sqlite
[dlprofviewer-07:24:08 PM UTC] dlprofviewer running at http://localhost:8000
Sqlite3 error: no such table: view_system_config
Here is the problem query: <SELECT is_valid,num_gpus,cpu_model,driver_version,framework,cuda_version,cudnn_version,nsys_version,dlprof_version,dlprof_build,profile_name,mode_string FROM view_system_config>
Using args:  []
[dlprofviewer-07:24:11 PM UTC] Internal Server Error: /dlprof/dlprof/rest/sysconfig_panel
Traceback (most recent call last):
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 482, in thread_handler
    raise exc_info[1]
  File "/miniconda3/envs/base/lib/python3.8/site-packages/django/core/handlers/exception.py", line 38, in inner
    response = await get_response(request)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/django/core/handlers/base.py", line 233, in _get_response_async
    response = await wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 444, in __call__
    ret = await asyncio.wait_for(future, timeout=None)
  File "/miniconda3/envs/base/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/current_thread_executor.py", line 22, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 486, in thread_handler
    return func(*args, **kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/rest_endpoints.py", line 81, in sysconfig_panel
    return process_endpoint(obj)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/rest_endpoints.py", line 110, in process_endpoint
    result = obj.get_result()
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 77, in get_one_result
    db_rows = self.do_query(self._request, self._sql_query, self._sql_args)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 45, in do_query
    raise error
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 37, in do_query
    cursor.execute(sql_query, sql_args)
sqlite3.OperationalError: no such table: view_system_config
Sqlite3 error: no such table: view_aggregations
Here is the problem query: <SELECT aggr_id,iter_start,iter_stop,iter_aggregated,key_node_name,user_name,host_name,aggr_start,aggr_end FROM view_aggregations ORDER BY aggr_start>
Using args:  []
[dlprofviewer-07:24:11 PM UTC] Internal Server Error: /dlprof/dlprof/rest/aggregation_data
Traceback (most recent call last):
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 482, in thread_handler
    raise exc_info[1]
  File "/miniconda3/envs/base/lib/python3.8/site-packages/django/core/handlers/exception.py", line 38, in inner
    response = await get_response(request)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/django/core/handlers/base.py", line 233, in _get_response_async
    response = await wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 444, in __call__
    ret = await asyncio.wait_for(future, timeout=None)
  File "/miniconda3/envs/base/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/current_thread_executor.py", line 22, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 486, in thread_handler
    return func(*args, **kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/rest_endpoints.py", line 37, in aggregation_data
    return process_endpoint(obj)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/rest_endpoints.py", line 110, in process_endpoint
    result = obj.get_result()
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 63, in get_result
    db_rows = self.do_query(self._request, self._sql_query, self._sql_args)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 45, in do_query
    raise error
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 37, in do_query
    cursor.execute(sql_query, sql_args)
sqlite3.OperationalError: no such table: view_aggregations
Sqlite3 error: no such table: view_domains
Here is the problem query: <SELECT domain_name FROM view_domains ORDER BY domain_name >
Using args:  []
[dlprofviewer-07:24:11 PM UTC] Internal Server Error: /dlprof/dlprof/rest/domain_data
Traceback (most recent call last):
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 482, in thread_handler
    raise exc_info[1]
  File "/miniconda3/envs/base/lib/python3.8/site-packages/django/core/handlers/exception.py", line 38, in inner
    response = await get_response(request)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/django/core/handlers/base.py", line 233, in _get_response_async
    response = await wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 444, in __call__
    ret = await asyncio.wait_for(future, timeout=None)
  File "/miniconda3/envs/base/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/current_thread_executor.py", line 22, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/asgiref/sync.py", line 486, in thread_handler
    return func(*args, **kwargs)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/rest_endpoints.py", line 41, in domain_data
    return process_endpoint(obj)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/rest_endpoints.py", line 110, in process_endpoint
    result = obj.get_result()
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 63, in get_result
    db_rows = self.do_query(self._request, self._sql_query, self._sql_args)
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 45, in do_query
    raise error
  File "/miniconda3/envs/base/lib/python3.8/site-packages/dlprofviewer/dlprofwebserver-project/dlprof/endpoints/sql.py", line 37, in do_query
    cursor.execute(sql_query, sql_args)
sqlite3.OperationalError: no such table: view_domains

Env info:

conda 4.8.1
python3.8

conda list | grep cuda
cudatoolkit               11.1.1               h6406543_8    conda-forge
pytorch                   1.9.0           py3.8_cuda11.1_cudnn8.0.5_0    pytorch

pip list | grep nvidia
nvidia-dlprof                     1.2.0
nvidia-dlprof-pytorch-nvtx        1.2.0
nvidia-dlprofviewer               1.2.0
nvidia-ml-py3                     7.352.0
nvidia-nsys-cli                   2021.2.1.58
nvidia-pyindex                    1.0.9

Hi Jason, thanks for posting.

There will be 3 outputs from dlprof if it completes successfully:
nsys qdrep file
nsys sqlite file
dlprof sqlite file

The dlprof sqlite file is the one that can be consumed by the viewer. I see that you are trying to use the nsys sqlite file, which is why the viewer is failing.
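
A quick way to tell the two kinds of .sqlite file apart is to look for the views the viewer queries. The following is only a minimal sketch: the three view names are taken from the errors quoted above, and it assumes a DLProf database contains all of them (the real schema may include more). The helper names missing_views and find_viewer_databases are hypothetical and not part of DLProf.

```python
import sqlite3
from pathlib import Path

# View names copied from the dlprofviewer errors in this thread.
# Assumption: a DLProf database defines all of them.
EXPECTED_VIEWS = ("view_system_config", "view_aggregations", "view_domains")


def missing_views(db_path):
    """Return the expected views (or tables) that are absent from db_path."""
    # Open read-only so a mistyped path is not silently created as an empty database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
        ).fetchall()
    finally:
        conn.close()
    present = {name for (name,) in rows}
    return [view for view in EXPECTED_VIEWS if view not in present]


def find_viewer_databases(directory="."):
    """List the .sqlite files in directory that contain every expected view."""
    candidates = sorted(Path(directory).glob("*.sqlite"))
    return [path for path in candidates if not missing_views(path)]
```

Calling missing_views on the nsys database from the post would be expected to return all three names, while find_viewer_databases(".") should list only files the viewer can plausibly open. A file that is not a SQLite database at all raises sqlite3.DatabaseError instead of being skipped.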

The fact that the nsys database is in /tmp/ tells me that the underlying nsys run did not complete properly. Can you look closer at the output from the dlprof run and see if you can find an error there? I’ll need that information to further help you.

Can you also tell me if you are running from inside an nvidia DLFW container, or if you manually pip installed the dlprof components on your machine?

Thanks,
Tim

Thanks for responding.
I see. I never saw the dlprof sqlite file in my local dir, only the 2 nsys files.
One of the errors I got earlier said it didn’t have permission to get CPU activity or something. That’s why I added sudo at the front. After that, the error seemed to go away.

The machine rebooted, so I’ll have to run it again. I’ll post the output in a day or so.
And I’m not running in Docker; I installed them manually using pip.
Thanks

Actually I’m all set. The dlprof sqlite file name is different from the one in the doc because I used --profile_name.
So it was actually present in the results, and I can open it with dlprofviewer.
Thanks a lot!

Wonderful! Let me know if you have any feedback for DLProf and the viewer.
