Unable to capture iterations on dlprof

Kirito94 · March 22, 2024, 8:13am

I’m trying to profile inference of a tiny model with DLProf, but I can’t seem to capture per-iteration information when I run it for multiple iterations. This is what the code does:

import argparse

import torch
import torch.nn as nn
import nvidia_dlprof_pytorch_nvtx


class SmallModel(nn.Module):

    def __init__(self):
        super(SmallModel, self).__init__()
        self.layer1 = nn.Linear(784, 512)
        self.layer2 = nn.Linear(512, 256)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = torch.relu(self.layer2(x))
        return x


# FP16 model and input, both on the GPU
model = SmallModel().cuda().half()
input_data = torch.randn(64, 784).cuda().half()

# Patch PyTorch ops so they emit the NVTX markers DLProf reads
nvidia_dlprof_pytorch_nvtx.init(enable_function_stack=True)

parser = argparse.ArgumentParser("Nvidia Profiler")
parser.add_argument("--num_iter", dest='num_iter', help="no of iterations to perform", type=int, required=True)
args = parser.parse_args()

# Inference only: one forward pass per iteration
with torch.no_grad():
    with torch.autograd.profiler.emit_nvtx():
        for i in range(args.num_iter):
            _ = model(input_data)

This is the command I’m running:

dlprof --mode=pytorch --key_node=LINEAR_1 -f true --reports=summary,detail,iteration --iter_start=5 --iter_stop=8 python profile_sample_model.py --num_iter 10

This is what the DLProf log reports:

Found 2 iterations using key_op “LINEAR_1”
Iterations: [12495162999, 12520617892]
Aggregating data over 1 iterations: iteration 1 start (12495162999 ns) to iteration 1 end (12520617892 ns)

I want DLProf to capture iterations 5 through 8 individually. Instead, it skips aggregation until the first time it encounters the specified key_node, and then aggregates the remaining 9 iterations as a single iteration. What am I doing wrong here? --iter_start=5 --iter_stop=8 doesn’t seem to have any effect. I’d really appreciate any guidance on this.

veraj · March 26, 2024, 6:31am

Hi, @Kirito94

Note that DLProf was sunset over two years ago and is no longer supported. We recommend using Nsight Systems or one of the profiling tools from PyTorch or TensorFlow.
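
For anyone landing here with the same question, below is a rough sketch of how a fixed window of iterations could be captured with PyTorch's built-in torch.profiler instead of DLProf. It is not a drop-in equivalent of --iter_start/--iter_stop. The helper name profile_window and its parameters are made up for illustration, the iteration numbering is assumed to be 0-based, and the model is the one from the original post. It falls back to CPU when no GPU is available.

```python
import torch
import torch.nn as nn
from torch.profiler import ProfilerActivity, profile, schedule


class SmallModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(784, 512)
        self.layer2 = nn.Linear(512, 256)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        return torch.relu(self.layer2(x))


def profile_window(num_iter=10, first=5, count=4):
    """Hypothetical helper: record `count` iterations starting at
    0-based iteration `first` (needs first >= 1 for the warmup step)."""
    use_cuda = torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"
    model = SmallModel().to(device).eval()
    x = torch.randn(64, 784, device=device)
    activities = [ProfilerActivity.CPU]
    if use_cuda:
        # Match the FP16 setup from the original post when a GPU is present
        activities.append(ProfilerActivity.CUDA)
        model, x = model.half(), x.half()

    captured = []  # one aggregated event list per finished window

    def on_ready(prof):
        captured.append(prof.key_averages())

    # Skip first-1 steps, use 1 step as warmup, then record `count` steps once
    sched = schedule(wait=first - 1, warmup=1, active=count, repeat=1)
    with torch.no_grad(), profile(
        activities=activities, schedule=sched, on_trace_ready=on_ready
    ) as prof:
        for _ in range(num_iter):
            model(x)
            prof.step()  # marks the iteration boundary for the schedule
    return captured


if __name__ == "__main__":
    windows = profile_window(num_iter=10, first=5, count=4)
    print(windows[0].table(sort_by="cpu_time_total", row_limit=5))
```

The key difference from the DLProf setup is that the iteration boundary is marked explicitly with prof.step() rather than inferred from a key node, so the window does not depend on how often a given op shows up in the trace.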

veraj · April 16, 2024, 6:37am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
