Unable to capture iterations on dlprof

I’m trying to profile inference on a tiny model with dlprof, but I can’t seem to capture per-iteration information when I let it run for multiple iterations. This is what the code does:

import argparse

import torch
import torch.nn as nn


class SmallModel(nn.Module):

    def __init__(self):
        super(SmallModel, self).__init__()
        self.layer1 = nn.Linear(784, 512)
        self.layer2 = nn.Linear(512, 256)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = torch.relu(self.layer2(x))
        return x
model = SmallModel().cuda().half()
input_data = torch.randn(64, 784).cuda().half()


parser = argparse.ArgumentParser("Nvidia Profiler")
parser.add_argument("--num_iter", dest='num_iter', help="number of iterations to perform", type=int, required=True)
args = parser.parse_args()
with torch.no_grad():
    with torch.autograd.profiler.emit_nvtx():
        for i in range(args.num_iter):
            _ = model(input_data)

This is the command I’m running:

dlprof --mode=pytorch --key_node=LINEAR_1 -f true --reports=summary,detail,iteration --iter_start=5 --iter_stop=8 python profile_sample_model.py --num_iter 10

This is what the dlprof log shows:

Found 2 iterations using key_op “LINEAR_1”
Iterations: [12495162999, 12520617892]
Aggregating data over 1 iterations: iteration 1 start (12495162999 ns) to iteration 1 end (12520617892 ns)

I want dlprof to capture iterations 5 through 8 individually. Instead, it skips aggregation until the first occurrence of the specified key_node and then aggregates the remaining 9 iterations as a single iteration. What am I doing wrong here? --iter_start=5 --iter_stop=8 doesn’t seem to have any effect. I’d really appreciate any guidance on this.

Hi, @Kirito94

Note that DLProf was sunset more than two years ago and is no longer supported. We recommend using Nsight Systems or one of the profiling tools from PyTorch or TensorFlow instead.
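For the Nsight Systems route, one option is to wrap each iteration of interest in its own NVTX range so that it appears as a separate named span on the timeline. Below is a minimal sketch, not a drop-in replacement for DLProf's iteration report. It assumes iterations are counted from 0 and that the 5 to 8 window is inclusive. The names iteration_range, run, and the iteration_N labels are hypothetical, and the nn.Sequential is only a stand-in for SmallModel from the question. Without PyTorch or CUDA it falls back to a no-op, so the loop logic can be dry-run on any machine.

```python
import contextlib

try:
    import torch
    import torch.nn as nn
except ImportError:  # lets the loop logic be dry-run on a machine without PyTorch
    torch = None

ITER_START, ITER_STOP = 5, 8  # inclusive window (assumption), like --iter_start/--iter_stop
NUM_ITER = 10                 # mirrors --num_iter 10 from the question

HAVE_CUDA = torch is not None and torch.cuda.is_available()


def iteration_range(i):
    """Return an NVTX range for iteration i inside the window, else a no-op."""
    if HAVE_CUDA and ITER_START <= i <= ITER_STOP:
        return torch.cuda.nvtx.range(f"iteration_{i}")
    return contextlib.nullcontext()


def run(num_iter=NUM_ITER):
    """Run the loop and return the iteration indices that fall in the window."""
    model = data = None
    if HAVE_CUDA:
        # Stand-in for SmallModel from the question (same layer sizes).
        model = nn.Sequential(
            nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 256), nn.ReLU()
        ).cuda().half()
        data = torch.randn(64, 784).cuda().half()

    in_window = []
    no_grad = torch.no_grad() if torch is not None else contextlib.nullcontext()
    with no_grad:
        for i in range(num_iter):
            with iteration_range(i):
                if model is not None:
                    model(data)
                    torch.cuda.synchronize()  # keep the GPU work inside the range
            if ITER_START <= i <= ITER_STOP:
                in_window.append(i)
    return in_window


if __name__ == "__main__":
    print("iterations in window:", run())
```

Running the script under something like nsys profile --trace=cuda,nvtx python your_script.py (the script name is a placeholder) should then show four named ranges, one per iteration in the window. Please check the exact flags against your installed Nsight Systems version.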
