Hi, I want to capture layer-wise information from my GAN application. From that information, I want to visualize the total time taken per layer and the stall breakdown, also per layer.
Here is sample code from my GAN application:
# G(z)
class generator(nn.Module):
    # initializers
    def __init__(self, d=128):
        super(generator, self).__init__()
        self.deconv1 = nn.ConvTranspose2d(100, d*8, 4, 1, 0)
        self.deconv1_bn = nn.BatchNorm2d(d*8)
        self.deconv2 = nn.ConvTranspose2d(d*8, d*4, 4, 2, 1)
        self.deconv2_bn = nn.BatchNorm2d(d*4)
        self.deconv3 = nn.ConvTranspose2d(d*4, d*2, 4, 2, 1)
        self.deconv3_bn = nn.BatchNorm2d(d*2)
        self.deconv4 = nn.ConvTranspose2d(d*2, d, 4, 2, 1)
        self.deconv4_bn = nn.BatchNorm2d(d)
        self.deconv5 = nn.ConvTranspose2d(d, 1, 4, 2, 1)
I have a couple of questions:
- Let's say I want to monitor the deconv1 layer. Should I pass it in the --kernels argument?
What should my nvprof command line look like?
- How can I capture the invocation order, kernel ID, and kernel name through nvprof?
I found the following description of the option:
--kernels <kernel path syntax>
    This option changes the scope of subsequent "--events", "--metrics" options. The syntax is as following:
        <kernel name>
    or
        <context id/name>:<stream id/name>:<kernel name>:<invocation>
    The context/stream IDs, names, kernel name and invocation can be regular expressions. Empty string matches any number or characters. If <context id/name> or <stream id/name> is a positive number, it's strictly matched against the CUDA context/stream ID. Otherwise it's treated as a regular expression and matched against the context/stream name.
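To show how I currently read this syntax, here is a minimal Python sketch that only assembles the filter string and the command line; it does not run nvprof. The kernel-name pattern "gemm", the script name gan.py, and both helper functions are placeholders I made up for illustration. I do not know which kernels deconv1 actually launches, and I am assuming a blank field can simply be left empty because the text says an empty string matches anything.

```python
import shlex


def kernel_filter(kernel="", context="", stream="", invocation=""):
    """Compose <context id/name>:<stream id/name>:<kernel name>:<invocation>.

    Hypothetical helper: every field left empty stays blank, which the
    help text describes as matching anything.
    """
    return ":".join(str(part) for part in (context, stream, kernel, invocation))


def nvprof_command(filter_str, metrics, app):
    """Build an argument list; --kernels precedes the --metrics it scopes."""
    return ["nvprof", "--kernels", filter_str, "--metrics", metrics] + list(app)


# "gemm" and "gan.py" are placeholders, not names taken from my application.
flt = kernel_filter(kernel="gemm", invocation=1)
cmd = nvprof_command(flt, "achieved_occupancy", ["python", "gan.py"])

print(flt)
print(shlex.join(cmd))
```

If this reading is right, the filter above would restrict the metric to the first invocation of any kernel whose name matches the pattern, in any context and any stream.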
Can anyone please tell me what context ID, stream ID, and invocation mean here?