Hi, I want to capture layer-wise information from my GAN application. From that information, I want to visualize the total time taken per layer and the stall breakdown per layer.
Here is sample code from my GAN application:
import torch.nn as nn

# G(z)
class generator(nn.Module):
    # initializers
    def __init__(self, d=128):
        super(generator, self).__init__()
        # ConvTranspose2d args: in_channels, out_channels, kernel_size, stride, padding
        self.deconv1 = nn.ConvTranspose2d(100, d*8, 4, 1, 0)
        self.deconv1_bn = nn.BatchNorm2d(d*8)
        self.deconv2 = nn.ConvTranspose2d(d*8, d*4, 4, 2, 1)
        self.deconv2_bn = nn.BatchNorm2d(d*4)
        self.deconv3 = nn.ConvTranspose2d(d*4, d*2, 4, 2, 1)
        self.deconv3_bn = nn.BatchNorm2d(d*2)
        self.deconv4 = nn.ConvTranspose2d(d*2, d, 4, 2, 1)
        self.deconv4_bn = nn.BatchNorm2d(d)
        self.deconv5 = nn.ConvTranspose2d(d, 1, 4, 2, 1)
I have a couple of questions:
- Let's say I want to monitor the deconv1 layer. Should I pass it to the --kernels argument?
What should my nvprof command line look like?
- How can I capture the invocation order, kernel ID, and kernel name through nvprof?
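For the second question, one starting point I am considering is nvprof's GPU trace mode. The sketch below only assembles the command line. `train_gan.py` and `trace.csv` are placeholder names I made up, and I am assuming that `--print-gpu-trace`, `--csv`, and `--log-file` behave as the nvprof help describes (one row per kernel launch, in chronological order).

```python
import shlex

def gpu_trace_command(script, log_file="trace.csv"):
    """Build an nvprof command that logs every kernel launch in order.

    `script` and `log_file` are hypothetical placeholders for illustration.
    """
    return [
        "nvprof",
        "--print-gpu-trace",   # one row per kernel launch / memcpy, chronological
        "--csv",               # comma-separated output, easier to post-process
        "--log-file", log_file,
        "python", script,
    ]

cmd = gpu_trace_command("train_gan.py")
# Quote each argument so the line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

If that assumption holds, the row order in the log would give the invocation order, and each row would carry the kernel name together with its context and stream.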
Edit:
I found this in the nvprof help text:
--kernels <kernel path syntax>
This option changes the scope of subsequent "--events", "--metrics"
options. The syntax is as following:
<kernel name>
or
<context id/name>:<stream id/name>:<kernel name>:<invocation>
The context/stream IDs, names, kernel name and invocation
can be regular expressions. Empty string matches any number
or characters. If <context id/name> or <stream id/name>
is a positive number, it's strictly matched against the
CUDA context/stream ID. Otherwise it's treated as a regular
expression and matched against the context/stream name
https://helpmanual.io/help/nvprof/
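Going by that help text, my understanding is that a filter could be assembled as below. This is only a sketch: the kernel regex `gemm`, the script name `train_gan.py`, and the choice of stall metrics are my own guesses, not something I have confirmed for the kernels that deconv1 actually launches.

```python
def kernel_filter(kernel, context="", stream="", invocation=""):
    """Return a --kernels value: <context>:<stream>:<kernel>:<invocation>.

    An empty field is left blank, which the help text says matches anything.
    """
    return ":".join(str(part) for part in (context, stream, kernel, invocation))

def metrics_command(script, kernel_spec, metrics):
    """Build an nvprof command; --kernels must precede the --metrics it scopes."""
    return [
        "nvprof",
        "--kernels", kernel_spec,
        "--metrics", ",".join(metrics),
        "python", script,
    ]

# Hypothetical example: first invocation of any kernel whose name matches "gemm",
# in any context and any stream.
spec = kernel_filter("gemm", invocation=1)
print(spec)  # ::gemm:1

cmd = metrics_command(
    "train_gan.py",                                        # placeholder script name
    spec,
    ["stall_memory_dependency", "stall_exec_dependency"],  # assumed metric names
)
print(" ".join(cmd))
```

The metric names should be checked against the output of `nvprof --query-metrics` on the target GPU before relying on them.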
Can anyone please tell me what context ID, stream ID, and invocation mean here?