Hello everyone.
I am trying to use a custom MobileNet-V1 network (the student in a knowledge-distillation setup) as the base network of an SSD detector. In MobileNet-V1 I have implemented the alpha parameter (width multiplier) and set it to 0.25.
The architecture of the model is therefore as follows:
self.model = nn.Sequential(
    conv_bn(3, int(32 * alpha), 2),
    conv_dw(int(32 * alpha), int(64 * alpha), 1),
    conv_dw(int(64 * alpha), int(128 * alpha), 2),
    conv_dw(int(128 * alpha), int(128 * alpha), 1),
    conv_dw(int(128 * alpha), int(256 * alpha), 2),
    conv_dw(int(256 * alpha), int(256 * alpha), 1),
    conv_dw(int(256 * alpha), int(512 * alpha), 2),
    conv_dw(int(512 * alpha), int(512 * alpha), 1),
    conv_dw(int(512 * alpha), int(512 * alpha), 1),
    conv_dw(int(512 * alpha), int(512 * alpha), 1),
    conv_dw(int(512 * alpha), int(512 * alpha), 1),
    conv_dw(int(512 * alpha), int(512 * alpha), 1),
    conv_dw(int(512 * alpha), int(1024 * alpha), 2),
    conv_dw(int(1024 * alpha), int(1024 * alpha), 1),
)
self.fc = nn.Linear(int(1024 * alpha), num_classes)
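To make the effect of alpha on the base network concrete, here is a minimal pure-Python sketch that traces the channel count and spatial size after every layer in the list above. It assumes a 300x300 input (a common SSD input size, not stated in the post) and that conv_bn and conv_dw use a 3x3 kernel with padding 1, as in the usual MobileNet-V1 implementations. The names LAYERS and trace_backbone are hypothetical and only used for illustration.

```python
# (out_channels at alpha=1, stride) for the conv_bn layer and the 13 conv_dw blocks.
LAYERS = [(32, 2), (64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
          (512, 1), (512, 1), (512, 1), (512, 1), (512, 1), (1024, 2), (1024, 1)]


def trace_backbone(alpha, input_size=300):
    """Return a list of (channels, spatial_size) after every layer."""
    size, trace = input_size, []
    for base_channels, stride in LAYERS:
        # Assumed 3x3 kernel with padding 1: out = (in + 2*1 - 3) // stride + 1
        size = (size + 2 - 3) // stride + 1
        trace.append((int(base_channels * alpha), size))
    return trace


for alpha in (1.0, 0.25):
    channels, size = trace_backbone(alpha)[-1]
    print(f"alpha={alpha}: {channels} channels, {size}x{size} feature map")
# alpha=1.0: 1024 channels, 10x10 feature map
# alpha=0.25: 256 channels, 10x10 feature map
```

Under these assumptions, alpha only scales the channel dimension; the spatial size at each layer depends on the input size and the strides, not on alpha.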
To plug this model into an SSD architecture, I scaled the channel counts of the SSD convolutional layers by the same alpha parameter, which gives the following architecture:
extras = ModuleList([
    Sequential(
        Conv2d(in_channels=int(1024*alpha), out_channels=int(256*alpha), kernel_size=1),
        ReLU(),
        Conv2d(in_channels=int(256*alpha), out_channels=int(512*alpha), kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=int(512*alpha), out_channels=int(128*alpha), kernel_size=1),
        ReLU(),
        Conv2d(in_channels=int(128*alpha), out_channels=int(256*alpha), kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=int(256*alpha), out_channels=int(128*alpha), kernel_size=1),
        ReLU(),
        Conv2d(in_channels=int(128*alpha), out_channels=int(256*alpha), kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=int(256*alpha), out_channels=int(128*alpha), kernel_size=1),
        ReLU(),
        Conv2d(in_channels=int(128*alpha), out_channels=int(256*alpha), kernel_size=3, stride=2, padding=1),
        ReLU()
    )
])
regression_headers = ModuleList([
    Conv2d(in_channels=int(512*alpha), out_channels=6 * 4, kernel_size=3, padding=1),
    Conv2d(in_channels=int(1024*alpha), out_channels=6 * 4, kernel_size=3, padding=1),
    Conv2d(in_channels=int(512*alpha), out_channels=6 * 4, kernel_size=3, padding=1),
    Conv2d(in_channels=int(256*alpha), out_channels=6 * 4, kernel_size=3, padding=1),
    Conv2d(in_channels=int(256*alpha), out_channels=6 * 4, kernel_size=3, padding=1),
    Conv2d(in_channels=int(256*alpha), out_channels=6 * 4, kernel_size=3, padding=1),  # TODO: change to kernel_size=1, padding=0?
])
classification_headers = ModuleList([
    Conv2d(in_channels=int(512*alpha), out_channels=6 * num_classes, kernel_size=3, padding=1),
    Conv2d(in_channels=int(1024*alpha), out_channels=6 * num_classes, kernel_size=3, padding=1),
    Conv2d(in_channels=int(512*alpha), out_channels=6 * num_classes, kernel_size=3, padding=1),
    Conv2d(in_channels=int(256*alpha), out_channels=6 * num_classes, kernel_size=3, padding=1),
    Conv2d(in_channels=int(256*alpha), out_channels=6 * num_classes, kernel_size=3, padding=1),
    Conv2d(in_channels=int(256*alpha), out_channels=6 * num_classes, kernel_size=3, padding=1),  # TODO: change to kernel_size=1, padding=0?
])
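A quick way to sanity-check the scaled SSD layers is to compare the channel count of each feature map with the in_channels of the matching header, and to count the resulting boxes. The sketch below does this in plain Python. It rests on several assumptions that are not stated in the post: a 300x300 input, 3x3 stride-2 convolutions with padding 1 in the base network, and the first two headers being fed by the last 512-wide block and the final 1024-wide block of the base network. The helper names conv_out and ssd_feature_maps are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def ssd_feature_maps(alpha, input_size=300):
    """Return [(channels, spatial_size), ...] for the six SSD feature maps."""
    ch = lambda n: int(n * alpha)

    # Assumed taps on the base network: after the last 512-wide block
    # (four stride-2 layers so far) and after the final 1024-wide block (five).
    size = input_size
    for _ in range(4):
        size = conv_out(size, stride=2)
    maps = [(ch(512), size)]
    size = conv_out(size, stride=2)
    maps.append((ch(1024), size))

    # Extras: the 1x1 conv keeps the size, the 3x3 stride-2 conv reduces it.
    for out_channels in (512, 256, 256, 256):
        size = conv_out(size, stride=2)
        maps.append((ch(out_channels), size))
    return maps


alpha = 0.25
maps = ssd_feature_maps(alpha)
# in_channels of the six headers, copied from the header lists above.
header_in = [int(n * alpha) for n in (512, 1024, 512, 256, 256, 256)]

for (channels, size), expected in zip(maps, header_in):
    status = "ok" if channels == expected else "MISMATCH"
    print(f"{size:>2}x{size:<2} map, {channels:>3} channels, header expects {expected:>3}: {status}")

# 6 boxes per location, as implied by out_channels=6 * 4 in the regression headers.
print("total boxes:", sum(6 * size * size for _, size in maps))  # total boxes: 3000
```

Under these assumptions the feature-map sizes (19, 10, 5, 3, 2, 1) and the number of boxes are the same for any alpha, since alpha only changes the channel counts; what has to stay consistent is that each header's in_channels matches the channels of the map it reads.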
My question is: will this cause any problems with the size of the input images fed to the base network, or with the bounding boxes produced by the final SSD model?
Thank you.