I need help improving how well my behavioral cloning model learns.
The loss barely changes during training.
Could you please suggest changes that might improve it?
I am considering adding attention, but I would prefer to hold that back for fine-tuning once the model has learned certain movements.
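On the flat loss mentioned above, one common first check is whether the training loop can memorize a single fixed batch and whether gradients reach every layer. The following is only a minimal sketch of that check, assuming PyTorch. The small `nn.Sequential` model, the `grad_norms` helper, the MSE loss, and the random batch are hypothetical stand-ins and not the actual model or data.

```python
import torch
import torch.nn as nn


def grad_norms(model: nn.Module) -> dict[str, float]:
    """Return the L2 norm of each parameter's gradient after backward()."""
    return {
        name: p.grad.norm().item()
        for name, p in model.named_parameters()
        if p.grad is not None
    }


# Hypothetical stand-in for the real network; any nn.Module can be used here.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 16))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # assumed loss, for illustration only

# One fixed batch reused at every step: a working setup should memorize it.
x, y = torch.randn(4, 8), torch.randn(4, 16)

losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    if step == 0:
        # Record gradient norms once, before the first update.
        first_norms = grad_norms(model)
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
for name, norm in first_norms.items():
    print(f"{name:12s} grad norm = {norm:.3e}")
```

If the loss on a single batch does not drop, or some layers show gradient norms near zero, the problem is more likely in the optimization setup or data pipeline than in missing attention.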
Here is my model architecture:
===================================================================================================================
Layer (type:depth-idx)                  Output Shape              Param #
===================================================================================================================
DataParallel                            [1, 16]                   --
├─CustomModel: 1-1                      [1, 16]                   124,313,708
├─CustomModel: 1-2                      --                        --
│    └─Sequential: 2-1                  [96, 1280, 1, 1]          4,007,548
│    └─Sequential: 2-2                  --                        --
│    │    └─Sequential: 3-1             [96, 1280, 5, 9]          4,007,548
│    │    └─Sequential: 3-2             --                        4,007,548
│    │    └─AdaptiveAvgPool2d: 3-3      [96, 1280, 1, 1]          --
│    └─ConvLSTM2D: 2-3                  [1, 96, 1280, 1, 1]       117,964,800
│    └─ConvLSTM2D: 2-4                  --                        --
│    │    └─ModuleList: 3-4             --                        117,964,800
│    └─Linear: 2-5                      [1, 512]                  655,360
│    └─Dropout: 2-6                     [1, 512]                  --
│    └─Conv2d: 2-7                      [1, 256, 1, 1]            1,179,648
│    └─GroupNorm: 2-8                   [1, 256, 1, 1]            512
│    └─Conv2d: 2-9                      [1, 124, 1, 1]            285,696
│    └─GroupNorm: 2-10                  [1, 124, 1, 1]            248
│    └─Conv2d: 2-11                     [1, 64, 1, 1]             71,424
│    └─GroupNorm: 2-12                  [1, 64, 1, 1]             128
│    └─Conv2d: 2-13                     [1, 124, 1, 1]            71,424
│    └─GroupNorm: 2-14                  [1, 124, 1, 1]            248
│    └─Conv2d: 2-15                     [1, 64, 1, 1]             71,424
│    └─GroupNorm: 2-16                  [1, 64, 1, 1]             128
│    └─Dropout: 2-17                    [1, 64]                   --
│    └─Linear: 2-18                     [1, 64]                   4,096
│    └─Linear: 2-19                     [1, 11]                   704
│    └─Linear: 2-20                     [1, 2]                    128
│    └─Linear: 2-21                     [1, 1]                    64
│    └─Linear: 2-22                     [1, 1]                    64
│    └─Linear: 2-23                     [1, 1]                    64
===================================================================================================================
Total params: 507,363,040
Trainable params: 507,363,040
Non-trainable params: 0
Total mult-adds (Units.GIGABYTES): 44.38
===================================================================================================================
Input size (MB): 48.38
Forward/backward pass size (MB): 8965.26
Params size (MB): 497.25
Estimated Total Size (MB): 9510.90
===================================================================================================================
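
As a sanity check on the summary above, the sketch below re-adds the per-layer counts from the non-recursive rows. The sum matches the 124,313,708 shown for CustomModel 1-1 and, at 4 bytes per float32 parameter, the reported params size of 497.25 MB. The larger "Total params" figure therefore may be counting some rows more than once. The `conv_lstm_params` helper is hypothetical and assumes a single ConvLSTM layer with 1280 input channels, 1280 hidden channels, a 3x3 kernel, and no bias. That configuration is a guess that happens to reproduce the listed count, not something the summary confirms.

```python
# Parameter counts copied from the non-recursive rows of the summary.
layer_params = [
    ("Sequential: 2-1", 4_007_548),
    ("ConvLSTM2D: 2-3", 117_964_800),
    ("Linear: 2-5", 655_360),
    ("Conv2d: 2-7", 1_179_648),
    ("GroupNorm: 2-8", 512),
    ("Conv2d: 2-9", 285_696),
    ("GroupNorm: 2-10", 248),
    ("Conv2d: 2-11", 71_424),
    ("GroupNorm: 2-12", 128),
    ("Conv2d: 2-13", 71_424),
    ("GroupNorm: 2-14", 248),
    ("Conv2d: 2-15", 71_424),
    ("GroupNorm: 2-16", 128),
    ("Linear: 2-18", 4_096),
    ("Linear: 2-19", 704),
    ("Linear: 2-20", 128),
    ("Linear: 2-21", 64),
    ("Linear: 2-22", 64),
    ("Linear: 2-23", 64),
]

unique_total = sum(count for _, count in layer_params)
size_mb = unique_total * 4 / 1e6  # 4 bytes per float32 parameter


def conv_lstm_params(in_ch: int, hidden_ch: int, kernel: int, bias: bool = False) -> int:
    """Parameters of one ConvLSTM cell: a conv from [input, hidden] to 4 gates."""
    gates = 4
    weights = gates * hidden_ch * (in_ch + hidden_ch) * kernel * kernel
    return weights + (gates * hidden_ch if bias else 0)


# Assumed configuration (not confirmed by the summary).
assumed_conv_lstm = conv_lstm_params(in_ch=1280, hidden_ch=1280, kernel=3)
conv_lstm_share = 117_964_800 / unique_total

print(f"unique parameters: {unique_total:,}")
print(f"float32 size (MB): {size_mb:.2f}")
print(f"assumed ConvLSTM parameters: {assumed_conv_lstm:,}")
print(f"ConvLSTM share of all parameters: {conv_lstm_share:.1%}")
```

Under these numbers the ConvLSTM holds roughly 95% of the weights while operating on a 1x1 spatial map, so it is a natural first place to look when simplifying the model.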