DS4.0 TX2 yoloV3: Number of unused weights left: 13824

Hello,
I'm using DS4.0 to run SlimYOLOv3.

The engine build aborts with the error "Number of unused weights left : 13824".

What causes this error, and how can I solve it?

(107) conv-linear     157 x  38 x  38      45 x  38 x  38    11031021
(108) yolo             45 x  38 x  38      45 x  38 x  38    11031021
(109) route                  -             23 x  38 x  38    11031021
(110) conv-bn-leaky    23 x  38 x  38      50 x  38 x  38    11032371
(111) upsample         50 x  38 x  38      50 x  76 x  76        - 
(112) route                  -            294 x  76 x  76    11032371
(113) conv-bn-leaky   294 x  76 x  76      33 x  76 x  76    11042205
(114) conv-bn-leaky    33 x  76 x  76      57 x  76 x  76    11059362
(115) conv-bn-leaky    57 x  76 x  76      15 x  76 x  76    11060277
(116) maxpool          15 x  76 x  76      15 x  76 x  76    11060277
(117) route                  -             15 x  76 x  76    11060277
(118) maxpool          15 x  76 x  76      15 x  76 x  76    11060277
(119) route                  -             15 x  76 x  76    11060277
(120) maxpool          15 x  76 x  76      15 x  76 x  76    11060277
(121) route                  -             30 x  76 x  76    11060277
(122) conv-bn-leaky    30 x  76 x  76      44 x  76 x  76    11072333
(123) conv-bn-leaky    44 x  76 x  76      41 x  76 x  76    11074301
(124) conv-bn-leaky    41 x  76 x  76      63 x  76 x  76    11097800
(125) conv-linear      63 x  76 x  76      45 x  76 x  76    11100680
(126) yolo             45 x  76 x  76      45 x  76 x  76    11100680
Number of unused weights left : 13824
deepstream-app: yolo.cpp:361: nvinfer1::INetworkDefinition* Yolo::createYoloNetwork(std::vector<float>&, std::vector<nvinfer1::Weights>&): Assertion `0' failed.
Aborted

The .cfg file:

[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width=608
height=608
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 60200
policy=steps
steps=35000,50000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=9
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=50
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=25
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=50
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=117
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=117
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=40
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=117
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=244
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=43
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=71
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=74
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=63
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=48
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=56
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=60
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=43
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=244
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=457
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=89
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=74
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=69
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=87
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=72
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=63
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=38
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=65
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=457
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=864
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=91
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=864
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=53
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=864
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=52
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=864
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=63
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=864
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=11
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=33
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=12
size=1
stride=1
pad=1
activation=leaky

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

[convolutional]
batch_normalize=1
filters=18
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=20
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=12
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=120
size=3
stride=1
pad=1
activation=leaky

[convolutional]
filters=45
size=1
stride=1
pad=1
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes = 10
num = 9
jitter = .3
ignore_thresh = .7
truth_thresh = 1
random = 1

[route]
layers=-4

[convolutional]
batch_normalize=1
filters=72
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers=-1, 61

[convolutional]
batch_normalize=1
filters=43
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=18
size=3
stride=1
pad=1
activation=leaky

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

[convolutional]
batch_normalize=1
filters=42
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=54
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=23
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=157
size=3
stride=1
pad=1
activation=leaky

[convolutional]
filters=45
size=1
stride=1
pad=1
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes = 10
num = 9
jitter = .3
ignore_thresh = .7
truth_thresh = 1
random = 1

[route]
layers=-4

[convolutional]
batch_normalize=1
filters=50
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers=-1, 36

[convolutional]
batch_normalize=1
filters=33
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=57
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=15
size=1
stride=1
pad=1
activation=leaky

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

[convolutional]
batch_normalize=1
filters=44
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=41
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=63
size=3
stride=1
pad=1
activation=leaky

[convolutional]
filters=45
size=1
stride=1
pad=1
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes = 10
num = 9
jitter = .3
ignore_thresh = .7
truth_thresh = 1
random = 1

I think it's caused by "[route] layers=-1,-3,-5,-6". yolo.cpp only supports two values after [route], but this config has four values ("-1,-3,-5,-6"), so the route layer output is calculated incorrectly.

For example:
(84) route, layers=-1,-3,-5,-6
So route (84) should concatenate layer 83 + layer 81 + layer 79 + layer 78 = 48 x 19 x 19, but the actual output is 24 x 19 x 19.
For details, see the code below: it only calculates idx1 and idx2. I think this is the issue. How can it be solved?

(67)  conv-bn-leaky   864 x  19 x  19      53 x  19 x  19    9483089
(68)  conv-bn-leaky    53 x  19 x  19     864 x  19 x  19    9898673
(69)  skip            864 x  19 x  19     864 x  19 x  19        - 
(70)  conv-bn-leaky   864 x  19 x  19      52 x  19 x  19    9943809
(71)  conv-bn-leaky    52 x  19 x  19     864 x  19 x  19    10351617
(72)  skip            864 x  19 x  19     864 x  19 x  19        - 
(73)  conv-bn-leaky   864 x  19 x  19      63 x  19 x  19    10406301
(74)  conv-bn-leaky    63 x  19 x  19     864 x  19 x  19    10899645
(75)  skip            864 x  19 x  19     864 x  19 x  19        - 
(76)  conv-bn-leaky   864 x  19 x  19      11 x  19 x  19    10909193
(77)  conv-bn-leaky    11 x  19 x  19      33 x  19 x  19    10912592
(78)  conv-bn-leaky    33 x  19 x  19      12 x  19 x  19    10913036
(79)  maxpool          12 x  19 x  19      12 x  19 x  19    10913036
(80)  route                  -             12 x  19 x  19    10913036
(81)  maxpool          12 x  19 x  19      12 x  19 x  19    10913036
(82)  route                  -             12 x  19 x  19    10913036
(83)  maxpool          12 x  19 x  19      12 x  19 x  19    10913036
(84)  route                  -             24 x  19 x  19    10913036
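As a sanity check on the numbers in this dump (a minimal standalone sketch; the helper below is illustrative, not DeepStream code): each of the four inputs to route (84), layers 78, 79, 81 and 83, is 12 x 19 x 19, so a correct concatenation yields 48 channels, while concatenating only the first two yields the 24 seen above.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Illustrative only: a Darknet [route] concatenates its inputs along the
// channel axis, so the output depth is simply the sum of the input depths.
int routeOutputChannels(const std::vector<int>& inputChannels)
{
    return std::accumulate(inputChannels.begin(), inputChannels.end(), 0);
}
```

With the four 12-channel inputs this gives 48; with only the first two it gives 24, matching the wrong volume printed for layer (84).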
// route layers (single or concat)
        else if (m_configBlocks.at(i).at("type") == "route")
        {
            size_t found = m_configBlocks.at(i).at("layers").find(",");
            if (found != std::string::npos)
            {
                int idx1 = std::stoi(trim(m_configBlocks.at(i).at("layers").substr(0, found)));
                int idx2 = std::stoi(trim(m_configBlocks.at(i).at("layers").substr(found + 1)));
                if (idx1 < 0)
                {
                    idx1 = tensorOutputs.size() + idx1;
                }
                if (idx2 < 0)
                {
                    idx2 = tensorOutputs.size() + idx2;
                }
                assert(idx1 < static_cast<int>(tensorOutputs.size()) && idx1 >= 0);
                assert(idx2 < static_cast<int>(tensorOutputs.size()) && idx2 >= 0);
                nvinfer1::ITensor** concatInputs
                    = reinterpret_cast<nvinfer1::ITensor**>(malloc(sizeof(nvinfer1::ITensor*) * 2));
                concatInputs[0] = tensorOutputs[idx1];
                concatInputs[1] = tensorOutputs[idx2];
                nvinfer1::IConcatenationLayer* concat
                    = network->addConcatenation(concatInputs, 2);
                assert(concat != nullptr);
                std::string concatLayerName = "route_" + std::to_string(i - 1);
                concat->setName(concatLayerName.c_str());
                // concatenate along the channel dimension
                concat->setAxis(0);
                previous = concat->getOutput(0);
                assert(previous != nullptr);
                std::string outputVol = dimsToString(previous->getDimensions());
                // set the output volume depth
                channels
                    = getNumChannels(tensorOutputs[idx1]) + getNumChannels(tensorOutputs[idx2]);
                tensorOutputs.push_back(concat->getOutput(0));
                printLayerInfo(layerIndex, "route", "        -", outputVol,
                               std::to_string(weightPtr));
            }
            else
            {
                int idx = std::stoi(trim(m_configBlocks.at(i).at("layers")));
                if (idx < 0)
                {
                    idx = tensorOutputs.size() + idx;
                }
                assert(idx < static_cast<int>(tensorOutputs.size()) && idx >= 0);
                previous = tensorOutputs[idx];
                assert(previous != nullptr);
                std::string outputVol = dimsToString(previous->getDimensions());
                // set the output volume depth
                channels = getNumChannels(tensorOutputs[idx]);
                tensorOutputs.push_back(tensorOutputs[idx]);
                printLayerInfo(layerIndex, "route", "        -", outputVol,
                               std::to_string(weightPtr));
            }
        }
Yes, DS 4.0 only supports yolov2, yolov2-tiny, yolov3 and yolov3-tiny. You can modify the source code to concatenate all four layers as required by the SlimYOLOv3 model. The "if" block currently concatenates 2 tensors; you will have to modify it to concatenate 4 tensors.
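The required parsing can be sketched as a standalone helper, shown here outside of yolo.cpp for clarity (parseRouteLayers and its signature are illustrative, not part of the DeepStream sources): it splits the Darknet "layers=" value on commas and resolves negative offsets against the number of tensors emitted so far, exactly as the existing two-index code does.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Illustrative helper (not from yolo.cpp): split a Darknet "layers=" value
// such as "-1,-3,-5,-6" into integer indices and resolve negative (relative)
// values against numTensors, the number of layer outputs emitted so far.
std::vector<int> parseRouteLayers(const std::string& layers, int numTensors)
{
    std::vector<int> indices;
    std::stringstream ss(layers);
    std::string token;
    while (std::getline(ss, token, ','))
    {
        int idx = std::stoi(token);   // std::stoi skips leading whitespace
        if (idx < 0)
        {
            idx += numTensors;        // e.g. -1 -> the last emitted tensor
        }
        assert(idx >= 0 && idx < numTensors);
        indices.push_back(idx);
    }
    return indices;
}
```

For route (84) above, parseRouteLayers("-1,-3,-5,-6", 84) yields indices 83, 81, 79 and 78; the route's output depth is then the sum of those four layers' channels.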

Hi,
I have modified the code to concatenate 4 tensors. What do "previous = concat->getOutput(0)" and "tensorOutputs.push_back(concat->getOutput(0))" mean? The index defaults to 0; do I need to change it?

Below is my code. Can you help me review it?

        else if (m_configBlocks.at(i).at("type") == "route")
        {
            int cont = 0;
            int idx[6];
            int lac[6];
            size_t found = m_configBlocks.at(i).at("layers").find(",");
            if (found != std::string::npos)
            {
                // parse every comma separated layer index into idx[]
                size_t pos;
                int size = m_configBlocks.at(i).at("layers").size();
                for (int j = 0; j < size; j++)
                {
                    pos = m_configBlocks.at(i).at("layers").find(",", j);
                    if (pos != std::string::npos)
                    {
                        if (cont == 0)
                        {
                            lac[cont] = 0;
                        }
                        idx[cont] = std::stoi(trim(
                            m_configBlocks.at(i).at("layers").substr(lac[cont], pos - lac[cont])));
                        cont = cont + 1;
                        lac[cont] = pos + 1;
                        j = pos + 1;
                    }
                    else
                    {
                        // last index: no trailing comma
                        idx[cont] = std::stoi(trim(
                            m_configBlocks.at(i).at("layers").substr(lac[cont])));
                        cont = cont + 1;
                        break;
                    }
                }
                // convert negative (relative) indices to absolute ones
                for (int j = 0; j < cont; j++)
                {
                    if (idx[j] < 0)
                    {
                        idx[j] = tensorOutputs.size() + idx[j];
                    }
                    assert(idx[j] < static_cast<int>(tensorOutputs.size()) && idx[j] >= 0);
                }
                // concatenate all referenced tensors
                nvinfer1::ITensor** concatInputs
                    = reinterpret_cast<nvinfer1::ITensor**>(malloc(sizeof(nvinfer1::ITensor*) * cont));
                for (int j = 0; j < cont; j++)
                {
                    concatInputs[j] = tensorOutputs[idx[j]];
                }
                nvinfer1::IConcatenationLayer* concat
                    = network->addConcatenation(concatInputs, cont);
                assert(concat != nullptr);
                std::string concatLayerName = "route_" + std::to_string(i - 1);
                concat->setName(concatLayerName.c_str());
                // concatenate along the channel dimension
                concat->setAxis(0);
                previous = concat->getOutput(0);
                assert(previous != nullptr);
                std::string outputVol = dimsToString(previous->getDimensions());
                // set the output volume depth: sum the channels of every input
                channels = getNumChannels(tensorOutputs[idx[0]]);
                for (int j = 1; j < cont; j++)
                {
                    channels = channels + getNumChannels(tensorOutputs[idx[j]]);
                }
                tensorOutputs.push_back(concat->getOutput(0));
                printLayerInfo(layerIndex, "route", "        -", outputVol,
                               std::to_string(weightPtr));
            }
        }

thread continued here - https://devtalk.nvidia.com/default/topic/1063207/deepstream-sdk/ds4-0-tx2-precision-detection-decreased-a-lot/