TX2 DS4.0 loadWeights

Hello,
I found the comment "Remove 5 int32 bytes" in the loadWeights code. Why does it skip 5 int32 values for yolov3? If I use the yolov3-spp network, what should the value be?

std::vector<float> loadWeights(const std::string weightsFilePath, const std::string& networkType)
{
    assert(fileExists(weightsFilePath));
    std::cout << "Loading pre-trained weights..." << std::endl;
    std::ifstream file(weightsFilePath, std::ios_base::binary);
    assert(file.good());
    std::string line;

    if (networkType == "yolov2")
    {
        // Remove 4 int32 bytes of data from the stream belonging to the header
        file.ignore(4 * 4);
    }
    else if ((networkType == "yolov3") || (networkType == "yolov3-tiny")
             || (networkType == "yolov2-tiny"))
    {
        // Remove 5 int32 bytes of data from the stream belonging to the header
        file.ignore(4 * 5);
    }
    else
    {
        std::cout << "Invalid network type" << std::endl;
        assert(0);
    }

I now understand the reason: it comes from the size of the header in the weights file.

But when I tested with yolo-spp3, the detection precision dropped a lot.
This confuses me. It looks related to how the SPP module is parsed, because that is the only difference compared with yolov3. Do you know the possible reason? Or could you give me a reference case for yolo-spp?

### SPP ###

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###
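
For reference, here is a minimal sketch of the shapes this block should produce. It assumes darknet's default max-pool padding (padding = size - 1, so a stride-1 pool keeps the spatial size) and a 512 x 19 x 19 input, which is what a 608 x 608 network gives at stride 32. The names `Shape`, `pooledDim`, `maxpool`, `routeConcat` and `sppOutput` are made up for illustration and are not DeepStream or TensorRT APIs. If a parser computes a different size for the stride-1 pools, the final route cannot line up the way darknet does.

```cpp
#include <vector>

// Hypothetical helper type: channels, height, width of a feature map.
struct Shape { int c, h, w; };

// darknet-style max-pool output size, assuming default padding = size - 1.
int pooledDim(int in, int size, int stride)
{
    const int padding = size - 1;
    return (in + padding - size) / stride + 1;
}

// Max pooling keeps the channel count and resizes height and width.
Shape maxpool(const Shape& in, int size, int stride)
{
    return {in.c, pooledDim(in.h, size, stride), pooledDim(in.w, size, stride)};
}

// [route] with several layers concatenates along channels.
// Returns false when the spatial sizes do not match.
bool routeConcat(const std::vector<Shape>& inputs, Shape& out)
{
    if (inputs.empty()) return false;
    out = {0, inputs[0].h, inputs[0].w};
    for (const Shape& s : inputs) {
        if (s.h != out.h || s.w != out.w) return false;
        out.c += s.c;
    }
    return true;
}

// The SPP block: three stride-1 pools of the same input, then
// layers=-1,-3,-5,-6 -> maxpool 13, maxpool 9, maxpool 5, block input.
bool sppOutput(const Shape& in, Shape& out)
{
    const Shape p5  = maxpool(in, 5, 1);
    const Shape p9  = maxpool(in, 9, 1);
    const Shape p13 = maxpool(in, 13, 1);
    return routeConcat({p13, p9, p5, in}, out);
}
```

Under these assumptions the block turns 512 x 19 x 19 into 2048 x 19 x 19, so the convolution that follows it must read 2048 input channels when its weights are loaded.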

.cfg file:

[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width=608
height=608
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 120200
policy=steps
steps=70000,100000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

######################

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky


[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky


[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear


[yolo]
mask = 6,7,8
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1


[route]
layers = -4

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 61



[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###


[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear


[yolo]
mask = 3,4,5
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1



[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 36



[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6

### End SPP ###

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear


[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

Hi,

The skipped bytes are the header data at the start of the weights file.

Sorry, yolo-spp3 is not in our supported model list yet.
We recommend first checking the actual header size of your model's weights file and updating the skip value accordingly.

Thanks.