Weird xml issue with train_ssd jetson detection trainer

I recently generated a new data set but have a strange issue.

Only some of the xml files throw errors, and I don’t know how many yet.

The weird thing is all I have to do to fix it is open the xml in notepad++ and save and that is it!

Unfortunately there are 14k files and I have no idea how many I have to do this to; so far I have done 100 lol

But basically I am just running it, waiting for error, open, save, close, run, repeat!

I would love it if someone could tell me what the difference is between these two xml files. I could then fix this en masse with a script…

first xml produces the following error:

Traceback (most recent call last):
          File "train_ssd.py", line 368, in <module>
            device=DEVICE, debug_steps=args.debug_steps, epoch=epoch)
          File "train_ssd.py", line 132, in train
            for i, data in enumerate(loader):
          File "C:\Users\muayt\.virtualenvs\jetson_dev-V1CBiXeI\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
            data = self._next_data()
          File "C:\Users\muayt\.virtualenvs\jetson_dev-V1CBiXeI\lib\site-packages\torch\utils\data\dataloader.py", line 557, in _next_data
            data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
          File "C:\Users\muayt\.virtualenvs\jetson_dev-V1CBiXeI\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
            data = [self.dataset[idx] for idx in possibly_batched_index]
          File "C:\Users\muayt\.virtualenvs\jetson_dev-V1CBiXeI\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
            data = [self.dataset[idx] for idx in possibly_batched_index]
          File "C:\Users\muayt\.virtualenvs\jetson_dev-V1CBiXeI\lib\site-packages\torch\utils\data\dataset.py", line 219, in __getitem__
            return self.datasets[dataset_idx][sample_idx]
          File "F:\Work\Highlight\TF\jetson_dev\jetson-inference-master\python\training\detection\vision\datasets\voc_dataset.py", line 58, in __getitem__
            boxes, labels, is_difficult = self._get_annotation(image_id)
          File "F:\Work\Highlight\TF\jetson_dev\jetson-inference-master\python\training\detection\vision\datasets\voc_dataset.py", line 118, in _get_annotation
            class_name = object.find('name').text.strip() #.lower().strip()
        AttributeError: 'NoneType' object has no attribute 'strip'

second xml file loads fine…

file1 (error): https://file.io/FvFbS4CkXtjb
file 2 (no error): https://file.io/dea0jt3fwCxR

thanks :)

Hi,

The error occurs because object.find('name').text is None: the <name> element is found inside the object, but the parser sees no text in it.
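
To find out how many of the 14k files are affected without waiting for the trainer to crash on each one, something like the following might help. It is only a rough sketch: it assumes the annotations are Pascal VOC style xml files sitting in a single folder, and the function names (objects_with_empty_name, find_bad_annotations) are made up for this example, not part of train_ssd.py.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def objects_with_empty_name(xml_path):
    """Count <object> elements whose <name> is missing or has no text."""
    root = ET.parse(str(xml_path)).getroot()
    bad = 0
    for obj in root.findall("object"):
        name = obj.find("name")
        # .text is None for <name></name> or <name/>, which is the case
        # that makes .text.strip() raise AttributeError in the trainer.
        if name is None or name.text is None or not name.text.strip():
            bad += 1
    return bad


def find_bad_annotations(annotations_dir):
    """Return the xml files that would likely trip the trainer, sorted by name."""
    bad_files = []
    for xml_path in sorted(Path(annotations_dir).glob("*.xml")):
        try:
            if objects_with_empty_name(xml_path) > 0:
                bad_files.append(xml_path)
        except ET.ParseError:
            # Files that do not parse at all are reported as well.
            bad_files.append(xml_path)
    return bad_files


# Example (the folder name is an assumption):
# for path in find_bad_annotations("Annotations"):
#     print(path)
```

Running this before and after the open-and-save step in notepad++ should show whether saving really changes what the parser sees.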

We checked the two files you uploaded and didn’t find any difference.
Based on your explanation, would you mind checking whether the file permissions change after saving with notepad++?

Thanks.
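
If two files look identical in an editor, comparing them at the byte level can show differences that are otherwise invisible, such as a byte order mark or different line endings. Below is a small sketch of that idea; describe_bytes is a hypothetical helper name, and the file names in the usage comment are placeholders.

```python
from pathlib import Path


def describe_bytes(path):
    """Summarise byte-level properties that editors usually hide."""
    data = Path(path).read_bytes()
    crlf = data.count(b"\r\n")
    return {
        "size": len(data),
        # A UTF-8 byte order mark is the three bytes EF BB BF at the start.
        "utf8_bom": data.startswith(b"\xef\xbb\xbf"),
        "crlf": crlf,
        # Line feeds and carriage returns that are not part of a CR LF pair.
        "lf_only": data.count(b"\n") - crlf,
        "cr_only": data.count(b"\r") - crlf,
    }


# Example (placeholder file names):
# print(describe_bytes("file1.xml"))
# print(describe_bytes("file2.xml"))
```

Comparing the two dictionaries for a failing file and a working one would show whether size, BOM, or line endings differ.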

Also, since you mention notepad++ and since notepad++ is a windows program, it makes me think of the line endings. You can check if the line endings are in Unix format. In notepad++, this is in the Edit->EOL Conversion menu:

  • Windows (CR LF)
  • Unix (LF)
  • Mac (CR)

To script this in Ubuntu, you can install the dos2unix utility with sudo apt-get install dos2unix and then run dos2unix on the xml files.
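
Since the traceback shows the training running on Windows, a small Python script is another way to do the same conversion across the whole dataset. This is a minimal sketch of roughly what dos2unix does in its default mode (CR LF to LF only), assuming all annotation files are in one folder; the function names are invented for the example. It rewrites files in place, so it is worth trying on a copy first.

```python
from pathlib import Path


def convert_crlf_to_lf(path):
    """Rewrite one file with CR LF replaced by LF. Return True if it changed."""
    path = Path(path)
    data = path.read_bytes()
    converted = data.replace(b"\r\n", b"\n")
    if converted != data:
        path.write_bytes(converted)
        return True
    return False


def convert_directory(annotations_dir):
    """Convert every .xml file in the folder and return how many were changed."""
    changed = 0
    for xml_path in Path(annotations_dir).glob("*.xml"):
        if convert_crlf_to_lf(xml_path):
            changed += 1
    return changed


# Example (the folder name is an assumption):
# print(convert_directory("Annotations"), "files converted")
```

Working on bytes rather than text avoids touching the encoding of the files, so only the line endings change.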