Label Documentation for Clara CXR Classification Model

Our team is currently exploring the model that is provided in NGC here:

We consulted some radiologists on our team, and they are still unsure about the meaning of each of the labels used by the model:

‘Nodule’, ‘Mass’, ‘Distortion of Pulmonary Architecture’, ‘Pleural Based Mass’, ‘Granuloma’, ‘Fluid in Pleural Space’, ‘Right Hilar Abnormality’, ‘Left Hilar Abnormality’, ‘Major Atelectasis’, ‘Infiltrate’, ‘Scarring’, ‘Pleural Fibrosis’, ‘Bone/Soft Tissue Lesion’, ‘Cardiac Abnormality’, ‘COPD’

And so my questions are:

(1) Is there any existing documentation that explains what each label means, medically? Terms like “Distortion of Pulmonary Architecture” can be interpreted quite differently from one radiologist to another. This leads to the next question:

(2) Are there any standards for how the data was labeled? Are there guidelines that define what kind of CXR image qualifies for each of the labels listed above?

We have not been able to find documentation for the dataset on the PLCO website. If such documentation exists, a pointer to it would be very helpful.

The label categories are defined according to the data dictionary posted on the PLCO website, with duplicated concepts merged into a single label.
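As a rough illustration of what "merging duplicated concepts" could look like, here is a minimal sketch. It assumes that several raw data-dictionary fields describing the same finding are combined with a logical OR into one binary label. The raw field names (`raw_nodule_a`, `raw_nodule_b`, `raw_mass`) are hypothetical placeholders, not actual PLCO field names, and this is not necessarily how the training labels were built.

```python
# Hypothetical mapping from a merged label to the raw fields it combines.
# The raw field names are placeholders, NOT real PLCO data dictionary entries.
MERGE_MAP = {
    "Nodule": ["raw_nodule_a", "raw_nodule_b"],
    "Mass": ["raw_mass"],
}


def merge_labels(raw_record, merge_map=MERGE_MAP):
    """Return one binary label per merged category.

    A merged label is 1 if any of its raw fields is truthy (logical OR);
    fields missing from the record are treated as 0.
    """
    return {
        label: int(any(raw_record.get(field, 0) for field in fields))
        for label, fields in merge_map.items()
    }


# Example exam record with two raw fields that describe the same concept.
example = {"raw_nodule_a": 0, "raw_nodule_b": 1, "raw_mass": 0}
merged = merge_labels(example)
```

With the real data dictionary, only the contents of `MERGE_MAP` would need to change; the actual grouping would have to come from the PLCO documentation.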

The “medical documentation” you mention should be part of the PLCO screening trial documents (a guide for radiologists on how to fill in the exam form). Unfortunately, it cannot be found on the PLCO website. I suggest contacting the PLCO organizers for more details.