by Onno M. Mets, Ewoud J. Smit, Firdaus A. A. Mohamed Hoesein, Hester A. Gietema, Reinoud P. H. Bokkers, Mohamed Attrach, Saskia van Amelsvoort-van de Vorst, Ernst Th Scholten, Constantinus F. M. Buckens, Matthijs Oudkerk, Jan-Willem J. Lammers, Mathias Prokop, Pim A. de Jong
Incidental CT findings may provide an opportunity for early detection of chronic obstructive pulmonary disease (COPD), which may prove important in the setting of CT-based lung cancer screening. We aimed to determine the diagnostic performance of human observers in visually evaluating COPD presence on CT images, compared with automated evaluation using quantitative CT measures.
Methods
This study was approved by the Dutch Ministry of Health and the institutional review board. All participants provided written informed consent. We studied 266 heavy smokers enrolled in a lung cancer screening trial. All subjects underwent volumetric inspiratory and expiratory chest computed tomography (CT). Pulmonary function testing was used as the reference standard for COPD. We evaluated the diagnostic performance of eight observers and one automated model based on quantitative CT measures.
Results
The prevalence of COPD in the study population was 44% (118/266), of whom 62% (73/118) had mild disease. The diagnostic accuracy was 74.1% in the automated evaluation, and ranged between 58.3% and 74.3% for the visual evaluation of CT images. The positive predictive value was 74.3% in the automated evaluation, and ranged between 52.9% and 74.7% for the visual evaluation. Interobserver variation was substantial, even within the subgroup of experienced observers. Agreement within observers yielded kappa values between 0.28 and 0.68, regardless of the level of expertise. The agreement between the observers and the automated CT model showed kappa values of 0.12–0.35.
Conclusions
Visual evaluation of COPD presence on chest CT images provides at best modest accuracy and is associated with substantial interobserver variation. Automated evaluation of COPD subjects using quantitative CT measures appears superior to visual evaluation by human observers.
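
For readers less familiar with the measures reported in the Results, the following is a minimal sketch of how diagnostic accuracy, positive predictive value, and Cohen's kappa can be computed from 2x2 counts. It is not the authors' analysis code. The function names and the example counts are hypothetical and chosen only for illustration; they are not data from this study. Only the prevalence figures (118/266 and 73/118) are taken from the text above.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of subjects classified correctly against the reference standard."""
    return (tp + tn) / (tp + fp + fn + tn)


def positive_predictive_value(tp, fp):
    """Fraction of positive calls that are truly positive."""
    return tp / (tp + fp)


def cohen_kappa(both_pos, only_first_pos, only_second_pos, both_neg):
    """Cohen's kappa for two raters (or two reads) with binary ratings."""
    n = both_pos + only_first_pos + only_second_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement from each rater's marginal positive and negative rates.
    first_pos = both_pos + only_first_pos
    second_pos = both_pos + only_second_pos
    expected = (first_pos * second_pos + (n - first_pos) * (n - second_pos)) / n**2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Prevalence figures as reported in the Results.
    print(f"COPD prevalence: {118 / 266:.0%}")
    print(f"Mild disease among COPD subjects: {73 / 118:.0%}")

    # Hypothetical counts, NOT study data: one reader versus the reference standard.
    tp, fp, fn, tn = 40, 10, 20, 30
    print(f"accuracy = {accuracy(tp, fp, fn, tn):.2f}")
    print(f"PPV      = {positive_predictive_value(tp, fp):.2f}")

    # Hypothetical agreement table, NOT study data: two readers on the same scans.
    print(f"kappa    = {cohen_kappa(40, 10, 20, 30):.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two readers can agree on a majority of scans and still yield a modest kappa.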