Abstract
Musculoskeletal disorders are a major healthcare challenge around the world. We investigate the utility of convolutional neural networks (CNNs) in performing generalized abnormality detection on lower extremity radiographs. We also explore the effect of pretraining, dataset size and model architecture on model performance to provide recommendations for future deep learning analyses on extremity radiographs, especially when access to large datasets is challenging. We collected a large dataset of 93,455 lower extremity radiographs of multiple body parts, with each exam labelled as normal or abnormal. A 161-layer densely connected, pretrained CNN achieved an AUC-ROC of 0.880 (sensitivity = 0.714, specificity = 0.961) on this abnormality classification task. Our findings show that a single CNN model can be effectively utilized for the identification of diverse abnormalities in highly variable radiographs of multiple body parts, a result that holds potential for improving patient triage and assisting with diagnostics in resource-limited settings.
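The 161-layer densely connected network referred to above corresponds to the DenseNet-161 architecture pretrained on ImageNet. The following is a minimal sketch, not the authors' released training framework, of how such a model can be fine-tuned for binary normal/abnormal classification in PyTorch; the directory layout, preprocessing and hyperparameters are illustrative assumptions.

```python
# Minimal sketch: fine-tuning an ImageNet-pretrained DenseNet-161 as a binary
# normal/abnormal classifier. Paths and hyperparameters are assumptions, not
# values from the paper or its released code.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

def build_model():
    # Load DenseNet-161 with ImageNet weights and replace the classifier head
    # with a single logit for the binary abnormality task.
    model = models.densenet161(pretrained=True)
    model.classifier = nn.Linear(model.classifier.in_features, 1)
    return model

def train_one_epoch(model, loader, optimizer, device):
    criterion = nn.BCEWithLogitsLoss()
    model.train()
    for images, labels in loader:
        images = images.to(device)
        labels = labels.float().unsqueeze(1).to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # ImageNet-style preprocessing; single-channel radiographs are replicated
    # to three channels. Assumed layout: train/abnormal and train/normal
    # (ImageFolder assigns class indices alphabetically: abnormal=0, normal=1).
    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.Grayscale(num_output_channels=3),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    train_set = datasets.ImageFolder("train", transform=preprocess)
    loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)
    model = build_model().to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    train_one_epoch(model, loader, optimizer, device)
```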
Data availability
We are releasing our de-identified test set as part of this manuscript. This dataset includes radiographs from 182 patients and is balanced across normal and abnormal labels as well as across the four lower extremity body parts (foot, hip, knee and ankle). In addition, two board-certified radiologists manually refined all labels, ensuring a high level of label accuracy. The dataset is available at https://aimi.stanford.edu/lera-lower-extremity-radiographs-2.
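Below is a minimal sketch of how the metrics reported in the abstract (AUC-ROC, sensitivity and specificity) can be computed from model predictions on a held-out set such as this one. The example arrays and the operating threshold are placeholders, not values derived from the released dataset.

```python
# Minimal sketch: computing AUC-ROC, sensitivity and specificity from
# predicted abnormality probabilities. The label/probability arrays below are
# illustrative placeholders.
import numpy as np
from sklearn.metrics import roc_auc_score

def sensitivity_specificity(labels, probs, threshold=0.5):
    # labels: 1 = abnormal, 0 = normal; probs: predicted probability of "abnormal".
    preds = (probs >= threshold).astype(int)
    tp = np.sum((preds == 1) & (labels == 1))
    tn = np.sum((preds == 0) & (labels == 0))
    fp = np.sum((preds == 1) & (labels == 0))
    fn = np.sum((preds == 0) & (labels == 1))
    return tp / (tp + fn), tn / (tn + fp)

if __name__ == "__main__":
    labels = np.array([1, 0, 1, 1, 0, 0])
    probs = np.array([0.92, 0.10, 0.67, 0.40, 0.35, 0.05])
    auc = roc_auc_score(labels, probs)
    sens, spec = sensitivity_specificity(labels, probs)
    print(f"AUC-ROC={auc:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```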
Code availability
Our deep learning training framework is available at: https://github.com/maya124/MSK-LE.
Acknowledgements
This study was supported by the Stanford Center for Artificial Intelligence in Medicine and Imaging (AIMI). The research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under award no. R01LM012966 and by the Stanford Child Health Research Institute (Stanford NIH-NCATS-CTSA grant #UL1 TR001085). This research used data or services provided by STARR (STAnford medicine Research data Repository), a clinical data warehouse made possible by the Stanford School of Medicine Research Office.
Author information
Contributions
All authors contributed extensively to this work. M.V., M.L. and R.G. designed the methodology and algorithms, implemented models, analysed results and wrote the manuscript. B.N.P. and M.P.L. oversaw the entire project and helped with study design, methodology development and manuscript writing. N.K. and P.R. provided technical advice and manuscript feedback. J.D. and J.L. contributed to statistical analyses and writing the manuscript. C.B. and K.S. assisted with data collection and labelling. L.F.-F. provided resources and advice.
Ethics declarations
Competing interests
There was no industry support or other funding for this work. There are no conflicts of interest that pertain specifically to this work; however, some of the authors are consultants for the medical industry. M.P.L. is supported by the National Library of Medicine of the NIH (R01LM012966). B.N.P. has grant support from GE. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or GE. M.P.L.'s activities not related to this Article include positions as shareholder and advisory board member for Segmed Inc., Nines.ai and Bunker Hill. M.V., R.G., M.L., N.K., P.R., J.L. and K.S. are not employees or consultants for industry and had control of the data and the analysis.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Varma, M., Lu, M., Gardner, R. et al. Automated abnormality detection in lower extremity radiographs using deep learning. Nat Mach Intell 1, 578–583 (2019). https://doi.org/10.1038/s42256-019-0126-0
DOI: https://doi.org/10.1038/s42256-019-0126-0