Primary Knee | Volume 35, Issue 9, P2423-2428, September 2020

Can a Convolutional Neural Network Classify Knee Osteoarthritis on Plain Radiographs as Accurately as Fellowship-Trained Knee Arthroplasty Surgeons?

Published: April 25, 2020



      Osteoarthritis (OA) is the leading cause of disability among adults in the United States. Because the diagnosis depends on accurate interpretation of knee radiographs, using a convolutional neural network (CNN) to grade OA severity has the potential to significantly reduce variability in grading.


      Knee radiographs from consecutive patients presenting to a large academic arthroplasty practice were obtained retrospectively. These images were rated by 4 fellowship-trained knee arthroplasty surgeons using the International Knee Documentation Committee (IKDC) scoring system. The intraclass correlation coefficient (ICC) for the surgeons alone was compared with the ICC for the surgeons together with a CNN that had been trained on 4755 separate images.


      Two hundred eighty-eight posteroanterior flexion knee radiographs (576 knees) were reviewed; 131 knees were removed because of poor image quality or prior total knee arthroplasty (TKA). Each of the remaining 445 knees was rated by 4 blinded surgeons, for a total of 1780 human knee ratings. The ICC among the 4 surgeons for all possible IKDC grades was 0.703 (95% confidence interval [CI] 0.667-0.737). The ICC for the 4 surgeons and the trained CNN was 0.685 (95% CI 0.65-0.719). For IKDC D vs any other rating, the ICC of the 4 surgeons was 0.713 (95% CI 0.678-0.746), and the ICC of the 4 surgeons and the CNN was 0.697 (95% CI 0.663-0.73).
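
      The counts and the agreement statistic reported above can be illustrated with a minimal Python sketch. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here. The function names, the numeric encoding of IKDC grades (A=1 through D=4), and the example ratings matrix are hypothetical; the matrix is a standard textbook example, not data from the study.

```python
# Illustrative sketch only. The counts are taken from the abstract; the ICC form,
# the grade encoding, and the ratings matrix are assumptions, not study data.

RADIOGRAPHS = 288      # posteroanterior flexion radiographs reviewed
KNEES_EXCLUDED = 131   # removed for poor quality or prior TKA
SURGEONS = 4           # fellowship-trained raters

knees_total = RADIOGRAPHS * 2                # two knees per radiograph
knees_rated = knees_total - KNEES_EXCLUDED   # knees kept for rating
human_ratings = knees_rated * SURGEONS       # one rating per surgeon per knee


def icc2_1(ratings):
    """ICC(2,1) for an n-subjects by k-raters table of numeric ratings."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def d_versus_rest(ratings, d_code=4):
    """Collapse grades to 1 (IKDC D) or 0 (any other grade), assuming D is coded as 4."""
    return [[1 if grade == d_code else 0 for grade in row] for row in ratings]


# Textbook example (6 subjects, 4 raters) used only to exercise the function.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

print(knees_total, knees_rated, human_ratings)  # 576 445 1780
print(round(icc2_1(example), 2))                # 0.29
```

      Adding the CNN as a fifth rater would correspond to appending one more column to the ratings table before calling `icc2_1`, and the "IKDC D vs any other rating" comparison would correspond to passing the table through `d_versus_rest` first. Both steps are shown only as one plausible reading of the analysis described above.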


      A CNN can identify and classify knee OA as accurately as a fellowship-trained arthroplasty surgeon. This technology has the potential to reduce variability in the diagnosis and treatment of knee OA.



