Ultrasound is integral to the management of thyroid nodules. However, the interpretation of thyroid ultrasound studies is time- and training-intensive and subject to inherent inter-observer inconsistency, which reduces its positive predictive value (1). Compounding these limitations, fine-needle aspiration (FNA) yields cytologically indeterminate results in 15% to 30% of nodules, which often leads to unnecessary diagnostic surgeries (2). These limitations have spurred a new line of research in which artificial intelligence (AI) is applied to improve preoperative sonographic risk stratification of thyroid nodules and to minimize overdiagnosis and overtreatment of thyroid cancer. Li et al. conducted a multi-institutional effort to use AI-enabled algorithms that appear to outperform previous machine learning sonographic classifiers of thyroid nodules (3). To that end, they used a large sonographic set of more than 300,000 images, thereby eliminating a common barrier to similar studies: the lack of sufficiently large curated datasets. Li et al. compared the performance of their deep convolutional neural network (DCNN) model against the judgment of expert radiologists applying the standard Thyroid Imaging Reporting and Data System (TI-RADS) guidelines, with pathological examination results serving as the gold standard for diagnosis (4).
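As a minimal sketch of how such a comparison against a pathological gold standard is usually summarized, the following Python snippet derives sensitivity, specificity, accuracy, and positive predictive value from paired reader (or model) calls and reference diagnoses. The function name and the example labels are hypothetical illustrations and are not data from Li et al.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    accuracy: float
    ppv: float


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a metric is undefined
    # (for example, PPV when no nodule was called malignant).
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(predicted, reference):
    """Compare binary calls against a gold standard.

    predicted: iterable of booleans, True = called malignant
    reference: iterable of booleans, True = malignant on pathology
    """
    predicted = list(predicted)
    reference = list(reference)
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives

    return DiagnosticMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
        ppv=_ratio(tp, tp + fp),
    )


# Hypothetical example: eight nodules, calls versus pathology.
example_calls = [True, True, True, False, False, False, False, False]
example_pathology = [True, True, False, True, False, False, False, False]
example_metrics = diagnostic_metrics(example_calls, example_pathology)
```

The same function can be applied once to the model's calls and once to each radiologist's calls, so that the metrics are computed against an identical reference standard.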
There has long been an unmet clinical need to move beyond traditional machine learning classifiers of thyroid nodules, with their inherent pitfalls such as dependence on expert-designated features as inputs (5). Li et al. successfully demonstrate the capacity of a performance-weighted combination of two of the most popular deep learning classification architectures (ResNet-50 and Darknet-19) to rapidly produce promising results in a real-world setting. Their DCNN model showed superior specificity and accuracy, and similar sensitivity, compared with expert radiologists’ judgment, all in the setting of an externally validated study, which suggests that the model generalizes.
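To make the idea of a performance-weighted combination concrete, here is a minimal Python sketch of one common way to ensemble two classifiers: each model's predicted malignancy probability is weighted in proportion to a validation score. The exact weighting scheme used by Li et al. is not described here, so the proportional weighting, the model keys, the scores, and the 0.5 threshold are all assumptions for illustration only.

```python
def performance_weights(validation_scores):
    """Normalize per-model validation scores so the weights sum to 1.

    Assumption: weights are proportional to a single validation score
    per model (e.g., accuracy). Other schemes are equally plausible.
    """
    total = sum(validation_scores.values())
    if total <= 0:
        raise ValueError("validation scores must sum to a positive value")
    return {name: score / total for name, score in validation_scores.items()}


def ensemble_probability(model_probabilities, weights):
    """Weighted average of the per-model malignancy probabilities."""
    return sum(weights[name] * prob for name, prob in model_probabilities.items())


def classify(model_probabilities, weights, threshold=0.5):
    """Return True (malignant) if the ensemble probability meets the threshold."""
    return ensemble_probability(model_probabilities, weights) >= threshold


# Hypothetical numbers, not taken from the study.
weights = performance_weights({"resnet50": 0.90, "darknet19": 0.60})
image_probabilities = {"resnet50": 0.80, "darknet19": 0.30}
combined = ensemble_probability(image_probabilities, weights)
is_malignant = classify(image_probabilities, weights)
```

Weighting by validation performance lets the stronger model dominate the decision while still allowing the second architecture to correct some of its errors; the threshold can then be tuned to trade sensitivity against specificity.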
Appropriately, the authors reported caveats to their work, including the absence of multicenter training cohorts, incomplete accounting for the potential confounding effects of nodule size and thyroid cancer subtype, and an almost exclusively northern Han Chinese population. Nevertheless, this study offers valuable insights into the implications of deep learning for sonographic risk stratification of thyroid nodules. Notably, deep learning performs better than conventional computer-assisted diagnosis (CAD) systems designed for thyroid nodule recognition, especially in overcoming the challenge of heterogeneity (e.g., thyroid nodule biology, different ultrasound equipment manufacturers, etc.). CAD, however, has a lower implementation barrier given its more user-friendly nature. We therefore acknowledge the effort by Li et al. to introduce a result-reporting interface that instantly projects to a graphics processing unit and has the potential to be integrated into ultrasound equipment. Complementing AI projects with ready-to-use application programming interfaces (APIs) serves as a bridge between developers and clinical stakeholders and can expedite FDA clearance, adoption, and potential commercialization of new AI tools (6).
Importantly, AI should be integrated into clinical research and mainstream practice responsibly and openly to maximize the benefits and avoid potential collateral damage (7). To ensure the generalizability of these results, data sharing in accordance with the FAIR guiding principles for scientific data management should be promoted (8). Li et al. are to be commended for going beyond just publishing “significant” results to developing a freely available online platform that executes the deep learning framework. This website, still under construction, will permit prospective validation and help ensure health equity across underserved regions and countries where expertise in radiological imaging interpretation might require optimization. The promise of reduced financial toxicity and psychological burden from unnecessary interventions is therefore encouraging, as is the opportunity to provide the expertise required to assess thyroid nodules in areas of the world where it may not be available.
AI is undeniably changing the oncology landscape, and the efforts by Li et al. represent one benchmark for future AI algorithm development for thyroid nodule risk stratification. They offer real potential to transform this capability into a widely applicable clinical tool. However, what Li et al. make abundantly clear is that their model is complementary to, rather than a substitute for, manual diagnosis of thyroid cancer, reinforcing the importance of the partnership between clinicians and AI experts. Consequently, the oncology community must embrace and invest in AI via personnel education, collaborative research, and resource allocation in order to create more innovative solutions for real-world clinical challenges.
Funding: Dr. Clifton D. Fuller received US federal funding and salary support unrelated to this project from the National Institutes of Health (NIH), including: the National Institute for Dental and Craniofacial Research Establishing Outcome Measures Award (1R01DE025248/R56DE025248) and an Academic Industrial Partnership Grant (R01DE028290); NCI Early Phase Clinical Trials in Imaging and Image-Guided Interventions Program (1R01CA218148); an NIH/NCI Cancer Center Support Grant (CCSG) Pilot Research Program Award from the UT MD Anderson CCSG Radiation Oncology and Cancer Imaging Program (P30CA016672) and an NIH/NCI Head and Neck Specialized Programs of Research Excellence (SPORE) Developmental Research Program Award (P50 CA097007). Dr. Fuller received funding and salary support unrelated to this project from: National Science Foundation (NSF), Division of Mathematical Sciences, Joint NIH/NSF Initiative on Quantitative Approaches to Biomedical Big Data (QuBBD) Grant (NSF 1557679); NSF Division of Civil, Mechanical, and Manufacturing Innovation (CMMI) standard grant (NSF 1933369); a National Institute of Biomedical Imaging and Bioengineering (NIBIB) Research Education Programs for Residents and Clinical Fellows Grant (R25EB025787-01); the NIH Big Data to Knowledge (BD2K) Program of the National Cancer Institute (NCI) Early Stage Development of Technologies in Biomedical Computing, Informatics, and Big Data Science Award (1R01CA214825). Direct infrastructure support for Dr. Fuller is provided by the multidisciplinary Stiefel Oropharyngeal Research Fund of the University of Texas MD Anderson Cancer Center Charles and Daneen Stiefel Center for Head and Neck Cancer and the Cancer Center Support Grant (P30CA016672) and the MD Anderson Program in Image-guided Cancer Therapy. Dr. Fuller has received direct industry grant support, honoraria, and travel funding from Elekta AB. Dr. Reid F. Thompson was supported by the U.S. Department of Veterans Affairs under award number 1IK2CX002049-01.
Provenance and Peer Review: This is an invited article commissioned and reviewed by the Section Editor Dr. Shi-Tong Yu (Department of General Surgery, Nanfang Hospital, Southern Medical University, Guangzhou, China).
Conflicts of Interest: Dr. Pei Yang: kq1801105, Changsha Science and Technology Bureau. Dr. Mohamed Abazeed: Bayer AG, grant support, travel support and honorarium; Siemens Healthcare, USA, grant support. Dr. Chirag Shah: Impedimed, consultant; Varian Medical Systems, PreludeDX, and VisionRT, grants. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The contents do not represent the views of the U.S. Department of Veterans Affairs or the United States Government.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
1. Ozel A, Erturk SM, Ercan A, et al. The diagnostic efficiency of ultrasound in characterization for thyroid nodules: how many criteria are required to predict malignancy? Med Ultrason 2012;14:24-8.
2. Alexander EK, Kennedy GC, Baloch ZW, et al. Preoperative diagnosis of benign thyroid nodules with indeterminate cytology. N Engl J Med 2012;367:705-15.
3. Li X, Zhang S, Zhang Q, et al. Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: a retrospective, multicohort, diagnostic study. Lancet Oncol 2019;20:193-201.
4. Kwak JY, Han KH, Yoon JH, et al. Thyroid imaging reporting and data system for US features of nodules: a step in establishing better stratification of cancer risk. Radiology 2011;260:892-9.
5. Prochazka A, Gulati S, Holinka S, et al. Classification of thyroid nodules in ultrasound images using direction-independent features extracted by two-threshold binary decomposition. Technol Cancer Res Treat 2019;18:1533033819830748.
6. Elhalawani H, Fuller CD, Thompson RF. The potential and pitfalls of crowdsourced algorithm development in radiation oncology. JAMA Oncol 2019;5:662-3.
7. Elhalawani H, Lin TA, Volpe S, et al. Machine learning applications in head and neck radiation oncology: lessons from open-source radiomics challenges. Front Oncol 2018;8:294.
8. Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 2016;3:160018.