Article published online: 2022-06-02
© 2022 IMIA and Georg Thieme Verlag KG
IMIA Yearbook of Medical Informatics 2022

Natural Language Processing: from Bedside to Everywhere

Eiji Aramaki¹, Shoko Wakamiya¹, Shuntaro Yada¹, Yuta Nakamura²
¹ Nara Institute of Science and Technology (NAIST), Nara, Japan
² Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan

Summary

Objectives: Owing to the rapid progress of natural language processing (NLP), the role of NLP in the medical field has gained considerable attention from both NLP and medical informatics. Although numerous medical NLP papers are published annually, there is still a gap between basic NLP research and practical product development. This gap raises questions, such as what has medical NLP achieved in each medical field, and what is the burden for the practical use of NLP? This paper aims to clarify the above questions.

Methods: We explore the literature on potential NLP products/services applied to various medical/clinical/healthcare areas.

Results: This paper introduces clinical applications (bedside applications), in which we introduce the use of NLP for each clinical department: internal medicine, pre-surgery, post-surgery, oncology, radiology, pathology, psychiatry, rehabilitation, obstetrics, and gynecology. Also, we clarify technical problems to be addressed for encouraging bedside applications based on NLP.

Conclusions: These results contribute to discussions regarding potentially feasible NLP applications and highlight research gaps for future studies.

Keywords
Natural language processing, medical application, chatbot, randomized controlled trial, social media

Yearb Med Inform 2022:243-53
http://dx.doi.org/10.1055/s-0042-1742510

1 Introduction

Electronic health/medical records (referred to as EHR in this study) are rapidly replacing paper-based records in hospitals worldwide. Natural language processing (NLP) techniques have gained importance in the medical field. Because NLP is a hot topic in computer science, the number of medical NLP studies is increasing dramatically each year.

Despite the large number of studies, only a few practical studies have validated medical NLP applications in real-world settings. Studies using randomized controlled trials (RCTs), which provide the highest level of medical evidence, are rare. In a PubMed search for “NLP” + “RCT” or “Clinical trial,” we found only a few studies [1–4]. Instead of RCTs, several studies employed a retrospective design using EHR big data: screening of diseases, case classification, incident detection, etc. [5–8]. However, unlike medical image software, these systems have not been commercialized as a product. A similar trend can be observed in the applications approved by the Food and Drug Administration (FDA) as artificial intelligence (AI) systems¹. Most were audiology devices, and no medical systems related to NLP were found.

¹ https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices

In summary, NLP has been actively studied, but there is still a gap between basic research and practical product development. This raises several questions, including what has medical NLP achieved in each medical field, and what is the burden for practical use of NLP? To clarify these questions, this study investigates what clinical/medical NLP has achieved in different clinical/medical fields.

This review aims to provide a guide for the NLP specialist who does not know medical informatics well enough. The scope of this paper is limited to studies that have the potential to directly contribute to daily clinical practice, which we call bedside applications, covering internal medicine, pre-surgery, post-surgery, oncology, radiology, pathology, psychiatry, rehabilitation, obstetrics, and gynecology, etc. This paper introduces existing ready-to-use systems used in the above fields and summarizes their current methodology and performance. Finally, we mention future potential NLP applications not only for hospital use but also for patient use.

2 Bedside Applications

We provide an overview of how far NLP can be applied to outpatient and inpatient diagnosis, treatment, or management in each department. Historically, shared tasks have been one of the effective ways for researchers to drive fundamental innovations in clinical NLP [9]. A shared task is a competitive platform where organizers present a technically challenging and clinically meaningful task along with the dataset, gold standards, and evaluation criteria. In the early days, simple tasks were chosen, such as classifying patient records based on smoking status [10]. These days, shared tasks deal with far more complex problems, such as temporal relationship recognition among clinical events in discharge summaries [11], risk factor identification in longitudinal series of progress notes [12], and clinical decision support [13–15]. Over time, the reproducibility of solutions and techniques found in shared tasks has been demonstrated by researchers, which has promoted advancements in clinical NLP.

We surveyed how far NLP applications have been proven to be replicable in real-world clinical practice. We made no limitations on hospital departments in searching publications. We referred to (i) reviews and systematic reviews published in 2017 or later and (ii) original research articles published in 2020 or later on NLP applications for each hospital department. We searched PubMed using the keyword “natural language processing” for reviews and systematic reviews, and “natural language processing” together with a hospital department name for original research articles. Because this article is not a systematic review, we focused on studies that can directly contribute to daily clinical practice. Although NLP is also helpful in research-oriented applications, such as cohort building with patient identification or phenotyping [16], evidence generation using clinical free-text [17–19], or semi-automation of meta-analysis [20] and systematic review [21–23], these are beyond the scope of this article.

2.1 Applications in Different Departments

NLP-based technology has enabled information extraction (IE) from various unstructured free-text documents such as clinic letters, progress notes, discharge summaries, and test reports. This technology can improve care quality in multiple departments, which has been demonstrated mainly in retrospective studies and sometimes in prospective studies [24–27]. NLP performance has also been validated in multicenter studies [28, 29]. See also Table 1 for details of the NLP systems introduced below.

Internal Medicine

NLP aids in the prevention, early diagnosis, treatment, and prognostic prediction of a wide range of diseases, such as cardiovascular, endocrine, metabolic, hepatobiliary, and neurological diseases [30].

(i) Disease prevention. NLP can identify risk factors, estimate risk, or predict events of disease development or readmissions [12, 31, 32]. Wang et al. automatically calculated CHA₂DS₂-VASc and HAS-BLED, risk scores for stroke and bleeding, respectively, in atrial fibrillation patients, by a rule-based approach. They also identified patients with a high risk of cerebral stroke with positive predictive values of 0.92–1.00 [33]. Buchan et al. analyzed clinical notes of patients without a history of coronary artery disease (CAD) with named entity recognition (NER) and support vector machine (SVM), and identified patients with later development of CAD with an F1-score of 0.774 [34];
(ii) Early diagnosis. NLP can help clinicians recognize diseases outside their specialty that might otherwise be misdiagnosed or overlooked without proper transfer. Chase et al. achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.94 in classifying patients with and without multiple sclerosis using NER and Naïve Bayes classifiers. They also identified patients suspected of undiagnosed multiple sclerosis [35];
(iii) Treatment support. Clinical decision support tools to summarize patient clinical information and suggest treatment are beginning to be realized. Seol et al. integrated a clinical decision support tool into the EHR system for pediatric asthma outpatients, which warns of the risk of acute exacerbation and recommends an optimal treatment plan based on free-text and structured data in the EHR [25]. An RCT demonstrated improvement of patient outcomes and significantly reduced physicians’ workload for manual chart review.

Pre-surgery

NLP has the potential to aid in identifying clinical conditions of preoperative, perioperative, and postoperative patients [36, 37]. In preoperative settings, NLP can (i) evaluate surgical indications and (ii) reduce the workload of preoperative assessment. Wissel et al. implemented an automatic NLP scoring system in the EHR system that uses SVM to identify epileptic outpatients with indications for surgery. The system achieved a ROC-AUC of 0.79 in recommending operation [24]. Fonferko-Shadrach et al. developed an NLP system to review clinic letters and automatically extract symptoms, diagnosis, and medication history of preoperative patients. The system was based on an existing entity linking tool and demonstrated an F1-score of 0.911 [38].

Post-surgery

Perioperatively and postoperatively, NLP contributes to continuous quality improvement efforts. NLP can identify complications and their details in unstructured free-text clinical records, even if they are not codified with ICD-10 (International Classification of Diseases, 10th revision) [29, 39]. Bucher et al. identified surgical site infections (SSIs) with an NLP pipeline that parses and extracts information from clinical notes, reaching a ROC-AUC of 0.912. The system also determined SSI subgroups based on the depth, the wound condition, and the outcome [29]. Furthermore, surgical outcomes can also be automatically extracted from unstructured free-text using NLP, which aids labor-intensive manual chart review. In orthopedics, hip dislocation after total hip arthroplasty can be detected [40]. Tibbo et al. developed an NLP system to automatically determine the Vancouver classification of periprosthetic femur fractures with a sensitivity of 0.786 and a specificity of 0.948 [41].

Oncology

Oncology is another department where NLP plays an important role [30, 42].

(i) IE and cancer registration. NLP helps information retrieval on genetic, histological, and clinical characteristics of cancer, which is essential in clinical decision making and surveillance for effective public health interventions [43, 44]. The information includes histological type, differentiation, Ki-67 index, TNM (classification of malignant tumors) staging, test findings, treatment, family history, and performance status. Benjamin et al. automatically extracted quantitative information of biomarkers from breast cancer pathology reports. They achieved an accuracy of 0.98 with a rule-based approach on top of an existing NER tool, MetaMap [45, 46];
(ii) Clinical decision support. Precision medicine is a tailor-made clinical practice considering individual patient demographics and cancer genetic characteristics. NLP can recommend optimal treatment plans by searching biomedical articles and clinical trial repositories using patient information as a query [13–15, 47]. Li et al. released a chatbot-style open access clinical decision support tool [48].

Radiology

NLP can contribute to multiple stages of the radiological clinical workflow [49–51].

(i) Patient safety. NLP can help screen patients for contraindications to diagnostic imaging. Valtchinov et al. used NER to identify implants with contraindication to magnetic resonance imaging (MRI) in clinical notes with accuracies of 0.83–0.91 [52];
(ii) Imaging protocol recommendation. NLP can determine the use of contrast agents or optimal imaging protocols based on free-text in ordering comments or clinical records [53–56]. Chillakuru et al. developed a machine learning-based NLP system to recommend the use of contrast agents for brain and spinal MRI with accuracies of 0.83–0.85, of which an online demo is available. The system is based on term frequency-inverse document frequency vectorization, Gradient Boosting Decision Tree (GBDT), word embeddings, and shallow neural networks [54]. Some other scan optimization tools are commercially available [55];
(iii) Automated radiology reporting. As the workload of diagnostic radiologists rapidly grows [57], automated radiology report generation in cooperation with computer vision AI is attracting attention [58]. Most studies have dealt with chest X-rays thus far, and further application to computed tomography (CT), MRI, and nuclear medicine is expected;
(iv) Surveillance. Radiology reports sometimes point out incidental findings. NLP can help prevent such findings from being missed by the attending physician by automatically sending alerts [49–51].

Pathology

NLP is helpful for both pathologists, whose responsibility is increasing in the era of personalized medicine, and clinicians, who refer to the diagnosis for treatment planning.

(i) Support diagnosis. NLP can support pathologists by providing a better computer-based image retrieval system incorporating pathology reports [59] or by automated pathology reporting [60];
(ii) Support clinical practice. Information on pathological diagnosis is used afterward by clinicians for better treatment strategy. NLP helps convert unstructured pathology reports into a structured form [45, 57, 61]. Kim et al. automatically extracted descriptions of a specimen, procedure, and pathologic diagnosis from pathology reports regardless of clinical departments. Their deep learning-based system, which uses Bidirectional Encoder Representations from Transformers (BERT), achieved accuracies of 0.9795–0.9839 [57, 62]. At a more fine-grained level, Odisho et al. extracted seventeen types of information from prostate cancer pathology reports and achieved a weighted F1-score of 0.972 for categorical data and a mean accuracy of 0.930 for numerical data. They applied document classification with a convolutional neural network (CNN) to categorical data and token classification with random forest to numerical data [61].

Psychiatry

In psychiatry, NLP can be used for IE from unstructured EHR and speech analysis on patient speech data [63, 64]. NLP can help in the screening, early diagnosis, or severity estimation of various diseases such as depression [63], bipolar disorder [65], dementia [66–68], psychosis [69, 70], and schizophrenia [71]. Dai et al. showed that NLP can automatically diagnose psychiatric diseases from free-text discharge summaries. Their system achieved a micro F1-score of 0.584 using multiple classifiers based on pre-trained Robustly Optimized BERT pretraining Approach (RoBERTa) models [72, 73]. More fundamentally, NLP can contribute to psychiatric diagnostics. The Research Domain Criteria (RDoC), a potential counterpart of the Diagnostic and Statistical Manual of Mental Disorders (DSM), aims to integrate brain research knowledge into psychiatric disease classification [74], for which NLP shared tasks were held in 2016 and 2019 [75, 76].

Rehabilitation

NLP is used in speech therapy by incorporating it into electronic devices for augmentative and alternative communication (AAC) [77, 78]. Moreover, NLP has the potential to better unite the entire rehabilitation into the healthcare process by enabling the integration of the International Classification of Functioning, Disability, and Health (ICF) into EHRs, although there are still problems to overcome [79].

Obstetrics and Gynecology

Publications on bedside NLP applications were found in obstetrics and gynecology, although limited in number. Moon et al. showed the effectiveness of a rule-based NLP approach to highlight information discrepancies on surgical history due to misinterpretation during hospital transfer or improper copy and paste [80]. Sterckx et al. developed a birth risk prediction system to support preterm birth treatment, which was based on GBDT. NER-based features improved prediction performance when combined with structured data, with an F1-score of over 0.80 for predicting birth within 24 hours [81]. Barber et al. used NLP for prognostic prediction of ovarian cancer surgery, where postoperative readmission within 30 days was predicted with a ROC-AUC of 0.70 using preoperative CT radiology reports [82].

Other Departments

NLP application is limited in ophthalmology and anesthesiology, where most AI systems are devoted to automated image diagnosis [83] or intraoperative monitoring with numerical data [84]. However, some studies combine NLP for unstructured free-text documents and AI for structured EHR data to predict patient prognosis [85]. NLP also has the potential to automatically pick up patient risk factors preoperatively.

As indicated above, NLP can improve the quality and efficiency of bedside clinical practice mainly by IE from unstructured free-text for various departments and diseases, a part of which has already been put to practical use.

2.2 Cross-cutting Applications

Some NLP applications are not limited to specific hospital departments but can be helpful widely. We introduce such applications in this subsection.

Text Simplification

Clinical texts can sometimes be difficult for patients or clinicians in other departments due to jargon or abbreviations. Automated text simplification with NLP can improve both patient-staff and staff-staff communication [86, 87]. Moen et al. developed an NLP system to suggest replacements for abbreviations in Finnish clinical texts that are difficult for patients. The system achieved a top-1 accuracy of 0.3464 with an unsupervised approach using cosine similarity of word embeddings [87].

Writing Support

Writing support with NLP can address a more fundamental problem: illegible clinical texts often result from the shortage of time that healthcare professionals have for documentation.

(i) Auto-completion. Auto-completion is a real-time suggestion of the next word or clinical concept while a healthcare professional writes a clinical document. Gopinath et al. developed an auto-completion system for the emergency department that suggests clinical conditions, symptoms, medications, and laboratory test items during the documentation of progress notes. The system reduced the keystroke burden by 67% [88];
(ii) Auto-structuring. Some clinical documents such as progress notes or nursing notes are required to be in a structured form. NLP allows healthcare professionals to write such documents in an unstructured narrative by automatic editing and structuring. Moen et al. structured Finnish nursing notes into paragraphs whose headings were selected from a standardized taxonomy with an accuracy of 0.71 using Long Short-Term Memory (LSTM)-based sentence classification [89]. Furthermore, patient-staff conversations can be automatically structured once transcribed [90, 91];
(iii) Digital scribe. Digital scribe is different from dictation but similar to auto-structuring except for using voice input. That is, clinicians have only to record an outpatient conversation with some additional voice command, and the NLP system analyzes and summarizes the conversation and converts it into a clinical document in a predefined format [92–95]. Wang et al. developed a digital scribe system, which was 2.17–3.12 times faster than typing and dictation during patient encounter documentation [95].

3 Problems to be Addressed

3.1 Standard Annotation Schemes

Most NLP-based IE techniques adopted in the studies we referred to thus far use supervised machine learning, which requires high-quality, large datasets for training. Creating such datasets relies on manual annotation and thus increases the cost.

The formats and conventions of writing clinical documents differ not only in document types (e.g., EHRs, radiology reports, and nursing notes), but also in hospitals, departments, and even individual doctors. This textual diversity requires medical NLP researchers to create dedicated corpora for different applications by designing distinct annotation schemes. For instance, doctors often write disease name abbreviations in EHRs owing to the nature of personal note-taking, while radiology reports contain slightly more standardized terms because they are exchanged between diagnosing doctors and radiologists. Distributions of the appearing clinical terms in different types of clinical notes of different departments also deviate substantially, leading to uneven performance even when using an identical model architecture [96].

To adapt to a wide range of clinical note types with a single annotation scheme, some studies propose general-purpose annotation guidelines that define popular medical entities (e.g., diseases, drugs, tests, remedies, and body parts), as well as semantic relationships among them (e.g., “a medicine ‘is-subscribed-for’ a disease” and “a symptom ‘was-found-in’ an anatomical part”) [96–99]. However, this approach increases the complexity of the resulting annotation schemes, making training annotators expensive. One guideline of such schemes has more than 30 pages [100]; a temporal IE corpus provides a 63-page guideline document [101].

The complexity of annotation schemes can also generate ambiguous boundaries between multiple entity types. For example, a general-purpose corpus [99] defines the ‘Disease’ entity and the ‘Signs or Symptoms’ entity separately, and the inter-annotator agreement for these was relatively low, probably because of the annotators’ confusion.

3.2 Task Formulation

There are always several ways to formulate a medical/clinical problem into an NLP task. The difference in task formulation affects overall performance and how to create an annotated corpus. Careful design of an NLP task setting translated from clinical needs matters. Taking adverse drug event (ADE) detection as an example, we have at least three options in its task formulation: NER, relation extraction (RE), and text classification. We represent these different approaches in Figure 1. The example sentence implies that a medication “nivolumab” prescribed for a “laryngeal cancer” adversely caused “liver damage.” As we discuss below, each approach has its own benefits and drawbacks. This trade-off suggests that we must carefully design NLP approaches against given medical/clinical IE issues.
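To make the three formulations of ADE detection concrete, the following is a minimal sketch of how the same sentence could be annotated under each of them. It is an illustration under stated assumptions, not the scheme of Figure 1: the example sentence, the label names (`Drug`, `Disease`, `ADE`, `prescribed_for`, `caused`, `contains_ade`), and the helper names are hypothetical, chosen only to mirror the nivolumab example described above.

```python
import re
from dataclasses import dataclass

# Hypothetical sentence written for this sketch (not the sentence in Figure 1).
TEXT = "Nivolumab was started for laryngeal cancer, and liver damage developed."


@dataclass(frozen=True)
class Entity:
    """A labeled character span; the label names are assumptions."""
    start: int
    end: int
    label: str

    def surface(self, text: str) -> str:
        return text[self.start:self.end]


@dataclass(frozen=True)
class Relation:
    """A directed, labeled link between two entities."""
    head: Entity
    tail: Entity
    label: str


def find_entity(text: str, phrase: str, label: str) -> Entity:
    """Locate the first occurrence of phrase and wrap it as an Entity."""
    start = text.index(phrase)
    return Entity(start, start + len(phrase), label)


# (1) NER: mark spans only; the ADE is just another entity type.
drug = find_entity(TEXT, "Nivolumab", "Drug")
disease = find_entity(TEXT, "laryngeal cancer", "Disease")
ade = find_entity(TEXT, "liver damage", "ADE")
ner_annotation = [drug, disease, ade]

# (2) RE: keep neutral entities and express causality as a relation.
re_annotation = [
    Relation(drug, disease, "prescribed_for"),
    Relation(drug, ade, "caused"),
]

# (3) Text classification: one label for the whole sentence.
classification_annotation = {"text": TEXT, "contains_ade": True}


def to_bio(text: str, entities: list) -> list:
    """Convert character spans to token-level BIO tags (simple regex tokens)."""
    tagged = []
    for match in re.finditer(r"\w+|[^\w\s]", text):
        tag = "O"
        for ent in entities:
            if ent.start <= match.start() and match.end() <= ent.end:
                prefix = "B" if match.start() == ent.start else "I"
                tag = f"{prefix}-{ent.label}"
                break
        tagged.append((match.group(), tag))
    return tagged


if __name__ == "__main__":
    for token, tag in to_bio(TEXT, ner_annotation):
        print(f"{token}\t{tag}")
```

The sketch shows why the choice affects corpus creation: the classification format needs one judgment per sentence, the NER format needs span boundaries, and the RE format additionally needs a decision for each entity pair, so annotation effort grows in that order while the extracted information becomes more specific.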