The examples above confirm that medical applications of NLP are a popular topic, covered in far more publications than can be cited in this article. Our goal, however, is different: we would like to show the reader how to use the currently available tools and how to evaluate the results they provide. Accordingly, in the next chapter we present contemporary ways of annotating medical text, and in the following chapters we report our experimental results in this field.
Text annotation, a critical process in natural language processing (NLP), involves identifying and labeling essential terms within a body of text, as explained in the foundational work on the subject.
As a result of the project carried out at the Gdańsk University of Technology in cooperation with the Medical University of Gdańsk, a solution is being developed and implemented that enables physicians to recall available diagnostic test results and clinical parameters of patients by voice, fill in disease charts interactively while taking a medical history, create descriptions, and prescribe treatment as required. Templates will be generated automatically, allowing data, including data from the medical history, to be entered directly into widely used healthcare information systems; automatically structured descriptions of diagnostic results will be voice-editable and will enable dictation of test referrals, prescriptions, or sick leave notes. Due to the complexity of medical terminology, the annotation process can be challenging. Still, it provides immense value by structuring unstructured text data, thus aiding research, diagnosis, treatment, and the monitoring of diseases.
Annotating medical terms is paramount from the point of view of this project, as speech recognition models must be trained on Polish medical vocabulary, with which most models do not cope effectively. In addition, adaptive forms and, more generally, the automation of voice-based medical record creation require contextual speech understanding. Training machine-learning models for this purpose also involves selecting medical terms from among the expressions present in longer texts. The block labeled Named Entity Recognition in Fig. 1, which shows the architecture of our solution developed within the current project "ADMEDVOICE - Adaptive intelligent speech processing system of medical personnel with the structuring of test results and support of therapeutic process", relates to this objective. In preparation for the project tasks, a practical review of various NLP tools was carried out, and the more promising ones were selected for testing with doctors of different medical specialties.
The development of specialized NLP tools like MedLEE and cTAKES has led to significant progress in biomedical informatics, particularly in improving the extraction, processing, and analysis of clinical data from various medical texts and records. The article by Friedman on the research and development of the MedLEE system contributes valuable insights into the early stages of NLP application in medical informatics, demonstrating the potential of language processing tools for enhancing the analysis and interpretation of clinical data. Another article extensively presents the architecture and components of the cTAKES system and its applications in medicine. cTAKES is one of the first natural language processing systems developed specifically for medical texts, utilizing information extraction and semantic text processing methods. That paper provides insights into how cTAKES analyzes clinical documents and extracts critical medical information.
In addition to the tools listed above, other modern solutions suitable for these purposes are currently available, including advanced AI systems like ChatGPT and Google's Gemini (formerly named Bard), which excel in handling such tasks. There are also well-known solutions offered by companies that incorporate off-the-shelf NLP models, such as Amazon Comprehend Medical, AWS HealthScribe, Microsoft Azure AI Language, and others.
In December 2023, Google entered the Gemini era, incorporating its latest AI features across text, image, sound, and video processing. Its most sophisticated model, Ultra 1.0, set a new benchmark by outperforming human experts on the Massive Multitask Language Understanding (MMLU) test, an evaluation that measures knowledge and problem-solving abilities across 57 fields, including mathematics, physics, history, law, medicine, and ethics. As of February 2024, Google has taken another step forward by integrating the Ultra model into its services. Within the medical domain, this development offers notable opportunities for natural language processing applications in healthcare: by leveraging Ultra 1.0's advanced understanding capabilities, medical professionals and researchers can expect improvements in data analysis, diagnostic processes, and patient care.
An enhanced version of PubTator, a tool for the automated annotation of biomedical concepts in full-text scientific articles, has also been presented. A case study demonstrates the use of Prodi.gy for annotating various texts, highlighting its flexibility and efficiency. The paper devoted to Llama and Alpaca discusses their use for advanced language model analysis and knowledge annotation in large data sets. SenTag is a web-based tool for semantic annotation of textual documents. INCEpTION is a platform combining machine-assisted and knowledge-based interactive annotation. Brat is a web-based tool for NLP-assisted text annotation. Doccano is a user-friendly text annotation tool that supports multiple data formats. LightTag is a scalable platform for structured text annotation, notable for its performance and ease of use. These tools can identify and annotate various medical terms, making it easier for researchers and clinicians to extract meaningful insights from the data. Moreover, the medical field has developed various other annotation tools; BioC is one such minimalist tool, designed to resolve interoperability issues among the different software systems used for text mining.
Another widely used model is BERT (Bidirectional Encoder Representations from Transformers), a groundbreaking language model first released by Google in 2018. It is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. BERT can accurately predict the grammatical parts of speech of words in a sentence, such as nouns, verbs, and adjectives. This is crucial for understanding the structure of Polish sentences, which exhibit rich morphology and flexible word order. BERT's deep contextual understanding can also be leveraged to disambiguate words with multiple meanings based on their context, which is particularly useful in Polish, where many words have different meanings in different contexts.
Another paper adapts the BERT transformer-based model for clinical applications and demonstrates its effectiveness in natural language analysis in a medical context, recognizing entities such as disease names, drug names, symptoms, and other relevant health information. The authors of this article also tested the BERT model's ability to recognize words belonging to the Polish medical language (see Chapter V). BERT is also used and cited in the literature for labeling in nonclinical applications.
Different models have been proposed for chemical named entity recognition, which is relevant to medicine because the names of chemicals sometimes appear in the names of drugs. One family of such models builds on transformers, as described by Devlin et al. and Rajkomar et al., both of whom discussed the application of natural language processing to the automated encoding of clinical documents. A survey article comprehensively covers advances in Transformer models, focusing on their efficiency improvements. It discusses how the Transformer architecture, introduced by Vaswani et al. in 2017, has become foundational for many modern NLP models, including those used for Named Entity Recognition (NER). The survey covers various modifications and optimizations proposed to enhance the Transformer's performance and adaptability across diverse NLP applications, offering a broader perspective on the evolution of the architecture since its introduction and highlighting its continuing impact on the NLP field.
In their article, Honnibal and Montani, the creators of spaCy, present key aspects and innovations introduced in version 2 of this natural language processing (NLP) tool. They discuss the use of Bloom embeddings, convolutional neural networks, and incremental parsing in the context of natural language understanding. The paper details the technologies and methods contributing to the efficiency of spaCy as an NLP tool.
To precisely compare the various tools and models utilized in medical text annotation, we summarize each tool's essential features, supported languages, specializations, strengths, and weaknesses. This comparison highlights these tools' unique capabilities and limitations, assisting in understanding their applicability to different medical contexts. The details are summarized in Table 1.
In clinical and biomedical texts, a range of terms requires annotation. These may include clinical findings, anatomical sites, procedures, medications, etc. To illustrate this simply, let us consider a patient's diagnosis (in Polish): "Pacjent cierpi na nadciśnienie tętnicze". The diagnosis states that the patient suffers from hypertension. In this sentence, "nadciśnienie tętnicze" is the medical term that refers to hypertension and translates to "hypertensio arterialis" in Latin. Meanwhile, "pacjent" (patient) and "cierpi" (suffers) are words that will recur many times in dictated speech, as they belong to colloquial language, which non-specialized speech recognition systems can handle quite well.
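The distinction above can be illustrated with a minimal sketch of dictionary-based annotation. This is a hypothetical lexicon-lookup example for exposition only, not the project's actual pipeline; the mini-lexicon `MEDICAL_TERMS` and the `annotate` function are our own illustrative constructs:

```python
import re

# Hypothetical mini-lexicon mapping Polish medical terms to their
# Latin equivalents (illustration only; a real system would use a
# full terminology resource such as SNOMED CT or ICD).
MEDICAL_TERMS = {
    "nadciśnienie tętnicze": "hypertensio arterialis",
}

def annotate(text):
    """Return (start, end, surface form, label) for each matched term."""
    spans = []
    for term, label in MEDICAL_TERMS.items():
        for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            spans.append((m.start(), m.end(), m.group(0), label))
    return sorted(spans)

sentence = "Pacjent cierpi na nadciśnienie tętnicze."
print(annotate(sentence))
```

Only the span covering "nadciśnienie tętnicze" is labeled; the colloquial words "Pacjent" and "cierpi" fall outside the lexicon and remain unannotated, mirroring the division described above.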
In the remainder of this article, we will provide examples of Python code that allows the effective extraction of medical concepts from longer statements made by medical personnel. Such selection is important for training speech-to-text transcription systems, since the prevalence of colloquial words repeated in natural language can lead to overfitting of deep neural models during transfer learning.
Then, we will describe the material we used for our experiments and the results we obtained. Verification of the automatic results by specialized medical personnel will allow us to calculate evaluation metrics with which the effectiveness of the software for this purpose can be estimated. On this basis, we summarize the work and draw general conclusions.