Harness the power of AI and large amounts of data to facilitate collaboration
For more than a century, medical specialists have been writing down observations about patients in natural language. All these notes, together with lab test results, end up stored in patients’ healthcare records.
Although digitization has automated this process to some extent, a wealth of information remains locked away in doctors’ free-text observations. To make the most of such notes and turn medical commentary into actionable information, a new paradigm is needed.
Of course, hand-coding data is an option, but it fails to capture the specific details of face-to-face interactions. According to InData Labs, text analytics powered by AI and Natural Language Processing (NLP) offers a new approach to scanning such heaps of unstructured healthcare data.
The Curse and Power of Free Text
Let’s take a prescription as an example. The doctor writes: “Take two pills each morning and evening for two days, then one pill every morning for the next five days.” This type of information, although fairly simple, is a hassle to put into a tabular format, a necessary step if we want to keep track of different patients following different treatment schemes.
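To make this concrete, here is a minimal Python sketch of the kind of conversion involved; the note text, the word-to-number lookup, and the regular expression are purely illustrative assumptions, not a production-grade prescription parser.

```python
import re
import pandas as pd

# Hypothetical free-text prescription, echoing the example above.
note = ("Take two pills each morning and evening for two days, "
        "then one pill every morning for the next five days.")

WORD_TO_NUM = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

rows, day = [], 1
# Very naive pattern: "<count> pill(s) each/every <times of day> for <n> days"
for count, times, duration in re.findall(
        r"(\w+) pills? (?:each|every) ([\w ]+?) for (?:the next )?(\w+) days?", note):
    doses = [t.strip() for t in times.split("and")]
    for _ in range(WORD_TO_NUM[duration]):
        for dose_time in doses:
            rows.append({"day": day, "time": dose_time, "pills": WORD_TO_NUM[count]})
        day += 1

print(pd.DataFrame(rows))  # one row per scheduled dose, ready for tracking
```

Even this toy example hints at why a dedicated NLP engine is preferable to hand-written rules: a slightly different phrasing would break the pattern immediately.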
If this task seems a little tedious, imagine how hard it would be to list all the data linked to a patient: medical history, family history, symptoms, allergies, test results, and much more, including comments and notes taken at previous examinations by various nurses and doctors in their own words. Yet these notes contain a wealth of knowledge about the patient and need to be viewed holistically.
The truth is that people communicate best through natural language, which includes narrative text and free annotations. Checklists, codes, and terminologies, on the other hand, can create confusion or miscommunication even for specialists. For example, it is easy for a doctor to forget to record a specific code among many, but far less likely for them to forget to write down the affliction the code corresponds to.
Issues with Text Analytics
Working with text analytics comes with a set of challenges that include variability, ambiguity, and labeling.
First of all, there is the problem of variability. Because of synonyms, abbreviations, and language variations, the same illness can be noted in a number of different ways. For example, kidney failure, renal insufficiency, and CKD (chronic kidney disease) can all refer to the same underlying condition yet look like entirely different diagnoses on paper. To handle this, the text analysis engine must be trained to recognize these equivalences and treat them as one.
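As a rough illustration, normalization can be pictured as a lookup from surface forms to a canonical concept. The tiny dictionary below is an assumption made for demonstration only; real systems rely on curated terminologies such as UMLS or SNOMED CT rather than hand-written maps.

```python
# Illustrative only: a hand-written synonym map for normalizing diagnosis
# mentions to one canonical concept, following the article's example.
CANONICAL = {
    "kidney failure": "chronic kidney disease",
    "renal insufficiency": "chronic kidney disease",
    "ckd": "chronic kidney disease",
    "chronic kidney disease": "chronic kidney disease",
}

def normalize(mention: str) -> str:
    """Return the canonical concept for a raw mention, or the mention itself."""
    return CANONICAL.get(mention.lower().strip(), mention)

print(normalize("CKD"))                  # -> chronic kidney disease
print(normalize("Renal insufficiency"))  # -> chronic kidney disease
```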
Next, there is the problem of ambiguity. The same word can have very different meanings, an issue that is amplified in the case of acronyms, which can stand for a whole range of diseases, treatments, substances, or other concepts. A misunderstanding in this context can be lethal, so text analysis needs to take the surrounding words and phrases into account as well.
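One simple way to picture context-aware disambiguation is to score each candidate expansion of an acronym by how many of its typical cue words appear nearby. The acronym, expansions, and cue words below are made-up assumptions; real systems learn such associations from labeled clinical text.

```python
# Naive sketch: pick the expansion of an ambiguous acronym whose cue words
# overlap most with the surrounding text.
SENSES = {
    "MS": {
        "multiple sclerosis": {"neurology", "lesions", "relapse", "numbness"},
        "mitral stenosis":    {"cardiology", "murmur", "valve", "echocardiogram"},
    },
}

def disambiguate(acronym: str, context: str) -> str:
    words = set(context.lower().split())
    scores = {sense: len(cues & words) for sense, cues in SENSES[acronym].items()}
    return max(scores, key=scores.get)

print(disambiguate("MS", "referred to cardiology after a murmur was heard"))
# -> mitral stenosis
```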
Although some text analysis can be performed entirely by machine learning, the issues above and the health risks involved mean the training data needs to be prepared by an expert to avoid confusion. The labeling will need to be double-checked by specialists to ensure accuracy and relevance.
What Can NLP Do?
There are a few tasks in healthcare for which the text analytics framework is ideally suited:
- Summarizing
Feeding large amounts of data into the system can produce a condensed version of the input; this could be useful, for example, in reviewing clinical notes or academic articles. It is done by breaking the text into small pieces, identifying the fundamental concepts, and presenting the information more concisely.
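As a very rough sketch of the idea, the following extractive summarizer scores each sentence by the frequency of the words it contains and keeps the top-scoring ones. It is an assumption-laden toy, not how production clinical summarizers work.

```python
import re
from collections import Counter

def summarize(text: str, n_sentences: int = 2) -> str:
    """Keep the n highest-scoring sentences, preserving their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    score = lambda s: sum(freq[w] for w in re.findall(r"[a-z']+", s.lower()))
    keep = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in keep)
```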
- Speech recognition
You don’t have to talk only to Siri and Alexa. Soon, AI nurses will be available to take notes right next to the physician during the consultation.
The difference from existing speech-to-text systems will lie precisely in overcoming variability and ambiguity. This way, the system can understand context and take notes correctly, even when the speaker uses abbreviations or phrasing that differs a little from the canonical standard.
- Answering machines
Text analytics is already used in enhancing chatbots to make them capable of replacing call center agents and receptionists. In healthcare, these can be taken to a higher level and even used to replace costly visits to a doctor. For example, by creating an automated patient service system, people can get answers on the spot or be scheduled for a consultation.
These types of answering machines can also be used by doctors to query a patient’s records using natural language. Since the data sources in such a record range from tables to free-text observations, that is, from structured to unstructured data, an NLP system is necessary.
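A toy sketch of what such a query interface does under the hood follows; the patient record and the keyword routing are illustrative assumptions rather than a real question-answering system.

```python
# A patient record mixing structured fields with a free-text note.
record = {
    "allergies": ["penicillin"],                     # structured
    "lab_results": {"creatinine": "1.4 mg/dL"},      # structured
    "notes": "Patient reports occasional dizziness after the new medication.",
}

def answer(question: str) -> str:
    """Route a natural-language question to the relevant part of the record."""
    q = question.lower()
    if "allerg" in q:
        return ", ".join(record["allergies"])
    if "lab" in q or "creatinine" in q:
        return str(record["lab_results"])
    return record["notes"]  # fall back to the unstructured notes

print(answer("Is the patient allergic to anything?"))  # -> penicillin
```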
- Data conversion and mapping
We’ve already established that free-text formats are not the best for analysis and studies. Most of the time, unstructured or semi-structured data needs to be converted into a tabular form, as required by statistics software. An NLP pipeline can take such notes and transform them into a machine-friendly format.
A particular form of such conversion is data mapping. This means taking unstructured elements and placing them in the right fields of a predefined structure, thus ensuring a high degree of data integrity.
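In code, mapping can be pictured as extracting values from a note and slotting them into the fields of a target schema. The note, the patterns, and the schema below are assumptions invented for illustration.

```python
import re

note = "BP 130/85, temperature 37.8 C, patient allergic to penicillin."

# Predefined structure whose fields we want to fill from the free text.
schema = {"systolic_bp": None, "diastolic_bp": None, "temperature_c": None, "allergy": None}

if (bp := re.search(r"BP\s+(\d+)/(\d+)", note)):
    schema["systolic_bp"], schema["diastolic_bp"] = int(bp.group(1)), int(bp.group(2))
if (temp := re.search(r"temperature\s+([\d.]+)\s*C", note, re.IGNORECASE)):
    schema["temperature_c"] = float(temp.group(1))
if (allergy := re.search(r"allergic to (\w+)", note, re.IGNORECASE)):
    schema["allergy"] = allergy.group(1)

print(schema)
# {'systolic_bp': 130, 'diastolic_bp': 85, 'temperature_c': 37.8, 'allergy': 'penicillin'}
```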
- Optical character recognition
There are already numerous text OCR options, which are based on a predefined dictionary. As computer vision evolves, this capability will extend to medical images like X-rays, CAT scans, or even photographs of moles. A system trained to understand images could send that information, already coded, to the patient’s file and other databases as needed, speeding up some processes and helping R&D.
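For plain text documents, off-the-shelf OCR is already a one-liner. The snippet below assumes the open-source Tesseract engine plus the pytesseract and Pillow packages are installed, and the file name is made up; extending this to X-rays or photographs of moles remains a computer-vision problem rather than an OCR one.

```python
from PIL import Image
import pytesseract

# Read the text content of a scanned document (file name is hypothetical).
text = pytesseract.image_to_string(Image.open("scanned_note.png"))
print(text)
```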
Final Thoughts
Medical records paint a vivid portrait of patients, including diverse data sources. Making sense of them requires either a dedicated workforce or smart AI tools able to comb through the noise and extract the essentials related to anatomy, pathology, drugs, allergies, history, and even patient behavior.
Compared to classic medical coding, using NLP for text analysis has the advantages of speed, capturing interdependencies, and making it possible to centralize anonymized data for large collaborative studies.
Written by: Marta Robertson
Marta Robertson has over 7 years of IT experience and technical proficiency as a data analyst in ETL, SQL coding, data modeling, and data warehousing. She has been involved with business requirements analysis, application design, development, testing, documentation, and reporting, including full-lifecycle implementation of data warehouses and data marts in the marketing industry.