- What are the advantages of using NLP to analyze patient narratives?
- What are the potential pitfalls?
- Which elements of ML-based NLP are most likely to affect the performance of a system for analyzing patient narratives?
Patient narratives about experiences with health care contain a wealth of information about what is important to patients. These narratives are valuable for both identifying strengths and weaknesses in health care and developing strategies for improvement. However, rigorous qualitative analysis of the extensive data contained in these narratives is a resource-intensive process, and one that can exceed the capabilities of human analysts. One potential solution to these challenges is natural language processing (NLP), which uses computer algorithms to extract structured meaning from unstructured natural language. Because NLP is a relatively new undertaking in the field of health care, the authors set out to demonstrate its feasibility for organizing and classifying these data in a way that can generate actionable information.
In doing so, the authors focused on two steps that a machine learning (ML) system must perform to classify narratives using codes like those typically applied by human coders (e.g., positive or negative statements regarding care coordination). These steps are (1) numerically representing the text data (in this case, entire narratives as provided by patients) and (2) classifying the data by code based on that representation. The authors also compared four related approaches to deploying ML algorithms, identified potential pitfalls in the processing of data, and showed how NLP can be used to supplement and support human coding.
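The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes a bag-of-words count representation for step 1 and a multinomial naive Bayes classifier with add-one smoothing for step 2, both chosen only because they fit in a few lines of standard-library code. The example narratives, the `positive`/`negative` labels, and the names `tokenize`, `vectorize`, and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase a narrative and split it into words, dropping edge punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def vectorize(text):
    """Step 1: represent a narrative numerically as bag-of-words counts."""
    return Counter(tokenize(text))


class NaiveBayes:
    """Step 2: assign a code from word counts (multinomial naive Bayes)."""

    def fit(self, vectors, labels):
        self.vocab = set()
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for vec, label in zip(vectors, labels):
            self.word_counts[label].update(vec)
            self.vocab.update(vec)
        return self

    def predict(self, vec):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word, count in vec.items():
                if word in self.vocab:  # words never seen in training are skipped
                    seen = self.word_counts[label][word]
                    score += count * math.log((seen + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data standing in for human-coded narratives.
training = [
    ("The nurse and doctor worked together and kept me informed.", "positive"),
    ("My doctor shared my test results with the specialist quickly.", "positive"),
    ("Nobody told the specialist about my test results.", "negative"),
    ("The office lost my referral and nobody called me back.", "negative"),
]

model = NaiveBayes().fit(
    [vectorize(text) for text, _ in training],
    [label for _, label in training],
)

print(model.predict(vectorize("The specialist never received my referral.")))
```

A real system would train on far more coded narratives than this toy set, and the choice of representation and classifier is exactly the kind of design decision the report compares.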
- The success of the fairly simple models described in this pilot study supports the promise of these approaches for analyzing patient narratives at larger scale.
- Combining the strengths of machine and human coders could save labor, potentially in creative ways.
- Coding performance was significant even with relatively off-the-shelf computing equipment and routines and would likely improve with even modest computing investments.
- Perhaps the most obvious opportunity for additional investment is increasing the size of the data set on which to train the models, which the authors expect would improve performance.
- Efficiency may be gained by contracting model building to specialized companies.
- Broad stakeholder discussions could help coordinate use of NLP for patient narratives.
Table of Contents
Background on Machine Learning Approaches to Natural Language Processing
Data Source and Methods
Performance of Demonstration Models