Using Natural Language Processing to Code Patient Experience Narratives
Capabilities and Challenges
Research, published Oct 7, 2020
Given both the value of patient narratives for improving the quality of health care and the challenges of analyzing the data they contain, the authors explore the feasibility of using natural language processing to extract actionable information from these narratives.
Patient narratives about experiences with health care contain a wealth of information about what is important to patients. These narratives are valuable for both identifying strengths and weaknesses in health care and developing strategies for improvement. However, rigorous qualitative analysis of the extensive data contained in these narratives is a resource-intensive process, and one that can exceed the capabilities of human analysts. One potential solution to these challenges is natural language processing (NLP), which uses computer algorithms to extract structured meaning from unstructured natural language. Because NLP is a relatively new undertaking in the field of health care, the authors set out to demonstrate its feasibility for organizing and classifying these data in a way that can generate actionable information.
In doing so, the authors focused on two steps that any machine learning (ML) system must perform to classify narratives using codes like those typically applied by human coders (e.g., positive or negative statements regarding care coordination): (1) numerically representing the text data (in this case, entire narratives as provided by patients) and (2) assigning codes to the data based on that representation. The authors also compared four related approaches to deploying ML algorithms, identified potential pitfalls in data processing, and showed how NLP can be used to supplement and support human coding.
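To make the two steps concrete, the following is a minimal sketch and not the authors' method. It assumes a bag-of-words count as the numerical representation and a multinomial naive Bayes classifier with add-one smoothing; this summary does not say which representations or algorithms the report used. The example narratives, the code labels, and the function names (`represent`, `train`, `classify`) are all hypothetical and invented for illustration.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def represent(text):
    """Step 1 (assumed approach): turn a narrative into bag-of-words counts."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return Counter(w for w in words if w)

def train(examples):
    """Collect per-code word counts from (narrative, code) pairs."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for text, code in examples:
        bow = represent(text)
        word_counts.setdefault(code, Counter()).update(bow)
        doc_counts[code] += 1
        vocab.update(bow)
    return word_counts, doc_counts, vocab

def classify(model, text):
    """Step 2 (assumed approach): pick the code with the highest
    naive Bayes log-probability, using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    bow = represent(text)
    best_code, best_score = None, -math.inf
    for code in sorted(doc_counts):
        total_words = sum(word_counts[code].values())
        score = math.log(doc_counts[code] / total_docs)
        for word, n in bow.items():
            if word not in vocab:
                continue  # ignore words never seen in training
            p = (word_counts[code][word] + 1) / (total_words + len(vocab))
            score += n * math.log(p)
        if score > best_score:
            best_code, best_score = code, score
    return best_code

# Hypothetical, hand-written narratives standing in for human-coded data.
examples = [
    ("The nurse and my doctor shared my records and everyone knew my history",
     "coordination_positive"),
    ("Nobody told the specialist about my test results and I had to repeat everything",
     "coordination_negative"),
    ("The office sent my results to the specialist quickly",
     "coordination_positive"),
    ("My doctor never received the records and nobody called me back",
     "coordination_negative"),
]
model = train(examples)
print(classify(model, "Nobody called me back and I had to repeat every test"))
```

With only four training narratives, this toy model mostly keys on surface vocabulary. A real system would need far more human-coded data and a held-out evaluation against human coders before its output could support or supplement their work.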
The research described in this report was prepared for the Agency for Healthcare Research and Quality (AHRQ) and conducted by RAND Health Care.
This publication is part of the RAND research report series. Research reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND research reports undergo rigorous peer review to ensure high standards for research quality and objectivity.