Using Natural Language Processing to Code Patient Experience Narratives

Capabilities and Challenges

Daniel Ish, Andrew M. Parker, Osonde A. Osoba, Marc N. Elliott, Mark Schlesinger, Ron D. Hays, Rachel Grob, Dale Shaller, Steven C. Martino

Research. Published Oct 7, 2020

Patient narratives about experiences with health care contain a wealth of information about what is important to patients. These narratives are valuable for both identifying strengths and weaknesses in health care and developing strategies for improvement. However, rigorous qualitative analysis of the extensive data contained in these narratives is a resource-intensive process, and one that can exceed the capabilities of human analysts. One potential solution to these challenges is natural language processing (NLP), which uses computer algorithms to extract structured meaning from unstructured natural language. Because NLP is a relatively new undertaking in the field of health care, the authors set out to demonstrate its feasibility for organizing and classifying these data in a way that can generate actionable information.

In doing so, the authors focused on two steps that a machine learning (ML) system must perform to classify narratives with codes like those typically applied by human coders (e.g., positive or negative statements regarding care coordination). These steps are (1) numerically representing the text data (in this case, entire narratives as they are provided by patients) and (2) classifying the data by codes based on that representation. The authors also compared four related approaches to deploying ML algorithms, identified potential pitfalls in the processing of data, and showed how NLP can be used to supplement and support human coding.
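To make the two steps concrete, the sketch below represents each narrative as a vector of word counts (step 1) and then assigns a code with a small multinomial naive Bayes classifier (step 2). It is illustrative only: this summary does not say which representations or classifiers the authors used, so the bag-of-words representation, the naive Bayes model, all function names, and the four toy narratives are assumptions made up for this example.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(narratives):
    """Map every distinct word in the training narratives to a column index."""
    words = sorted({tok for text in narratives for tok in tokenize(text)})
    return {word: i for i, word in enumerate(words)}


def vectorize(text, vocab):
    """Step 1: represent one narrative as a vector of word counts."""
    vec = [0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:  # words unseen in training are ignored
            vec[vocab[tok]] += 1
    return vec


def train_naive_bayes(vectors, labels, alpha=1.0):
    """Step 2a: fit multinomial naive Bayes with add-alpha smoothing."""
    n_features = len(vectors[0])
    model = {}
    for code in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == code]
        totals = [sum(col) for col in zip(*rows)]  # per-word counts for this code
        denom = sum(totals) + alpha * n_features
        model[code] = {
            "log_prior": math.log(len(rows) / len(vectors)),
            "log_likelihood": [math.log((t + alpha) / denom) for t in totals],
        }
    return model


def predict(model, vector):
    """Step 2b: return the code with the highest log-probability score."""
    def score(code):
        params = model[code]
        return params["log_prior"] + sum(
            count * ll for count, ll in zip(vector, params["log_likelihood"])
        )
    return max(model, key=score)


# Hypothetical narratives and codes, invented for illustration only.
train_texts = [
    "The nurse and doctor talked to each other and shared my test results",
    "My doctor coordinated with the specialist and everyone knew my history",
    "Nobody told the specialist about my tests and I had to repeat everything",
    "The office lost my referral and the doctors never talked to each other",
]
train_codes = ["positive", "positive", "negative", "negative"]

vocab = build_vocabulary(train_texts)
model = train_naive_bayes([vectorize(t, vocab) for t in train_texts], train_codes)

new_narrative = "The specialist never got my referral"
print(predict(model, vectorize(new_narrative, vocab)))
```

With only four training narratives, this toy model says nothing about real-world accuracy; it shows only how a numeric representation feeds a classifier. A production system would typically rely on an established ML library and a much larger set of human-coded narratives.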

Key Findings

  • The success of the fairly simple models described in this pilot study supports the promise of these approaches for analyzing patient narratives at larger scale.
  • There is labor-saving potential in leveraging the strengths of both machine and human coders, potentially in creative ways.
  • Coding performance was significant even with relatively off-the-shelf computing equipment and routines and would likely improve with even modest computing investments.
  • Perhaps the most obvious opportunity for additional investment is increasing the size of the data set on which to train the models, which the authors expect would improve performance.
  • Efficiency may be gained by contracting model building to specialized companies.
  • Broad stakeholder discussions could help coordinate use of NLP for patient narratives.

Document Details

Citation

RAND Style Manual
Ish, Daniel, Andrew M. Parker, Osonde A. Osoba, Marc N. Elliott, Mark Schlesinger, Ron D. Hays, Rachel Grob, Dale Shaller, and Steven C. Martino, Using Natural Language Processing to Code Patient Experience Narratives: Capabilities and Challenges, RAND Corporation, RR-A628-1, 2020. As of October 10, 2024: https://www.rand.org/pubs/research_reports/RRA628-1.html
Chicago Manual of Style
Ish, Daniel, Andrew M. Parker, Osonde A. Osoba, Marc N. Elliott, Mark Schlesinger, Ron D. Hays, Rachel Grob, Dale Shaller, and Steven C. Martino, Using Natural Language Processing to Code Patient Experience Narratives: Capabilities and Challenges. Santa Monica, CA: RAND Corporation, 2020. https://www.rand.org/pubs/research_reports/RRA628-1.html.

Research conducted by

The research described in this report was prepared for the Agency for Healthcare Research and Quality (AHRQ) and conducted by RAND Health Care.

This publication is part of the RAND research report series. Research reports present research findings and objective analysis that address the challenges facing the public and private sectors. All RAND research reports undergo rigorous peer review to ensure high standards for research quality and objectivity.

This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.

RAND is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.