The Improving Medicare Post-Acute Care Transformation (IMPACT) Act of 2014 aims to improve post-acute care (PAC) reporting and services by requiring collection, transmission, and reporting of standardized patient/resident assessment data across the four PAC settings—Home Health Agencies (HHAs), Inpatient Rehabilitation Facilities (IRFs), Long-Term Care Hospitals (LTCHs), and Skilled Nursing Facilities (SNFs). The Centers for Medicare & Medicaid Services (CMS) contracted with the RAND Corporation to identify and/or develop standardized items to include in the PAC patient assessment instruments. RAND was tasked by CMS with developing and testing items within seven areas of focus that fall under the assessment categories and domains delineated in the IMPACT Act: (1) vision and hearing; (2) cognitive status; (3) depressed mood; (4) pain; (5) care preferences; (6) medication reconciliation; and (7) bladder and bowel continence.
This article presents results of Alpha 1, the first feasibility test of a proposed set of items for assessing each of these focus areas. Conducted between August and October 2016, Alpha 1 is one of two Alpha tests that will be completed by mid-2017 to assess the feasibility of proposed items. The results of these small-scale feasibility tests will be combined with ongoing stakeholder feedback to inform a national Beta test designed to determine how well the items perform when implemented across PAC settings.
For this test, the Alpha 1 instrument was completed with 133 patients: 37 receiving services in an HHA setting, 34 in an IRF, 31 in an LTCH, and 31 in an SNF. All patients were assessed twice (once by a facility nurse and once by a research nurse) so that agreement between these two assessors, or interrater reliability (IRR), could be determined.
We briefly summarize the metrics used to assess items in the Alpha 1 test before presenting results for each content area. All items were assessed on at least two metrics: IRR and feasibility. IRR was calculated using Cohen's kappa for categorical variables or weighted kappa for ordinal variables, as appropriate. Feasibility was assessed by the time required to complete the items for each content area and by qualitative feedback from assessors on the clarity of item instructions and the difficulties they encountered with item administration. Some content areas had additional assessment goals, such as assessors' fidelity in skipping certain items as required by the instructions, or the sources of data that assessors used to complete the items.
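To make the IRR metric concrete, the following is a minimal sketch of how Cohen's kappa and a weighted kappa can be computed for paired ratings. It is an illustration under stated assumptions, not the study's analysis code: the function name `cohen_kappa`, the sample ratings, and the choice of linear weights are all hypothetical (the text above does not say which weighting scheme was used).

```python
def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's kappa for two raters scoring the same patients.

    categories: all possible scores, in order for ordinal items.
    weights: None for unweighted kappa, or "linear" for a linearly
             weighted kappa (an assumed scheme, chosen for illustration).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions: cell [i][j] is the share of patients
    # scored i by rater A and j by rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Unweighted: any mismatch counts fully.
        # Linear: mismatches count in proportion to their distance.
        if weights == "linear":
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        # Both raters used one and the same category for every patient,
        # so chance-corrected agreement is undefined.
        return float("nan")
    return 1.0 - obs / exp


# Hypothetical paired ratings (facility nurse vs. research nurse).
facility = ["yes", "yes", "no", "no"]
research = ["yes", "no", "no", "no"]
print(cohen_kappa(facility, research, ["yes", "no"]))

# Hypothetical ordinal item scored 0-2 by both assessors.
facility_ord = [0, 1, 2, 2]
research_ord = [0, 1, 2, 1]
print(cohen_kappa(facility_ord, research_ord, [0, 1, 2], weights="linear"))
```

Kappa corrects raw agreement for the agreement expected by chance from each assessor's marginal rating frequencies. The weighted variant gives partial credit when two assessors choose adjacent levels of an ordinal scale, so it is generally higher than the unweighted value when disagreements are small.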
Results of the Alpha 1 Test
The seven content areas assessed in the Alpha 1 test can be divided into three categories:
- The first category, which included depressed mood, pain, and bladder and bowel continence, worked nearly perfectly: The items were both reliable and feasible to implement. They will require little if any modification and can proceed to Beta testing in their current form.
- The second category—for vision and hearing, cognitive status, and care preferences—evidenced minor issues with reliability or feasibility. Some of the items in these content areas will require minor adjustments, and they may need to be retested in the Alpha 2 phase to prepare them for Beta testing.
- The third category, which consisted of medication reconciliation, was being tested for the first time. Not surprisingly, the medication reconciliation items demonstrated the most limitations with respect to reliability and feasibility. This content area will require considerable refinement and retesting in the Alpha 2 phase to prepare for Beta testing.
Category 1: Depressed Mood, Pain, and Bladder and Bowel Continence
IRR. We assessed IRR for this group of items using Cohen's kappa for categorical variables and weighted kappa for ordinal variables. IRR on mood and pain items was always above 0.9 and frequently perfect. Bladder and bowel continence items also generally achieved high IRR, although some of these items (e.g., catheter use) are not applicable to a majority of patients. Rates of bowel and bladder continence issues varied considerably across care settings.
Feasibility. The items in this group were highly feasible to administer and took minimal time to complete. The mood items took between one and three minutes to administer if only the Patient Health Questionnaire (PHQ)-2 was needed, and up to six minutes if the more comprehensive PHQ-9 was also needed. The pain items took approximately one minute to administer when there was no pain, and approximately three minutes to administer when there was pain. The bladder and bowel items took less than two minutes for interview items; the time to complete noninterview items varied widely across care settings.
Assessor feedback. Assessors suggested minor improvements in these content areas. They wanted additional training about how to score the PHQ-2 and the PHQ-9. Assessors reported that the bladder and bowel continence items were straightforward and easy to use but suggested ways to improve them. For example, some assessors noted that the word "continence" can be hard for some patients to understand and favored using a more easily understood term (e.g., "accidents" or "leaking") when discussing bladder and bowel items with patients/residents. Assessors also asked for clarification on how to score bladder and bowel continence items when family and patient accounts do not match.
The items from these three content areas have been used in PAC settings for many years, and most have undergone extensive prior testing, validation, and use. They will require few, if any, modifications in future rounds of testing.
Category 2: Vision and Hearing, Cognitive Status, and Care Preferences
IRR. The results for vision and hearing indicate moderate to almost perfect agreement between assessors as measured by IRR. Highest reliability was recorded for use versus nonuse of a hearing aid (Cohen's kappa = 0.92); reliability was somewhat lower for use versus nonuse of glasses (0.69). Cognitive status items generally showed excellent IRR, and assessors came to very similar decisions about whether to skip to the next item in the instrument. The care preferences section had perfect or near-perfect IRR on all items.
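Descriptors such as "moderate" and "almost perfect" match the widely cited Landis and Koch labels for kappa. As a rough guide only, and assuming those conventional bands apply here (the text above does not state which thresholds were used), the mapping can be sketched as follows; the function name `agreement_label` is hypothetical.

```python
def agreement_label(kappa):
    """Map a kappa value to the conventional Landis and Koch descriptor.

    The cut points are the commonly cited ones and are an assumption,
    not thresholds taken from the Alpha 1 report.
    """
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# The two kappas reported for hearing aid and glasses use.
for value in (0.92, 0.69):
    print(value, agreement_label(value))
```

Under these assumed bands, the hearing aid item (0.92) falls in the "almost perfect" range and the glasses item (0.69) in the "substantial" range.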
Feasibility. Assessors reported that the items were easy to administer. The hearing and vision items each took about one minute to complete. Completing the cognitive section took approximately four minutes, and administering the care preferences items took six to seven minutes.
Assessor feedback. Assessors felt that both the item wording and the scoring criteria for some vision and hearing items needed clarification. There was some confusion about when during the assessment patients/residents should be asked the date of their most recent vision or hearing test, which suggests a need to clarify the instructions, user manual, and training materials for these items.
Assessors noted some challenges in administering cognitive status items and asked for additional clarification about how to administer and score the Trail Making Task, the Complex Sentence Repetition Task, and the Serial 7's Task.
Assessors had two specific suggestions for improving the care preferences section. First, they requested wording changes to Item A4b (regarding who should be involved in making health care decisions for the patient) to make it easier for patients to understand. Second, for Items A3a–A3d (“How important is it to you to be physically active…to be mentally or intellectually involved…to be emotionally healthy…to be socially involved?”), assessors recommended collapsing the levels “not very important” and “not very important at all” because patients had difficulty distinguishing between these shades of meaning.
Category 3: Medication Reconciliation
Items in the third group, which focused solely on medication reconciliation, were being tested for the first time.
IRR. The medication reconciliation items had by far the lowest IRR of any content area tested. Kappas for paired assessments were below 0.3 on many items.
Feasibility. Assessors found these items challenging, and the items took substantially longer to complete than those in other content areas: about 15 to 20 minutes, on average.
Assessor feedback. Assessors commented that the medication reconciliation items, more than any other area, required a “learning curve.” They found that their ability to locate the necessary information relatively quickly and accurately improved over time. Assessors identified several opportunities for improvement, including unclear instructions for Item B1 (“Did the post-acute care provider obtain lists of current medications from more than one information source?”), the compound structure of Item B6 (“Did the post-acute care provider address all high-risk discrepancies or potential adverse drug events within…”), and insufficient information in the instructions to determine which medications would be considered high-risk in some cases. Part of the challenge for this content area was the need to look for information across multiple data sources. The data sources that were used to answer the questions varied considerably across patients, assessors, and care settings.
The items in medication reconciliation will require the most refinement in future testing. We will revise the items using the quantitative results and qualitative feedback described in this article. Specifically, revised items will focus on discrepancies between medication lists rather than potential adverse drug events, which Alpha testing found were difficult to identify without clinical judgment. Information about specific drug classes will be sought as well. As assessors gain more experience completing these items, it is likely that the time needed to complete them will decrease somewhat, and the accuracy (and therefore the IRR) may improve. We have also reduced the number of items in this content area, further reducing administration time.
In sum, the medication reconciliation items have been refined and recast into 12 patient-focused assessment items that will be retested.
The Alpha 1 testing phase was successfully completed, in that all items were pilot tested among 133 patients. Items from all content areas were assessed on IRR and feasibility; items from some content areas were also assessed on other metrics. Where necessary, items have now been revised based on the findings of the Alpha 1 test. Alpha 2 testing is under way with the revised items.