Research Brief

Abstract

The Air Force relies on the Air Force Officer Qualifying Test (AFOQT) as part of its officer selection process. Despite concerns about the test, RAND's survey of existing literature concludes that the AFOQT is a good selection test that predicts important Air Force outcomes and is not biased against minorities or women. The Air Force would not benefit by replacing the AFOQT with the SAT, although other valid selection tools could be used to complement the AFOQT.

The Air Force has long recognized the importance of selecting the most qualified officers possible. In that spirit, the Air Force has relied on the Air Force Officer Qualifying Test (AFOQT) as one measure of those qualifications for more than 60 years.

Nevertheless, a variety of concerns have been raised about whether the test is fair, whether it is biased against minorities or women, whether it is too expensive, and whether it actually predicts anything important to the Air Force. Some have even suggested replacing the AFOQT with another test, such as the Scholastic Aptitude Test (SAT), in the hope that this step would lead to a more diverse Air Force population and would save administrative costs.

There is a body of scholarly and technical literature concerning the use of aptitude tests for academic and professional selections, but this work is rarely directed toward military policymakers. The Air Force asked RAND Project AIR FORCE (PAF) to review existing knowledge about the AFOQT and other selection tests and to examine the implications for the future of the AFOQT. PAF researchers drew the following major conclusions:

The AFOQT Is a Valuable and Useful Test

PAF's survey of the literature on the AFOQT and on aptitude testing in general suggests the following:

  • The AFOQT is a valid predictor of important outcomes (such as training success) across a variety of jobs. Studies show a statistically significant correlation between test scores and training performance.
  • On average, women and minorities tend to score lower on the AFOQT, resulting in lower selection rates for minorities and women. Thus, use of the AFOQT tends to reduce diversity within the Air Force. However, this finding is not unique to the AFOQT; research on other valid measures of aptitude shows similar race and gender differences in test scores.
  • As an accurate predictor of training performance, the AFOQT is not biased or discriminatory under the law. Bias occurs when one group's test scores predict performance differently than another's scores do. For example, a test is biased against a group if members of that group later show better job performance than members of other groups with the same score. Studies of the AFOQT show that the test is not biased against minorities or women. If anything, the test is actually biased slightly in their favor.
  • The cost of maintaining and refining the AFOQT is a potential drawback to continuing it. However, this cost is not prohibitive; one estimate for test development costs is $2 million every eight years, or about $250,000 per year. While this is not a paltry sum, it is small compared with the cost of Air Force personnel initiatives in general.
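
The definition of bias in the list above can be made concrete with a small sketch. It assumes one common way of checking for differential prediction: fit a single regression line of performance on test score for all candidates, then compare each group's average prediction error. The data, the group labels "A" and "B", and the function names are invented for illustration; they are not AFOQT results and not necessarily the method used in the studies PAF reviewed.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def mean_residual_by_group(scores, performance, groups):
    """Average (actual - predicted) performance per group under one common line.

    Positive: the common line underpredicts the group (test biased against it).
    Negative: the common line overpredicts the group (test biased in its favor).
    """
    intercept, slope = fit_line(scores, performance)
    residuals = {}
    for s, p, g in zip(scores, performance, groups):
        residuals.setdefault(g, []).append(p - (intercept + slope * s))
    return {g: mean(r) for g, r in residuals.items()}


# Hypothetical data: group B scores lower on average and, at the same
# score, also performs slightly lower than group A.
scores = [40, 50, 60, 70, 30, 40, 50, 60]
performance = [30, 35, 40, 45, 23, 28, 33, 38]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

result = mean_residual_by_group(scores, performance, groups)
for group, residual in sorted(result.items()):
    print(f"group {group}: mean residual {residual:+.2f}")
```

With these made-up numbers, group B's mean residual is negative, so the common line overpredicts that group's performance. Under the definition above, that is a test that leans slightly in the group's favor, not against it, even though the group's average score is lower.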

The SAT Is Not an Ideal Replacement for the AFOQT

The SAT may be a valid replacement for the verbal and quantitative portions of the AFOQT, but there are several reasons not to use it:

  1. The predictive power of an SAT taken before entering the Air Force Academy, or a college or university with a Reserve Officer Training Corps (ROTC) program, is not likely to be as large as that of a test taken just prior to officer training.
  2. The SAT does not test for skills such as instrument reading and aviation knowledge, which are captured in the AFOQT. The Air Force would still need to develop and administer tests for these skills for the purpose of selecting pilots and combat system operators, thus diminishing the potential cost savings of using the SAT.
  3. The Air Force lacks control over future changes to SAT content, which are driven by the needs of educational institutions rather than those of the workplace.
  4. On average, women and minorities tend to score lower on the SAT, just as they do on the AFOQT. Therefore, the SAT (or similar aptitude measures) would be no better than the AFOQT at increasing diversity in the Air Force population.

Other Ways to Improve Prediction and Diversity Are Available

Aptitude is one of the most powerful predictors of later performance and hence one of the most useful; therefore, retaining some aptitude measure is essential. The most feasible and potentially least expensive way to increase diversity while retaining high validity in the selection system is to use aptitude measures along with additional measures, such as personality, that predict performance but do not show group differences.
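
A rough sketch of why adding such a measure can help, using the standard formulas for an equally weighted sum of two standardized predictors. All input values are hypothetical, chosen only to show the direction of the effect; they are not estimates for the AFOQT or for any personality measure, and the variable names are illustrative.

```python
from math import sqrt


def composite_validity(r1, r2, r12):
    """Correlation with performance of the unit-weighted sum of two
    standardized predictors with validities r1, r2 and intercorrelation r12."""
    return (r1 + r2) / sqrt(2 + 2 * r12)


def composite_group_difference(d1, d2, r12):
    """Standardized mean group difference on the same composite, given
    each predictor's standardized difference d1, d2."""
    return (d1 + d2) / sqrt(2 + 2 * r12)


# Hypothetical inputs (assumptions, not findings from the brief):
aptitude_validity, aptitude_gap = 0.30, 1.0        # predicts well, shows a group difference
personality_validity, personality_gap = 0.20, 0.0  # predicts somewhat, no group difference
intercorrelation = 0.0                             # the two measures are unrelated

validity = composite_validity(aptitude_validity, personality_validity, intercorrelation)
gap = composite_group_difference(aptitude_gap, personality_gap, intercorrelation)

print(f"aptitude alone: validity {aptitude_validity:.2f}, group difference {aptitude_gap:.2f}")
print(f"composite:      validity {validity:.2f}, group difference {gap:.2f}")
```

Under these assumed values the composite predicts performance somewhat better than aptitude alone while showing a smaller group difference. The size of both effects depends entirely on the real validities and intercorrelation, which would have to be measured.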

Policy Recommendations

Based on these conclusions, PAF recommends that the Air Force consider the following steps:

Use the AFOQT to its fullest and pursue other options for increasing diversity. Increasing officer diversity should continue to be a valued goal for the Air Force, but it should not come at the expense of selecting qualified candidates. Because the AFOQT is a valid predictor of success in Air Force jobs, it should continue to be used for selecting officers and candidates for aircrew jobs. Efforts to increase officer diversity should be directed at recruiting better-qualified minority and female candidates, not at eliminating a useful and valuable selection test. Replacing a valid and powerful predictor, such as the AFOQT, with a less-valid predictor to improve diversity is neither a necessary nor an acceptable alternative. Instead, valid measures that do not show group differences should be investigated to supplement the AFOQT.

Validate the entire officer and aircrew selection system. The AFOQT is just one piece of the overall officer and aircrew selection system. It is important to ensure that the tools used in addition to the AFOQT are valid predictors of success in Air Force jobs. Some of the other selection tools currently used by the accession sources (e.g., interviews and Relative Standing Scores) may not have been validated. Moreover, to achieve the goal of selecting the most-qualified applicants, the selection system as a whole should be examined for potential bias and should be validated.

Identify new selection tools to supplement the validity of the overall selection system. New selection tools (such as personality measures, biodata measures, and structured interviews) could be added to the selection system to improve accuracy and possibly produce marginal increases in diversity. Research studies on such experimental measures should be conducted to examine their usefulness in the Air Force context and to identify any possible adverse impact.

This report is part of the RAND Corporation research brief series. RAND research briefs present policy-oriented summaries of individual published, peer-reviewed documents or of a body of published work.

Permission is given to duplicate this electronic document for personal use only, as long as it is unaltered and complete. Copies may not be duplicated for commercial purposes. Unauthorized posting of RAND PDFs to a non-RAND Web site is prohibited. RAND PDFs are protected under copyright law. For information on reprint and linking permissions, please visit the RAND Permissions page.

The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.