The Air Force Officer Qualifying Test: Validity, Fairness, and Bias
May 10, 2010
The Air Force relies on the Air Force Officer Qualifying Test (AFOQT) as part of its officer selection process. Despite concerns about the test, RAND's survey of existing literature concludes that the AFOQT is a good selection test that predicts important Air Force outcomes and is not biased against minorities or women. The Air Force would not benefit from replacing the AFOQT with the SAT, although other valid selection tools could be used to complement the AFOQT.
The Air Force has long recognized the importance of selecting the most qualified officers possible. In that spirit, the Air Force has relied on the Air Force Officer Qualifying Test (AFOQT) as one measure of those qualifications for more than 60 years.
Nevertheless, a variety of concerns have been raised about whether the test is fair, whether it is biased against minorities or women, whether it is too expensive, and whether it actually predicts anything important to the Air Force. Some have even suggested replacing the AFOQT with another test, such as the Scholastic Aptitude Test (SAT), in the hope that this step would lead to a more diverse Air Force population and would save administrative costs.
There is a body of scholarly and technical literature concerning the use of aptitude tests for academic and professional selections, but this work is rarely directed toward military policymakers. The Air Force asked RAND Project AIR FORCE (PAF) to review existing knowledge about the AFOQT and other selection tests and to examine the implications for the future of the AFOQT. PAF researchers drew the following major conclusions:
PAF's survey of literature about AFOQT and aptitude testing in general suggests the following:
The SAT may be a valid replacement for the verbal and quantitative portions of the AFOQT, but there are several reasons not to use it:
Aptitude is one of the most powerful predictors of later performance and hence one of the most useful; therefore, retaining some aptitude measure is essential. The most feasible and potentially least expensive way to increase diversity while retaining high validity in the selection system is to use aptitude measures along with additional measures, such as personality, that predict performance but do not show group differences.
Based on these conclusions, PAF recommends that the Air Force consider the following steps:
Use the AFOQT to its fullest and pursue other options for increasing diversity. Increasing officer diversity should continue to be a valued goal for the Air Force, but it should not come at the expense of selecting qualified candidates. Because the AFOQT is a valid predictor of success in Air Force jobs, it should continue to be used for selecting officers and candidates for aircrew jobs. Efforts to increase officer diversity should be directed at recruiting better-qualified minority and female candidates, not at eliminating a useful and valuable selection test. Replacing a valid and powerful predictor, such as the AFOQT, with a less valid predictor to improve diversity is neither a necessary nor an acceptable alternative. Instead, valid measures that do not show group differences should be investigated to supplement the AFOQT.
Validate the entire officer and aircrew selection system. The AFOQT is just one piece of the overall officer and aircrew selection system. It is important to ensure that the tools used in addition to the AFOQT are valid predictors of success in Air Force jobs. Some of the other selection tools currently used by the accession sources (e.g., interviews and Relative Standing Scores) may not have been validated. Moreover, to achieve the goal of selecting the most qualified applicants, the selection system as a whole should be validated and examined for potential bias.
Identify new selection tools to supplement the validity of the overall selection system. New selection tools (such as personality measures, biodata measures, and structured interviews) could be added to the selection system to improve accuracy and possibly produce marginal increases in diversity. Research studies on such experimental measures should be conducted to examine their usefulness in the Air Force context and to identify any possible adverse impact.
This report is part of the RAND Corporation Research brief series. RAND research briefs present policy-oriented summaries of individual published, peer-reviewed documents or of a body of published work.
This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited; linking directly to this product page is encouraged. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial purposes. For information on reprint and reuse permissions, please visit www.rand.org/pubs/permissions.
The RAND Corporation is a nonprofit institution that helps improve policy and decisionmaking through research and analysis. RAND's publications do not necessarily reflect the opinions of its research clients and sponsors.