Most if not all simulations are driven by human judgments: judgments about when and what decisions people would make, how well people and machines would perform tasks, and even the outcomes of purely physical interactions, to name a few. These judgments are incorporated into a simulation (often so deeply and subtly embedded as to obscure their origin) and significantly affect the simulation's outcomes. Thus, the value of a simulation in support of training or analysis depends in large part on the validity of its human judgment inputs. This paper discusses criteria for validating human judgments. In particular, it describes how commonly used procedures yield only "face" validity and contrasts them with modern measurement procedures that test explanatory theories of the causal factors underlying judgments. It explains why face validity alone is an unacceptable criterion and demonstrates how interpretations of judgments can be tested. It further proposes that such interpretations undergo validity tests before being incorporated into simulations. Finally, it uses an operational example to describe how modern measurement is applied to model soldiers' firing decision making and how the resulting models are incorporated in a combat simulation to represent engagement decisions by computer-generated forces.