Words of estimative probability

Words of estimative probability (WEP or WEPs) are terms used by intelligence analysts in analytic reports to convey the likelihood of a future event occurring. They express the extent of the analyst's confidence in the finding. WEP usage is not standardized across the U.S. Intelligence Community (IC). Some intelligence and policy failures appear to be related to an imprecise use of estimative words.

A well-chosen WEP gives a decision maker a clear and unambiguous estimate upon which to base a decision. Ineffective WEPs are vague or misleading about the likelihood of an event. An ineffective WEP places the decision maker in the role of the analyst. The decision maker has to infer the prediction alone, thus increasing the likelihood of poor decision making or snap decision making.

Intelligence
In 1964 Sherman Kent, one of the first contributors to a formal discipline of intelligence analysis, addressed the problem of misleading expressions of odds in National Intelligence Estimates (NIE). In Words of Estimative Probability, Kent distinguished ‘poets’ (those preferring wordy probabilistic statements) from ‘mathematicians’ (those preferring quantitative odds). To bridge the gap between them and decision makers, Kent developed a paradigm relating estimative terms to odds. His goal was to "... set forth the community's findings in such a way as to make clear to the reader what is certain knowledge and what is reasoned judgment, and within this large realm of judgment what varying degrees of certitude lie behind each key judgment." Kent’s initiative was not adopted, although the idea was well received and remains compelling today.

Policy and intelligence failures related to WEPs
An example of the damage that missing or vague WEPs can do is found in the President's Daily Brief (PDB) entitled Bin Laden Determined to Strike in US. The PDB is arguably the pinnacle of concise, relevant, actionable analytic writing in the IC. It is intended to keep the President informed on a wide range of issues; the best analysts write it and senior leaders review it. This brief, the “August 6 PDB,” is at the center of much controversy for the IC. The August 6 PDB not only began with a vague warning in its title but also continued with vague warnings:


 * “Bin Ladin since 1997 has wanted to conduct terrorist attacks in the US” (CIA, 2001, para. 1);
 * “Bin Ladin implied...that his followers would ‘bring the fighting to America’” (CIA, 2001, para. 1);
 * Bin Ladin’s “attacks against...US embassies...in 1998 demonstrate that he prepares operations years in advance and is not deterred by setbacks” (CIA, 2001, para. 6);
 * “FBI information...indicates patterns of suspicious activity in this country consistent with preparations for hijackings or other types of attacks” (CIA, 2001, para. 10);
 * “a call to [the US] Embassy in the UAE in May [said] that a group of Bin Ladin supporters was in the US planning attacks with explosives” (CIA, 2001, para. 11).

The PDB described Bin Laden’s previous activities. It did not present the President with a critically important, clear estimate of Bin Laden’s likely activities in the coming months:

Bush had specifically asked for an intelligence analysis of possible al Qaeda attacks within the United States, because most of the information presented to him over the summer about al Qaeda focused on threats against U.S. targets overseas, sources said. But one source said the White House was disappointed because the analysis lacked focus and did not present fresh intelligence.

The lack of appropriate WEPs would lead to confusion about the likelihood of an attack and to guessing about the time period in which it was likely to occur.

The language used in the memo lacks words of estimative probability that reduce uncertainty, thus preventing the President and his decision makers from implementing measures directed at stopping al Qaeda’s actions.

The 9/11 Commission and the Iraq Intelligence Commission, themselves consequences of the 9/11 and Iraq/WMD intelligence failures, were the movers of structural reform of the Intelligence Community. Although these reforms were intended to improve the functioning of the IC, particularly with respect to inter-agency cooperation and information sharing, they paid scant attention to improving the quality of analyses and intelligence writing. There is a pervasive feeling that this improvement is needed. However, there seems to be resistance in the IC, due in part to habit and in part to the reality of politics and the understandable preference for the ‘plausible deniability’ that less precise jargon offers.

Medicine
Physicians and clinical scientists face a very similar problem in obtaining informed consent from patients, where words such as "rare" or "infrequent" do have actual probabilities defined. Numerical information of this type, however, is rarely presented to patients.

A representative guide for obtaining informed consent, whether from people participating in social science or behavioral research or from patients being told of the potential risks of a medical procedure, suggests giving typical numerical chances of an adverse event when words of estimative probability are first used.

The guideline continues: "For studies involving investigational agents, or experimental doses or combinations of drugs and/or treatments, subjects should be warned that there may be as yet unknown risks associated with the drug/treatment but that they will be advised if any new information becomes available that may affect their desire to participate in the study."

Reforms to methodology
Estimative statements can be improved in four ways:


 * Adding quantitative source reliability and confidence measures to estimative statements
 * Complementing estimative statements with stochastic analyses
 * Standardizing WEPs
 * Standardizing WEPs and complementing estimative statements with ratings of source reliability and analytic confidence

Quantification of source reliability and analytic confidence
Michael Schrage, an advisor to the Massachusetts Institute of Technology’s (MIT) Security Studies Program, wrote in a Washington Post editorial that requiring analysts to produce and include quantitative measures of source reliability and confidence along with their findings would reduce ambiguity. Yet Schrage also reported that former Interim Director of Central Intelligence John E. McLaughlin attempted to enact this at the CIA but, like Kent’s initiative, it was not adopted: "Former acting CIA director and longtime analyst John McLaughlin tried to promote greater internal efforts at assigning probabilities to intelligence assessments during the 1990s, but they never took. Intelligence analysts 'would rather use words than numbers to describe how confident we are in our analysis,' a senior CIA officer who's served for more than 20 years told me. Moreover, 'most consumers of intelligence aren't particularly sophisticated when it comes to probabilistic analysis. They like words and pictures, too. My experience is that [they] prefer briefings that don't center on numerical calculation. That's not to say we can't do it, but there's really not that much demand for it.'"

Stochastic analysis
Since combining quantitative, probabilistic information with estimates is successful in business forecasting, marketing, medicine, and epidemiology, it should be implemented by the intelligence community as well. These fields have used probability theory and Bayesian analysis as forecasting tools. Probability theory and other stochastic methods are appealing because they rely on rationality and mathematical rigor, are less subject to analytical bias, and produce findings that appear to be unambiguous. On the other hand, it is indisputable that few analysts or intelligence consumers have the stomach for numerical calculation. Additionally, Bruce Blair, Director of the Center for Defense Information and a proponent of quantitative methods for the IC, points out that intelligence information from secret sources is often murky, and that the application of advanced math is not sufficient to make it more reliable. However, he sees a place for stochastic analyses over a very long period; it “points to a fairly slow learning curve that also challenges the wisdom of making preemption a cornerstone of U.S. security strategy.” The reservations stated are significant: mathematical and statistical analyses require a lot of work without rapid and necessarily commensurate gains in accuracy, speed or comprehension.
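As a rough illustration of the kind of Bayesian analysis mentioned above, the following minimal sketch applies Bayes' rule to revise an estimate as reports arrive. The function name and all numbers are hypothetical and chosen only for the example; nothing here represents a method actually used by the IC.

```python
def bayes_update(prior: float, p_report_if_true: float, p_report_if_false: float) -> float:
    """Return P(hypothesis | report) using Bayes' rule.

    prior              -- probability of the hypothesis before the report
    p_report_if_true   -- chance of seeing the report if the hypothesis is true
    p_report_if_false  -- chance of seeing the report if the hypothesis is false
    """
    for value in (prior, p_report_if_true, p_report_if_false):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    numerator = p_report_if_true * prior
    denominator = numerator + p_report_if_false * (1.0 - prior)
    if denominator == 0.0:
        raise ValueError("the report is impossible under both hypotheses")
    return numerator / denominator


# Hypothetical numbers: a 10% prior, and a source whose reports are four
# times more likely when the hypothesis is true than when it is false.
after_one_report = bayes_update(0.10, 0.8, 0.2)
after_two_reports = bayes_update(after_one_report, 0.8, 0.2)
print(round(after_one_report, 3), round(after_two_reports, 3))
```

With these assumed inputs, two independent corroborating reports move the estimate from "unlikely" territory to better than even, which hints at the slow learning curve Blair describes when sources are only moderately reliable.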

Standardization
The National Intelligence Council’s recommendations described the use of a WEP paradigm (Table 2) in combination with an assessment of confidence levels (“High, moderate, low”) based on the scope and quality of supporting information:

However, the NIC’s own discussion of this paradigm seems to undercut its chances of being effective:

Intelligence judgments pertaining to likelihood are intended to reflect the Community’s sense of the probability of a development or event. [...] We do not intend the term “unlikely” to imply an event will not happen. We use “probably” and “likely” to indicate there is a greater than even chance. We use words such as “we cannot dismiss,” “we cannot rule out,” and “we cannot discount” to reflect an unlikely—or even remote—event whose consequences are such it warrants mentioning. Words such as “may be” and “suggest” are used to reflect situations in which we are unable to assess the likelihood generally because relevant information is nonexistent, sketchy, or fragmented.

This explanation is ‘a half-step forward, half-step backward.’ An agency-sponsored WEP paradigm is progress. However, an estimative statement that uses “may be”, “suggest”, or other weasel words is vague and symptomatic of the problem at hand, not its solution. In 1964 Kent railed against the “resort to expressions of avoidance...which convey a definite meaning but at the same time either absolves us completely of the responsibility or makes the estimate enough removed ... as not to implicate ourselves.”
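The usage the NIC describes can be sketched as a small lookup that sorts a statement's estimative phrases into the three categories named in the quoted passage. This is only an illustrative sketch: the category names, the `classify` function, and the sample sentences are assumptions, and the phrase list contains only the terms quoted above, not the full NIC paradigm.

```python
import re
from enum import Enum


class Likelihood(Enum):
    GREATER_THAN_EVEN = "greater than even chance"
    UNLIKELY_BUT_CONSEQUENTIAL = "unlikely or remote, but worth mentioning"
    NOT_ASSESSED = "likelihood cannot be assessed"


# Only the phrases quoted in the NIC passage; word boundaries keep
# "likely" from matching inside "unlikely".
NIC_PATTERNS = {
    Likelihood.GREATER_THAN_EVEN: [r"\bprobably\b", r"\blikely\b"],
    Likelihood.UNLIKELY_BUT_CONSEQUENTIAL: [
        r"\bwe cannot dismiss\b",
        r"\bwe cannot rule out\b",
        r"\bwe cannot discount\b",
    ],
    Likelihood.NOT_ASSESSED: [r"\bmay be\b", r"\bsuggests?\b"],
}


def classify(statement: str) -> list:
    """Return every category whose phrases appear in the statement."""
    text = statement.lower()
    return [
        category
        for category, patterns in NIC_PATTERNS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]


# Hypothetical sentences, for illustration only.
print(classify("The group will probably relocate."))
print(classify("Reporting suggests the facility may be active."))
```

A statement that lands in the last category carries no likelihood at all, which is the weakness the commentary above points to.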

Mercyhurst experience with standardized WEPs
The Mercyhurst College Institute for Intelligence Studies conducted several experiments investigating the IC’s interpretation of WEPs (results varied) and its use of WEPs in NIEs over the past three decades, to determine the significant changes in the ways the NIC has been articulating its intelligence judgments over time. See Analysis of Competing Hypotheses.

Mercyhurst’s WEP paradigm reduces Kent’s schema to its least ambiguous words. Analytic confidence and source reliability are expressed on a 1 to 10 scale, with 10 high.
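A statement that pairs an estimative word with the two 1 to 10 ratings described above could be represented as follows. This is a minimal sketch under stated assumptions: the class, its field names, the rendered format, and the sample judgment are hypothetical, and the WEP is left as free text because the Mercyhurst word list is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EstimativeStatement:
    judgment: str             # the analytic finding, in plain words
    wep: str                  # the estimative word chosen (free text here)
    analytic_confidence: int  # 1 to 10, with 10 high
    source_reliability: int   # 1 to 10, with 10 high

    def __post_init__(self):
        # Reject ratings outside the 1 to 10 scale.
        for name in ("analytic_confidence", "source_reliability"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 1 <= value <= 10:
                raise ValueError(f"{name} must be an integer from 1 to 10")

    def render(self) -> str:
        return (
            f"{self.judgment} ({self.wep}; "
            f"analytic confidence {self.analytic_confidence}/10, "
            f"source reliability {self.source_reliability}/10)"
        )


# Hypothetical judgment, for illustration only.
statement = EstimativeStatement("The talks will resume this year", "likely", 7, 6)
print(statement.render())
```

Keeping the ratings alongside the word means the reader sees both the estimate and how much weight its sources can bear.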

Weasel words
Table 4 contains a non-exhaustive list of common terms that are especially vague, known pejoratively as Weasel Words. Their use in estimative statements is almost certain to cause confusion; they should be avoided at all costs.