In yesterday’s post, I showed how the conventional confidence interval can severely underestimate the probability that the interval covers the true parameter. This is because in panel B on the right, we can be quite certain where the hatch is: the two bubbles are nearly 10 meters apart, and given that the submarine is 10 meters long with the hatch exactly in the middle, the two bubbles tell us a lot. (If this isn’t clear, revisit yesterday’s post.)
Today I show that the conventional confidence interval can be even more misleading, by returning seemingly less information about a parameter when we know more. This is known as the “precision fallacy” and arises because the conventional interval does not take into account the likelihood; that is, it ignores the information provided by the joint probability density of the observed data for all possible values of the parameter of interest (in this case, the hatch location).
In the figure above, non-zero values of the likelihood are shown by the thick blue bar. Wherever there is no bar, the likelihood is zero. We can determine the likelihood from the constraints embodied in the problem: in the left panel (A), the two bubbles are close together and hence the hatch could be up to nearly 5 meters away in either direction (because if the two bubbles were on top of each other and emerged from the very tip of the submarine, the hatch would be exactly 5 m either way; it couldn’t be any further away because it’s in the middle of a 10 m vessel). In the right panel (B), as we have already noted, the bubbles are so far apart that the hatch has to be in the middle, with relatively little uncertainty (if the bubbles emerged exactly 10 m apart, there would be zero uncertainty; you can verify this using the simulation).
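To make the constraint concrete, here is a minimal Python sketch of the range of hatch locations with non-zero likelihood (the blue bar). It assumes only what the text states: a 10 m submarine with the hatch at its midpoint, and bubbles that must rise from somewhere along the hull. The function name and the bubble positions are made up for illustration; they are not read off the figure.

```python
SUB_LENGTH = 10.0  # metres; the hatch sits exactly at the midpoint


def likelihood_support(y1, y2, length=SUB_LENGTH):
    """Return (lo, hi): the hatch locations that are compatible with both bubbles.

    Each bubble must lie within length/2 of the hatch, so the hatch can be
    no further left than max(y) - length/2 and no further right than
    min(y) + length/2.
    """
    half = length / 2.0
    lo = max(y1, y2) - half
    hi = min(y1, y2) + half
    if lo > hi:
        raise ValueError("bubbles are further apart than the submarine is long")
    return lo, hi


# Hypothetical bubble positions (in metres), chosen to mimic the two panels.
panel_a = likelihood_support(-0.25, 0.25)  # bubbles 0.5 m apart
panel_b = likelihood_support(-4.5, 4.5)    # bubbles 9 m apart

print(panel_a, panel_a[1] - panel_a[0])  # (-4.75, 4.75) 9.5
print(panel_b, panel_b[1] - panel_b[0])  # (-0.5, 0.5) 1.0
```

The width of the bar is always 10 m minus the distance between the bubbles, which is why bubbles that are far apart pin the hatch down so tightly.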
So what do conventional confidence intervals do with this valuable information? They ignore it.
Instead, they give the appearance of confidence by considering the variance in the sample. Intuition—and widespread practice—suggests that the smaller the variance, the more precise the estimate of the parameter, and hence the greater our confidence in the true value of the parameter should be.
Unfortunately, the reverse is true in the current situation. This is because the “variance” between the two bubbles is minimal in panel A but maximal in panel B. Hence the appearance of precision is greater on the left than the right, even though we have nearly perfect knowledge available in the right-hand panel.
This is illustrated in the row labeled “Nonpara.” in the figure above, which also happens to be the conventional frequentist interval computed with the t-statistic. In other words, this is your textbook case where you compute the mean (of the two bubbles), and then the standard error of the mean according to s/√N (where N=2 here). The width of the confidence interval is thus directly proportional to the distance between the bubbles: large on the right and tiny on the left.
The apparent “precision” in panel A is entirely chimerical: the confidence interval is nice and tight but is cruising right past the true value. Conversely, in panel B we seem quite uncertain when in fact we could know pretty much exactly where the hatch is. (There is a nuance here which renders the use of s/√N questionable, and which explains why the row is called “Nonpara.” For expository purposes I am ignoring this infelicity.)
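The textbook computation can be sketched in a few lines of Python. This is an illustration rather than the code behind the figure: it assumes a 50% interval (the level used for the intervals in this post), and it relies on the fact that Student’s t with one degree of freedom is the standard Cauchy distribution, so the critical value can be obtained from the tangent function without a statistics library. The function name and bubble positions are hypothetical.

```python
import math
import statistics


def t_interval_50(y1, y2):
    """50% t-based confidence interval for the mean of two observations."""
    mean = (y1 + y2) / 2.0
    s = statistics.stdev([y1, y2])   # sample SD with N - 1 = 1 degree of freedom
    sem = s / math.sqrt(2)           # s / sqrt(N) with N = 2
    # t with 1 df is standard Cauchy, whose quantile function is
    # tan(pi * (p - 0.5)). A central 50% interval needs p = 0.75.
    t_crit = math.tan(math.pi * (0.75 - 0.5))
    return mean - t_crit * sem, mean + t_crit * sem


# Hypothetical bubble positions (in metres), chosen to mimic the two panels.
for label, (y1, y2) in {"A": (-0.25, 0.25), "B": (-4.5, 4.5)}.items():
    lo, hi = t_interval_50(y1, y2)
    print(f"panel {label}: width = {hi - lo:.2f} m")
```

With two observations the critical value is 1 and s/√N equals half the distance between the bubbles, so the 50% interval simply runs from one bubble to the other. Its width is the bubble separation itself, which is exactly backwards relative to what we actually know.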
In summary: conventional frequentist confidence intervals tacitly convey images of precision (or lack thereof) that can run completely counter to the actual knowledge that is available in a given situation. Fortunately, therefore, the people who looked for the black box of Air France Flight 447 did not rely on frequentist statistics. They instead relied on Bayesian search theory. As did the U.S. Navy when it located the nuclear submarine USS Scorpion.
The idea of Bayesian search theory is the same as for any other Bayesian statistical inference: You formulate as many reasonable hypotheses as possible (in the above figure, this is the space of possible locations of the hatch). You then construct the likelihood (the joint probability density of the observed data for all possible locations of the hatch). Finally, you infer the probability distribution of the location of the hatch from the likelihood and the presumed prior distribution of locations. In the case of the figure above, this simplifies to an interval half as wide as the range of non-zero likelihoods (because we are concerned with 50% intervals and we assume a uniform prior).
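Under those two assumptions (a uniform prior and a likelihood that is flat wherever it is non-zero), the posterior is uniform over the range of non-zero likelihood, and the recipe reduces to a few lines of Python. The sketch below takes the central half of that range; centring the interval is my choice for illustration, since any half of a uniform posterior carries 50% of the probability. The function name and bubble positions are hypothetical.

```python
def credible_interval_50(y1, y2, length=10.0):
    """Central 50% credible interval for the hatch location.

    Assumes a uniform prior and a flat likelihood over the admissible
    range, so the posterior is uniform on [max(y) - length/2, min(y) + length/2].
    """
    half = length / 2.0
    lo = max(y1, y2) - half   # leftmost hatch location compatible with the data
    hi = min(y1, y2) + half   # rightmost hatch location compatible with the data
    quarter = (hi - lo) / 4.0
    # Trimming a quarter from each end leaves the middle 50% of the posterior.
    return lo + quarter, hi - quarter


# Hypothetical bubble positions (in metres), chosen to mimic the two panels.
print(credible_interval_50(-0.25, 0.25))  # (-2.375, 2.375): wide, little is known
print(credible_interval_50(-4.5, 4.5))    # (-0.25, 0.25): narrow, much is known
```

The width works out to (10 − d)/2 for bubbles d meters apart, so the interval shrinks as the bubbles move apart.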
Accordingly, in the figure above, the top row shows 50% Bayesian credible intervals that are half the width of the blue bars indicating the nonzero likelihoods. They are narrow in panel B and wide in panel A, as they should be.
It is for this reason that they are called credible intervals—because they are just that: credible.
Reference for the paper mentioned in this post:
Morey, R. D., Hoekstra, R., Rouder, J. N., Lee, M. D., & Wagenmakers, E.-J. (2015). The Fallacy of Placing Confidence in Confidence Intervals. Psychonomic Bulletin & Review, doi: 10.3758/s13423-015-0947-8.