Category Archives: Probability

Sleeping bookie

I’ve spent more of this week than is healthy thinking about the Sleeping Beauty puzzle (thanks in large part to this really interesting post by Kenny). I don’t think I’ve got anything terribly novel to say, but I thought I’d set out my current thinking to see whether people agree with my take on the dialectic concerning at least one aspect of the puzzle.

Sleeping Beauty is sent to sleep by philosophical experimenters. He (for, in a strike for sexual equality, this Beauty is male) will be woken up on Monday morning, told on Monday afternoon what day it is, and sent to sleep again after being given a drug which will mean that the next time he wakes up, he will have no memories of what transpired. Depending on the result of a fair coin flip, he will either be woken up in exactly similar circumstances on Tuesday morning, or be left to sleep through the day. Beauty is aware of the setup.

How confident should Beauty be on Monday morning that the coin to be flipped in a few hours will land heads? (Remember, he knows it’s a fair coin.) Halfers say: he should have credence 1/2 that it’ll be heads. Thirders say: the credence should be 1/3. (All sides agree that on Sunday his credence should be 1/2.)

What I’m interested in is whether there are Dutch book arguments for either view. The very simplest takes the following form. Sell Beauty a [$30,T] bet (i.e. one paying out $30 if tails obtains) for a stake of $15 on Sunday evening. Then, if Beauty’s a halfer, sell him a [$20,H] bet for $10 on Monday morning, and again (if he is awoken) on Tuesday morning.

If H obtains, Beauty loses the first bet but wins the sole remaining bet (on Monday morning), for a net loss of $5. If T obtains, Beauty wins the first bet, but loses the next two, for a net loss of $5 again. So Beauty is guaranteed to lose money.
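The arithmetic can be checked with a quick sketch (the `bet` helper and the outcome labels are mine; the stakes and payouts are those in the text):

```python
# A [$X, E] bet bought for a stake of $s wins $(X - s) if E obtains and
# loses the $s stake otherwise.

def bet(payout, stake, wins):
    return payout - stake if wins else -stake

def halfer_net(outcome):
    total = bet(30, 15, outcome == "T")       # Sunday evening: [$30, T] for $15
    total += bet(20, 10, outcome == "H")      # Monday morning: [$20, H] for $10
    if outcome == "T":                        # Tuesday awakening occurs only on tails
        total += bet(20, 10, outcome == "H")  # Tuesday morning: [$20, H] for $10
    return total

print(halfer_net("H"), halfer_net("T"))  # -5 -5: a sure loss either way
```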

This is in some sense a diachronic Dutch book. But as several people note, it’s not a particularly convincing argument that there’s something wrong with Beauty being a halfer. For notice that the information here is asymmetric: the bookie offering the bets needs to have more information than Beauty, since it is crucial to their strategy to offer twice as many bets if the result of the coin flip is tails as if it is heads.

Hitchcock aims to give a revised Dutch book argument for the same conclusion that avoids this problem. He suggests that the experimenters put the bookie through the same procedure as they put Beauty through, and the bookie’s strategy should then simply be to offer Beauty the bets every time they both wake. That has the net effect of offering the same set of bets as above for a sure loss for Beauty, but the bookie and Beauty are in the same epistemic state. This is the sleeping bookie argument.

What I’d like to claim (inspired by Bradley and Leitgeb) is that if we concentrate too much on the epistemic state of hypothetical bookies, we’ll be led astray. We initially described the overall mechanism whereby bets are offered to Beauty as one where an agent (the bookie) offers bets to Beauty each time they are both awake. But I’d prefer to describe it as a case where a complex agency (the bookie and the experimenters in cahoots) is offering bets to Beauty. The second description seems at least as good as the first: after all, without the compliance of the experimenters, the bookie’s Dutch book strategy can’t be implemented. But the system constituted by the experimenters and the bookie clearly has access to the information about the result of the coin toss, and arranges for the bets to be made accordingly, even though the bookie alone lacks this information.

Now Dutch book arguments are only as good as the results we can extract from them about what credences are rational to have in given circumstances. And clearly, if Beauty knows that the bets coming at him encode information about the outcome on which the bets turn, then he needn’t (perhaps shouldn’t) simply bet according to his credences, but should adjust for the encoded information. That’s why, to get a fix on what Beauty’s credences are, we ban the bookie from having excess information; and that’s why the first Dutch book argument for thirding looks like a bad way to get such a fix. But this rationale for the ban generalizes: we shouldn’t trust Dutch books in any situation where the mechanism whereby bets are offered (whether in the hands of a single individual or of a system) relies on information about the outcome on which the bets turn. (Equally, if the bookie has extra information but the system of bets doesn’t exploit it in any way, there’s as yet no case against trusting the Dutch book argument, it seems to me.)

The moral I take from all this is that what’s going on in the head of some individual we deign to call “bookie” is neither here nor there: what matters is the pattern of bets, and whether that pattern exploits information about the outcomes on which the bets turn. This is effectively what I take Bradley and Leitgeb to argue in their very nice article. What they suggest (roughly) is that a necessary condition on taking a Dutch book argument to give a fix on rational credences is that the pattern of bets be uncorrelated with the outcomes on which the bets turn. I conjecture (tentatively) that this is what the ban on the bookie’s having extra information was trying to get at all along. The upshot is that Hitchcock’s sleeping bookie argument is problematic in the same way as the initial Dutch book argument against halfers.

But more than this: if we refocus attention on the good standing of the pattern of bets, rather than the epistemic states of hypothetical bookies, we can put together a Dutch book argument against thirders. For suppose that the experimenters offer Beauty a [$30,H] bet for $15 on Sunday, then a genuine [$30,T] bet for $20 on Monday morning no matter what happens, and (so that he can’t tell what’s going on) an apparent [$30,T] bet for $20 on Tuesday which is in fact fake: his stake is automatically returned. Then he’ll be guaranteed a loss of $5 no matter what happens. Of course, the experimenters here have knowledge of the outcomes. But (arguably) that doesn’t matter, because the bets they offer are uncorrelated with the outcomes of the event on which the bets turn: the system of bets offered is the same no matter what the outcome is, so (it seems to me) the information the experimenters have isn’t implicit in the pattern of bets in any sense. So I think there’s a better Dutch book argument against thirding than there is against halfing. (Or at least, I’d be interested in seeing the case against this in detail.)
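The same sort of sanity check works for this book, assuming (as with the earlier book) that Beauty accepts any bet priced at or below its expected value by his credences; the fake Tuesday bet returns its stake, so it contributes nothing either way:

```python
# Net result for a thirder facing the experimenters' book.  A [$X, E] bet
# for a stake of $s wins X - s if E obtains and loses s otherwise.

def bet(payout, stake, wins):
    return payout - stake if wins else -stake

def thirder_net(outcome):
    total = bet(30, 15, outcome == "H")   # Sunday: [$30, H] for $15
    total += bet(30, 20, outcome == "T")  # Monday morning: genuine [$30, T] for $20
    return total                          # Tuesday's fake bet nets exactly 0

print(thirder_net("H"), thirder_net("T"))  # -5 -5: a sure loss again
```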

All this is not to say that the halfer is out of the woods. A quite different Dutch book argument is given in a paper by Draper and Pust, which exploits the standard halfer’s story (Lewis’s) about what happens on Monday afternoon, once Beauty has been told what day it is. The Lewisian halfer thinks that once Beauty realizes it’s Monday, he should have credence 2/3 that Heads is the result. And that, it appears, is a Dutch-bookable situation.

Notice that this isn’t directly an argument against the thesis that Beauty should have credence 1/2 in Heads on Monday morning. It is, in effect, an argument that he should also have credence 1/2 in Heads on Tuesday. And, with a few other widely accepted assumptions, these combine to give rise to a contradiction (see for example, Cian Dorr’s presentation of the Beauty case as a paradox).

If this is all we say, then we should conclude that we really do have here a puzzling argument for a contradiction, where all the premises look pretty plausible and the two crucial ones both seem prima facie defensible via Dutch book strategies. Maybe, as some suggest, we should revise our claims about the updating of credences so as to make halfing appropriate in both circumstances; or maybe there’s something unavoidably irrational in Beauty’s predicament. What will finally come out in the wash as the best response to the puzzle is one matter; whether the Dutch book considerations support halfing or thirding on Monday morning is another; and it is only on this narrow point that I’m claiming there is a pro tanto case for being a halfer.


Probabilistic multi-conclusion validity

I’ve been thinking a bit recently about how to generalize standard results relating probability to validity to a multi-conclusion setting.

The standard result is the following (where the uncertainty of p is 1 minus the probability of p):

An argument is classically valid iff, for all classical probability functions, the sum of the uncertainties of the premises is at least as great as the uncertainty of the conclusion.

It’ll help if we restate this as follows:

An argument is classically valid iff, for all classical probability functions, the sum of the uncertainties of the premises plus the probability of the conclusion is at least 1.

Stated this way, there’s a natural generalization available:

A multi-conclusion argument is classically valid iff, for all classical probability functions, the sum of the uncertainties of the premises plus the sum of the probabilities of the conclusions is at least 1.
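In symbols (a rendering of the claim just stated, writing u(φ) for 1 − P(φ)):

```latex
\Gamma \vDash \Delta
\quad\Longleftrightarrow\quad
\sum_{\gamma \in \Gamma} u(\gamma) \;+\; \sum_{\delta \in \Delta} P(\delta) \;\ge\; 1
\quad \text{for every classical probability function } P.
```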

And once we’ve got it stated, it’s a corollary of the standard result (I believe).
It’s pretty easy to see directly that the “if” direction works, just by considering classical probability functions that assign only 1 or 0 to propositions: an invalid argument has a countervaluation, and on the corresponding 0/1 probability function the premises’ uncertainties and the conclusions’ probabilities sum to 0.

In the “only if” direction (writing u for uncertainty and p for probability):

Consider A,B|=C,D. By a standard premise/conclusion swap result, this holds iff A,B,~C,~D|= holds, i.e. iff the premises together with the negated conclusions entail the absurdity. And we know u(~C)=p(C), u(~D)=p(D). By the standard result, the sum of the uncertainties of the premises of a single-conclusion argument must be at least as great as the uncertainty of the conclusion; since the absurdity has uncertainty 1, the single-conclusion argument holds iff u(A)+u(B)+u(~C)+u(~D) is greater than or equal to 1 for every probability function. By the above identities, this holds iff u(A)+u(B)+p(C)+p(D) is greater than or equal to 1. This generalizes to arbitrary cases. QED.
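The biconditional can be spot-checked numerically. Here’s a minimal sketch (the representation and example arguments are my choices): propositions are sets of worlds over two atoms, probability functions are random distributions over worlds; A ∨ B |= A, B is valid, while A |= A ∧ B is not.

```python
# Spot-check of the multi-conclusion claim on a two-atom language.
import itertools
import random

worlds = list(itertools.product([0, 1], repeat=2))  # truth values for atoms A, B
A = {w for w in worlds if w[0]}
B = {w for w in worlds if w[1]}

def entails(premises, conclusions):
    # Classically valid iff no world makes every premise true and every
    # conclusion false.
    return all(any(w in c for c in conclusions)
               for w in worlds
               if all(w in p for p in premises))

def bound_holds(premises, conclusions, prob):
    p = lambda s: sum(prob[w] for w in s)
    return (sum(1 - p(prem) for prem in premises)
            + sum(p(conc) for conc in conclusions)) >= 1 - 1e-9

random.seed(0)
ok = True
for _ in range(1000):
    weights = [random.random() for _ in worlds]
    total = sum(weights)
    prob = {w: x / total for w, x in zip(worlds, weights)}
    ok = ok and bound_holds([A | B], [A, B], prob)  # valid argument

point = {w: (1.0 if w == (1, 0) else 0.0) for w in worlds}  # A true, B false
print(ok)                                # True: the bound held in every sample
print(entails([A], [A & B]))             # False: invalid argument ...
print(bound_holds([A], [A & B], point))  # False: ... and the bound fails
```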

Thresholds for belief

I’m greatly enjoying reading David Christensen’s Putting Logic in its Place at the moment. Some remarks he makes about threshold accounts of the relationship between binary and graded beliefs seemed particularly suggestive. I want to use them here to suggest a certain picture of the relationship between binary and graded belief. No claim to novelty here, of course, but I’d be interested to hear worries about this specific formulation (Christensen himself argues against the threshold account).

One worry about threshold accounts is that they make constraints on binary beliefs look very weird. Consider, for example, the lottery paradox. I am certain that someone will win, but for each individual ticket, I’m almost certain that it’s a loser. Suppose that having degree of belief at least t, for some fixed t less than 1, sufficed for binary belief. Then, by choosing a big enough lottery, we can make it the case that I believe a generalization (there will be a winner) while believing the negation of each of its instances. So I believe each member of a logically inconsistent set.

This sort of situation is very natural from the graded belief perspective: the beliefs in question meet constraints of probabilistic coherence. But there’s a strong natural thought that binary beliefs should be constrained to be logically consistent. And of course, the threshold account doesn’t deliver this.

What Christensen points to are some observations by Kyburg about limited consistency results that can be derived from the threshold account. Minimally, binary beliefs are required to be weakly consistent: for any threshold above zero, one cannot believe a single contradictory proposition. But there are stronger results too. For example, for any threshold above 0.5, one cannot believe a pair of mutually contradictory propositions. One can see why this is if one remembers the following result: in a logically valid argument, the improbability of the conclusion cannot be greater than the sum of the improbabilities of the premises. For the case where the conclusion is absurd (i.e. the premises are jointly contradictory), we get that the sum of the improbabilities of the premises must be at least 1; so a pair of mutually contradictory propositions cannot both have improbability below 0.5.

In general, then, what we get is the following: if the threshold for binary belief is at least 1-1/n, then one cannot believe each of an inconsistent set of n propositions.
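A toy illustration of the bound, using an n-ticket lottery: the n propositions “ticket i loses” are jointly inconsistent, yet each has probability 1 − 1/n. (The function name is mine, and I read “threshold t” as: belief requires credence strictly above t.)

```python
# With a threshold below 1 - 1/n one believes all n jointly inconsistent
# "ticket i loses" propositions; with a threshold of at least 1 - 1/n,
# not all of them clear the bar.

def believes_every_loser(n, threshold):
    return (1 - 1 / n) > threshold

n = 100
print(believes_every_loser(n, 0.95))       # True: 100 inconsistent beliefs held
print(believes_every_loser(n, 1 - 1 / n))  # False: the 1 - 1/n bound blocks them
```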

Here’s one thought. Let’s suppose that the threshold for binary belief is context dependent in some way (I mean this broadly, rather than as committing to some particular, potentially controversial semantic analysis of belief attributions). The threshold that marks the shift to binary belief can vary depending on aspects of the context. The thought, crudely put, is that there’s the following constraint on what thresholds can be set: in a context where n propositions are being entertained, the threshold for binary belief must be at least 1-1/n.

There is, of course, lots to clarify about this. But notice that now, relative to every context, we get logical consistency as a constraint on the pattern of binary belief (assuming that to believe that p is in part to entertain that p).

[As Christensen emphasises, this is not the same thing as getting closure holding in every context. Suppose we consider the three propositions A, B, and A&B. Consistency means that we cannot accept the first two and also accept the negation of the last; and indeed, with the threshold set at 2/3, we get this result. But closure would tell us that in every situation in which we believe the first two, we should believe the last. And it is quite consistent to believe A and B (say, by having credence 2/3 in each) while failing to believe A&B (say, by having credence 1/3 in that proposition). Probabilistic coherence isn’t going to save the extendability of beliefs by deduction, for any reasonable choice of threshold.

Of course, if we allow a strong notion of disbelief or rejection, such that someone disbelieves that p iff their uncertainty of p is past the threshold (the same threshold as for belief), then we’ll be able to read off from the consistency constraint that in a valid argument, if one believes the premises, one should abandon disbelief in the conclusion. This is not closure, but perhaps it might sweeten the pill of giving up on closure.]
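The closure failure in the bracketed example can be exhibited concretely. The particular distribution below is my own choice of a coherent realization of those numbers: over four worlds it gives A and B credence 2/3 each while giving A&B only 1/3 (reading belief as credence at or above the threshold, as the example’s numbers suggest).

```python
# One coherent credence function realizing the bracketed example: believe
# A and B at threshold 2/3, yet fail to believe A & B.
from fractions import Fraction

third = Fraction(1, 3)
prob = {(1, 1): third, (1, 0): third, (0, 1): third, (0, 0): Fraction(0)}

p_A = sum(pr for w, pr in prob.items() if w[0])            # P(A)   = 2/3
p_B = sum(pr for w, pr in prob.items() if w[1])            # P(B)   = 2/3
p_AB = sum(pr for w, pr in prob.items() if w[0] and w[1])  # P(A&B) = 1/3

threshold = Fraction(2, 3)
print(p_A >= threshold, p_B >= threshold, p_AB >= threshold)  # True True False
```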

Without logical consistency being a pro tanto normative constraint on believing, I’m sceptical that we’re really dealing with a notion of binary belief at all. Suppose this is accepted. Then we can use the considerations above to argue (1) that if the threshold account of binary belief is right, then thresholds (if not extreme) must be context dependent, since for no choice of threshold less than 1 will consistency be upheld; and (2) that there’s a natural constraint on thresholds in terms of the number of propositions entertained.

The minimal conclusion, for this threshold theorist, is that the more propositions one entertains, the harder it is for one’s graded attitudes to count as binary beliefs. Consider the lottery paradox construed this way:

1 loses

2 loses

…

N loses

So: everyone loses

Present this as the following puzzle: We can believe all the premises, and disbelieve the conclusion, yet the latter is entailed by the former.

We can answer this version of the lottery paradox using the resources described above. In a context where we’re contemplating this many propositions, the threshold for belief is so high that we won’t count as believing the individual propositions. But we can explain why the paradox seems so compelling: entertain each proposition individually, and we will believe it (our credences remaining fixed throughout).

Of course, there are other versions of the lottery paradox that we can formulate, e.g. relying on closure, for which we have no answer. Or at least, our answer is just to reject closure as a constraint on rational binary beliefs. But with a contextually variable threshold account, as opposed to a fixed threshold account, we don’t have to retreat any further.

Chances, counterfactuals and similarity

A happy-making feature of today is that Philosophy and Phenomenological Research have just accepted my paper “Chances, Counterfactuals and Similarity”, which has been hanging around for absolutely ages, in part because I got a “revise and resubmit” just as I was finishing my thesis and starting my new job, and in part because I got so much great feedback from a referee that there was lots to think about.

The way I think about it, it is a paper in furtherance of the Lewisian project of reducing counterfactual facts to similarity-facts between worlds, which feeds into a general interest in what kinds of modal structure (cross-world identities, metrics and measures, stronger-than-modal relations etc) you need to appeal to for metaphysical purposes. Lewis has a distinctive project of trying to reduce all this apparent structure to the economical basis of de dicto modality — what’s true at this world or that — and (local) similarity facts. Counterpart theory is one element of this project: showing how cross-world identities might be replaced by similarity relations and de dicto modality. Another element is the reduction of counterfactuals to closeness of worlds, and closeness of worlds is ultimately cashed out in terms of one world’s fitting another’s laws, and there being large areas where the local facts in each world match exactly. Again, we find de dicto modality of worlds and local similarity at the base.

Lewis’s main development of this view looks at a special case, where the actual world is presupposed to have deterministic laws. But to be general (and presumably, to be applicable to the actual world!) we want to have an account that holds for the situation where the laws of nature are objective-chance-laws. Lewis does suggest a way of extending his account to the chancy case. It’s attacked by Hawthorne in a recent paper—ultimately successfully, I think. In any case, Lewis’s ideas in this area always looked (to me) like a bit of a patch-up job, so I suggest a more principled Lewisian treatment, which then avoids the Hawthorne-style objections to the Lewis original.

The basic thought (which I found in Adam Elga’s work on Humean laws of nature) is that “fitting” chancy laws of nature is not just a matter of not violating those laws. Rather, to fit a chancy law is to be objectively typical relative to the probability function those laws determine. Given this understanding, we can give a single Lewisian account of what comparative similarity of worlds amounts to, phrased in terms of fit. The ambition is that when you understand “fit” in the way appropriate to deterministic laws, you get Lewis’s original (unextended) account. And when you understand “fit” in the way I argue is appropriate to chancy laws, you get my revised suggestion. All very satisfying, if you can get it to work!