I was chatting to Rich Woodward earlier today about Jonathan Bennett‘s attitude to counterfactuals about chancy events. I thought I’d put down some of the thoughts I had arising from that conversation.
The basic thought is this. Suppose that on conditional that A were to happen, it would be overwhelmingly likely that B—but not probability 1 that B would occur. Take some cup I’m holding—if I were to drop it out the window, it’s overwhelmingly likely that it would fall to the floor and break, rather than shoot off sideways or quantum tunnel through the ground. But (we can suppose) there’s a non-zero—albeit miniscule—chance that the latter things would happen. (You don’t need to go all quantum to get this result—as Adam Elga and Barry Loewer have emphasized recently, if we have counterfactuals about macroevents, the probabilities involved in statistical mechanics also attribute tiny but nonzero probability to similarly odd things happening).
The question is, how should we evaluate the counterfactual “Drop>Break” taking into account the fact that given that Drop, there’d be a non-zero but tiny chance that ~Break?
Let’s take as our starting point a Lewisian account of of the counterfactual—“A>B” is to be true (at w) iff B is true at all the closest A-worlds to B. Then the worry many people have is that though the vast majority of closest possible Drop-worlds will be Break worlds, there’ll be a residual tiny minority of worlds where it won’t break—where quantum tunnelling or freaky statistical mechanical possibilities are realized. But since Lewis’s truth-conditions require that Break be true at *all* the closest Drop-worlds, even that tiny minority suffices to make the counterfactual “Drop>Break” false.
As goes “Drop>Break”, so goes almost every ordinary counterfactual you can think of. Almost every counterfactual would be false, if the sketch just given is right. Some people think that’s the right result. We’ll come back to it below.
Lewis’s own response is to deny that the freaky worlds are among the closest worlds. His idea is that freakiness (or as he calls it, the presence of “quasi-miracles”) itself is one of the factors that pushes worlds further away from actuality. That’s been recently criticised by John Hawthorne among others. I’m about to be in print defending a generally Lewisian line on these matters—though the details are different from Lewis’s and (I hope) less susceptible to counterexample.
But if you didn’t take that line, what should you say about the case? A tempting line of thought is to alter Lewis’s clause—requiring not truth at all the closest worlds but truth at most, or the overwhelming majority of them. (Of course, this idea presumes it makes sense to talk of relative proportions of worlds—let’s spot ourselves that).
This has a marked effect on the logic of counterfactuals—in particular, the agglomeration rule (A>B, A>C, therefore A>B&C) would have to go (Hawthorne points this out in his discussion, IIRC). To see how this could happen, suppose that there are 3 closest A-worlds, and X needs to be true at 2 of them in order for “A>X” to be true. Then let the worlds respectively be B&C, ~B&C, ~C&B-worlds. This produces a countermodel to agglomeration.
Agglomeration strikes me as a bad thing to give up. I’m not sure I have hugely compelling reasons for this, but it seems to me that a big part of the utility of counterfactuals lies in our being able to reason under a counterfactual supposition. Given agglomeration you can start by listing a bunch of counterfactual consequences (X, Y, Z), reason in standard ways (e.g. perhaps X, Y, Z entail Q) and then conclude that, under that counterfactual supposition, Q. This is essentially an inference of the following form:
- X,Y,Z Q
And in general I think this should be generalized to arbitrarily many premises. If we have that, counterfactual reasoning seems secure.
But agglomeration is just a special case of this, where Q=X&Y&Z (more generally, the conjunction of the various consequents). So if you want to vindicate counterfactual reasoning of the style just mentioned, it seems agglomeration is going to be at the heart of it. I think giving some vindication of this pattern is non-negotiable. To be honest though, it’s not absolutely clear that making it logically valid is obviously required. You might instead try to break this apart into a fairly reliable but ampliative inference from A>X, A>Y, A>Z to A>X&Y&Z, and then appeal to this and the premise X&Y&Z Q to reason logically to A>Q. So it’s far from a knock-down argument, but I still reckon it’s on to something. For example, anyone who wants to base a fictionalism on counterfactuals (were the fiction to be true then…) better take an interest in this sort of thing, since on it turns whether we can rely on multi-premise reasoning to preserve truth-according-to-the-fiction.
Jonathan Bennett is one who considers altering the truth clauses in the way just sketched (he calls it the “near miss” proposal–and points out a few tweaks that are needed to ensure e.g. that we don’t get failures of modus ponens). But he advances a second non-Lewisian way of dealing with the above clauses.
The idea is to abandon evaluations of counterfactuals being true or false, and simply assign them degrees of goodness. The degree of goodness of a counterfactual “A>B” is equal to the proportion of the closest A worlds that are B worlds.
There are at least two readings of this. One is that we ditch the idea of truth-evaluation of counterfactuals conditionals altogether, much as some have suggested we ditch truth-evaluation of indicatives. I take it that Edgington favours something like this, but it’s unclear whether that’s Bennett’s idea. The alternative is that we allow “strict truth” talk for counterfactuals, defined by a strict clause—truth at all the closest worlds—but then think that this strict requirement is never met, and so it’d be pointless to actually evaluate counterfactual utterances by reference to this strict requirement. Rather, we should evaluate them on the sliding scale given by the proportions. Really, this is a kind of error theory—but one supplemented by a substantive and interesting looking account of the assertibility conditions.
Both seem problematic to me. The main issue I have with the idea that we drop truth-talk altogether is the same issues I have with indicative conditionals—I don’t see how to deal with the great variety of embedded contexts in which we find the conditionals—conjunctions, other conditionals, attitude contexts, etc etc. That’s not going to impress someone who already believes in a probabilistic account of indicative conditionals, I guess, since they’ll have ready to hand a bunch of excuses, paraphrases, and tendancies to bite selected bullets. Really, I just don’t think this will wash—but, anyway, we know this debate.
The other thought is to stick with an unaltered Lewisian account, and accept an error theory. At first, that looks like an advance over the previous proposal, since there’s no problem in generalizing the truthconditional story about embedded contexts—we just take over the Lewis account wholesale. Now this is something of an advance of a brute error-theory, since we’ve got some positive guidance about the assertibility conditions for simple counterfactuals—they’re good to the extent that B is true in a high proportion of the closest A-worlds. And that will make paradigmatic ordinary counterfactuals like “Drop>Break” overwhelmingly good.
But really I’m not sure this is much of an advance over the Edgington-style picture. Because even though we’ve got a compositional story about truth-conditions, we don’t as yet have an idea about how to plausibily extend the idea of “degrees of goodness” beyond simple counterfactuals.
As an illustration, consider “If I were to own a china cup, then if I were to drop it out the window, it’d break”. Following simple-mindedly the original recipe in the context of this embedded conditional, we’d look for the proportion of closest owning worlds where the counterfactual “Drop>Break” is true. But because of the error-theoretic nature of the current proposal, at none (or incredibly few) of those worlds would the counterfactual be true. But that’s the wrong result—the conditional is highly assertible. So the simple-minded application of the orginal account goes wrong in this case.
Of course, what you might try to do is to identify the assertibility conditions of “Own>(Drop>Break)” with e.g. “(Own&Drop)>Break”—so reducing the problem of asseribility for this kind of embedding by way of paraphrase to one where the recipe gives plausible. But that’s to adopt the same kind of paraphrase-to-easy-cases strategy that Edgington likes, and if we’re going to have to do that all the time (including in hard cases, like attitude contexts and quantifiers) then I don’t see that a great deal of advance is made by allowing the truth-talk—and I’m just as sceptical as in the Edgington-style case that we’ll actually be able to get enough paraphrases to cover all the data.
There are other, systematic and speculative, approaches you might try. Maybe we should think of non-conditionals as having “degrees of goodness” of 1 or 0, and then quite generally think of the degree of goodness of “A>B” as the expected degree of goodness of B among the closest A-worlds—that is, we look at the closest A-worlds and the degree of goodness of B at each of these, and “average out” to get a single number we can associate with “A>B”. That’d help in the “Own>(Drop>Break)” case—in a sense, instead of looking at the expected truth value of “Drop>Break” among closest Own-worlds, we’d be looking at the expected goodness-value of “Drop>Break” among Own-worlds. (We’d also need to think about how degrees of goodness combine in the case of truth functional compounds of conditionals—and that’s not totally obvious. Jeffrey and Stalnaker have a paper on “Conditionals as Random Variables” which incorporates a proposal something like the above. IIRC, they develop it primarily in connection with indicatives to preserve the equation of conditional probability with the probability of the conditional. That last bit is no part of the ambition here, but in a sense, there’s a similar methodology in play. We’ve got an independent fix for associating degrees with simple conditionals—not the conditional subjective probability as in the indicative case—rather, the degree is fixed by the proportion of closest antecedent worlds where the (non-conditional) consequent holds. In any case, that’s where I’d start looking if I wanted to pursue this line).
Is this sort of idea best combined with the Edgington style “drop truth” line or the error-theoretic evaluation of conditionals? Neither, it seems to me. Just as previously, the compositional semantics based on “truth” seems to do no work at all—the truth value of compounds of conditionals will be simply irrelevant to their degrees of goodness. So it seems like a wheel spinning idly to postulate truth-values as well as these “Degrees of goodness”. But also, it doesn’t seem to me that the proposal fits very well with the spirit of Edgington’s “drop truth” line. For while we’re not running a compositional semantics on truth and falsity, we are running something that looks for all the world like a compositional semantics on degrees of goodness. Indeed, it’s pretty tempting to think of these “degrees of goodness” as degrees of truth—and think that what we’ve really done is replace binary truth-evaluation of counterfactuals with a certain style of degree-theoretic evaluation of them.
So I reckon that there are three reasonably stable approaches. (1) The Lewis-style approach where freaky worlds are further away then they’d otherwise be on account of their freakiness—where the Lewis-logic is maintained and ordinary counterfactuals are true in the familiar sense. (2) The “near miss” approach where logic is revised, ordinary counterfactuals are true in the familiar sense. (3) Then there’s the “degree of goodness” approach—which people might be tempted to think of in the guise of an error theory, or as an extension of the Adams/Edgington-style “no truth value” treatment of indicatives—but which I think will have to end up being something like a degree-theoretic semantics for conditionals, albeit of a somewhat unfamiliar sort.
I suggested earlier that an advantage of the Lewis approach over the “near miss” approach was that agglomeration formed a central part of inferential practice with conditionals. I think this is also an advantage that the Lewis account has over the degree-theoretic approach. How exactly to make this case isn’t clear, since it isn’t altogether obvious what the *logic* of the degree theoretic setting should be—but the crucial point is “A>X1″… “A>Xn” can all be good to a very high degree, while “A>X1&…&Xn” are good to a very low degree. Unless we restrict ourselves to starting points which are good to degree 1, then we’ll have to be wary of degradation of degree of goodness while reasoning under counterfactual suppositions, just as on the near miss proposal we’d have to be wary of degradation from truth to faslity. So the Lewisian approach I favour is, I think, the only one of the approaches currently on the table which makes classical reasoning under counterfactual suppositions fully secure.