Category Archives: Counterfactuals

Barker on worlds-semantics for counterfactuals

In a forthcoming paper in Nous, Stephen Barker argues that embedding phenomena mean we should give up the ambition to give an account of the truth-conditions of counterfactuals in terms of possible worlds. Barker identifies an interesting puzzle, but it is obscured by the overly strong claims he makes for it.

Let’s suppose that we have a shuttered window, and Jimmy with a bunch of stones, which (wisely) he leaves unthrown. The counterfactual “Had Jimmy thrown the stone at the window, it would have hit the shutters” is true. But it isn’t necessarily true. For example, if the shutters had been opened a few seconds ago, then had Jimmy thrown the stone, it would have sailed through. Let’s write the first counterfactual as (THROW>SHUTTERS), and the claim we’ve just made as (OPEN>(THROW>SAILEDTHROUGH)). This is a true counterfactual with a counterfactual as consequent, capturing one way in which the latter is contingent on circumstance.

Here is Barker’s puzzle, as it impacts David Lewis’s account of counterfactuals. On Lewis’s account (roughly) a counterfactual is true iff all the closest worlds making the antecedent true also make the consequent true. Therefore for the double counterfactual above to be true, (THROW>SAILEDTHROUGH) must be true at all the closest OPEN-worlds.
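
In the notation of a standard closest-worlds clause, the point can be put as follows (writing f(A, w) for the set of closest A-worlds to w; the notation is mine, not Lewis’s):

  A > B is true at w iff for all w' \in f(A, w), B is true at w'

  OPEN>(THROW>SAILEDTHROUGH) is true at @ iff for all w' \in f(OPEN, @) and all w'' \in f(THROW, w'), SAILEDTHROUGH is true at w''

So the embedded counterfactual gets evaluated with the closest OPEN-worlds, not actuality, as the base for the second round of world-selection.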

But Lewis also told us something about what makes for closeness of worlds. Cutting a long story short, for the “forward tracking” cases of counterfactuals, we can expect the closest OPEN worlds to exactly match the actual world up to a few seconds ago, whereupon some localized event occurs which is inconsistent with the laws of the actual world, which (deterministically) leads to OPEN obtaining. After this “small miracle”, the evolution of the world continues in accordance with the actual laws. Let’s pick a representative such world, W. (For later reference, note that both THROW and SAILEDTHROUGH are false in W—it’s just a world where the shutters are opened and Jimmy’s stone never moves).

Barker asks a very good question: is THROW>SAILEDTHROUGH true at W, on Lewis’s view? For this to be so, we need to look at the closest THROW-worlds to W. What would such a world be? Well, as before, we look for a world that exactly matches W up until shortly before THROW is supposed to occur—which then diverges from W by a small miracle (relative to the laws of nature in W) in such a way as to bring about THROW, and which from then on evolves in accordance with W’s laws.

But what are W’s laws? Not the actual laws—W violates those. Maybe it has no laws? But then all sorts of crazy evolutions will be treated as “consistent with the laws of W”, and so we’d have no right to assume that in all such evolutions SAILEDTHROUGH would be true. Maybe it has all of the actual laws except the specific one that we needed to violate? But still, if e.g. the actual law tying gravitational force to masses and the square of their separation is violated at W, then removing this from the books allows in all sorts of crazy trajectories correlated with tailored gravitational forces—and again we have no right to think that on all legal evolutions SAILEDTHROUGH will be true. Net result, suggests Barker: we should assume, absent further argument, that THROW>SAILEDTHROUGH is false at W; and hence OPEN>(THROW>SAILEDTHROUGH) is false. The same recipe can be used to argue for the falsity of all sorts of intuitively true doubly-embedded counterfactuals.

I’ve presented Lewis’s theory very loosely. But the problem survives translation into his more precise framework. The underlying trouble that Barker has identified is this: Lewis’s account of closeness relies on two respects of similarity to the relevant “base” world: matching the distribution of properties in the base world, and fitting the laws of the base world. But because the “counterfactually selected” worlds violate actual laws, there’s a real question-mark over whether the second respect of similarity has any teeth when taken with respect to the counterfactually selected world, instead of actuality.

This is a nice puzzle for one very specific theory. But does it really show that the worlds approach is doomed? I think that’s very far from the case. Note first that giving a closest-worlds semantics is separable from giving an analysis of closeness in terms of similarity, let alone the kind of analysis that Lewis favoured. So Barker’s problem simply won’t arise for many worlds-theorists; his point is best presented as a problem for one who buys the whole Lewisian package. My second observation is that Lewis himself has the resources to avoid Barker’s case, thanks to his Humean theory of laws. And if we’ve gone so far as to buy the whole Lewisian package on counterfactuals, it won’t be surprising if we’ve gotten ourselves committed to some of the other Lewisian views.

The simple observation that lies behind the first point is that the bare closeness-semantics for counterfactuals does not get us anywhere near talk of miracles, violations of law and the rest. Indeed, to generate the logic of counterfactuals (and thus get predictions about the coherence of various combinations of beliefs with counterfactual content) we do not even need to appeal to the notion of an “intended interpretation” of closeness—we could treat it purely algebraically. If we do believe that one interpretation gives the actual truth conditions of counterfactuals, there’s nothing in principle to stop us treating closeness as primitive (Stalnaker, for example, seems to adopt this methodological stance) or even giving an explicit definition of closeness in counterfactual terms. Insofar as you wanted a reduction of counterfactuality to something else, this’d be disappointing. But whether such reduction is even possible is contentious; and even if it is, it’s not clear that we should expect to read it off our semantics.

So the algebraists, the primitivists, and the reverse analyzers can all buy into worlds-semantics for counterfactuals without endorsing anything so controversial as Lewis’s talk of miracles. Likewise, it’s really not clear that anyone going in for a reduction of closeness to something else needs to follow Lewis. Lewis’s project is constrained by all sorts of extrinsic factors. For example, it’s designed to avoid appeal to de jure temporal asymmetries; it’s designed to avoid mention of causation, and so forth and so on. Connectedly, the laws Lewis considers are micro-laws, allowing him to focus on the case of determinism as the paradigm. But what about the (indeterministic) laws of statistical mechanics? If fit with the probabilistic laws of statistical mechanics, as well as fit with deterministic laws, plays a role in determining closeness, the game changes quite markedly. So there are all sorts of resources for the friend of illuminating reductions of closeness; it’d be a positive surprise if they ended up using only the same sparse resources Lewis felt forced to.

Can we get a version of the dilemma up and running even without Lewis’s particular views? Well, here’s a general thought. Supposing that the actual micro-laws are deterministic, every non-duplicate of actuality is either a universal-past-changer (compared to actuality) or a law-breaker (with respect to actuality). For if the laws are deterministic, then if worlds W and @ match on any temporal segment, they’ll match simpliciter. Now consider the “closest OPEN worlds to actuality” as above. We can see a priori that this set contains either past-changers or law-breakers (or possibly both). Lewis, of course, set things up so the latter possibility is realized, and this is what led to Barker’s worries. But if universal-past-changers can’t be among the closest OPEN-worlds, then we’ll be forced to some sort of law-breaking conception.
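
The determinism point can be stated compactly (a rough formulation; the notation, with w \approx_T @ for exact match between w and @ over the temporal segment T and L for the deterministic actual laws, is mine):

  if w \models L and @ \models L and w \approx_T @ for some temporal segment T, then w \approx_{T'} @ for every temporal segment T'

Contraposing: any world distinct from actuality either matches @ over no temporal segment at all (a universal-past-changer), or else violates L somewhere (a law-breaker with respect to actuality).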

What’s wrong with past-changing, then? Well, there are some famous puzzle cases if the past is wildly different from the present. Bennett’s “logical links” worry, for example, concerns counterfactuals like the following: “If I’d been fitter, I’d have made it to the top of the hill where the Romans built their fort many years ago”. Here, the intuitively true counterfactual involves a consequent containing a definite description, and the description singles out an individual via a past-property. If we couldn’t presuppose that, in the counterfactually nearest world, individuals now around retained their actual past properties, these kinds of counterfactuals would be dodgy. But they’re totally smooth. And it’s pretty easy to see that the pattern generalizes. Given a true counterfactual “If p, then a is G”, and some truth about the past Q, we construct a definite description for a that alludes to Q (e.g. “the hill which is located a mile from where, 1000 years ago, two atoms collided on such-and-such trajectories”). The logical links pattern seems to me the strongest case we have against past-changing. If so, then independently of a Lewisian analysis of closeness, we can see that the closest worlds will be lawbreakers not past-changers.

Two things to note here. First, universal-past-changers need not be macro-past-changers. For all we’ve said, the past-changing worlds in the closest OPEN-set might coincide with actuality when it comes to the distribution of thermodynamic properties over small regions of spacetime (varying only on the exact locations etc. of the microparticles). For all we’ve said, there may be worlds with the same macro-past as actuality in the closest set that are entirely legal. If we have to give up logical-links style cases, but only where the descriptions involve microphysics, then that’s a surprise but perhaps not too big a cost—ordinary counterfactuals (including those with logical links to the macro-past) would come out ok. I’m not sure I’d like to assert that there are such past-changing macro-past duplicators around to play this role; but I don’t see that philosophers are in a position to tell from the armchair that there aren’t any—which is bad news for an aprioristic argument that the worlds-semantics is hopeless. Second, even if we could establish that lawbreaking is the only way to go, that doesn’t yet give us enough to generate Barker’s argument. Once we have that W (our representative closest OPEN world) is a lawbreaker, the argument proceeds by drawing out the consequences of this for the closest THROW worlds to W. And that step of the argument just can’t be reconstructed, as far as I can see, without appeal to Lewis’s analysis itself. The primitivists, reverse-analysts, alternative-analysts and the like are all still in the game at this point.

So I think that Barker’s argument really can’t plausibly be seen as targeting the worlds-semantics as such. Indeed, I can’t see that it has any dialectical force against Stalnaker, who to say the least looms large in this literature.

But what if one insisted that the target of the criticism is not these unspecified clouds of worlds-theorists, but Lewis himself, or at least those who (perhaps unreflectively) go along with Lewis? To narrow the focus in this way would mean cutting out many of the methodological morals that Barker attempts to draw from his case. But it’s still an interesting conclusion—after all, many people do hand-wave towards the Lewisian account, and if it’s in trouble, perhaps philosophers will awaken from their dogmatic slumbers and realize the true virtues of the pragmatic metalinguistic account, or whatever else Barker wishes to sell us.
I think this really is where we should be having the debate; the case does seem particularly difficult for Lewis. What I want to argue, however, is that the fallback options Barker argues against, though they may convince others, have little force against Lewis and the Lewisians.

The first fallback option that Barker considers is minimal law-modification. This is where we postulate that W has laws that look very similar to those of the actual world—except they allow for a one-off, specified exception. Suppose that the only violation of @’s laws in W occurs localized in a tiny region R. Then if some universal generalization (for all x,y,z, P) is a law of @, the corresponding W-law will be: (for all x,y,z that aren’t in region R, P). If we want to get more specific, we could add a conjunct saying what *should* happen in region R.
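
Schematically (the notation, including the location predicate In, is mine): if (\forall x, y, z) P(x, y, z) is a law of @ and R is the region containing the lone violation, the minimally modified W-law would be something like

  (\forall x, y, z)(\neg In(x, y, z, R) \rightarrow P(x, y, z))

optionally conjoined with a further clause saying what does happen inside R.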

Barker rightly reminds us of Lewis’s Humeanism about laws—what it is to be a law in a world is to be a theorem of the best (optimally simple/informative) system of truths in that world. In the case at hand, it seems that the sort of hedged generalizations just given will be part of such a best system—what’s the alternative? The original non-hedged generalization isn’t even true, so on the official account it isn’t in the running to be a law. The hedged generalization is admittedly a little less simple than one might like—though not by much—and leaving it out gives up so much information that it’s hard to see how the system of axioms without the hedged generalization could end up beating a system including the hedged axiom for “best system”. Whether to go for the version that is silent about the behaviour in R, or one that spells it out, turns on delicate issues about whether increased complexity is worth the increased strength, and I don’t think we have to deal with those here (if I had to bet, I’d go for silence over specification). Lewis’s account, I think, predicts that minimal mutilation laws are the laws of W. If this is right, then Barker’s argument against this fallback option is inter alia an objection to Lewis’s Humean account of laws. So what is his objection?

Barker’s objection to such minimal mutilation laws is that they don’t fit with “the basic explanatory norms of physical science”. Either we have no explanation for why the regularity has a kink at region R, or we have an explanation, but it alludes to region R itself. But if there’s “no explanation” then our law isn’t a real law; and if we have to allude to region R, we are assigning physical (explanatory) significance to a particular region, which goes against practice in physical science.

As a Humean, Lewis is up to his ears in complaints from realists about laws that his laws don’t “explain” their instances. So complaints that the laws of W aren’t “real” because they’re not “explanatory” should send up danger-signals. Now, on the option where we build what happens at R into the W-laws, we at least have a minimal kind of explanation—provability from a digested summary of what actually happens. That’s not a terribly interesting form of explanation, but it’s all we have even in the best case. If the W-laws simply leave a gap, we don’t even have that (though of course, we do have the minimal “explanation” for every other region of space). But who’s worried? It’s no part of Lewis’s conception of laws that they explain things in this sense. Indeed, if you look through his overall programme of vindicating Humean supervenience (exactly the programme, by the way, that motivates the severe constraints on the analysis of counterfactuals) then you’ll see that the theoretical role for laws is essentially exhausted by fixing the truth-conditions of standard counterfactuals—of course, counterfactuals are used all over the place (not least in articulating Lewis’s favoured view of what it is for one proposition to explain another). And the W-laws seem to play that role pretty well. Another way to put this: for Lewis’s purposes of analyzing closeness, verbal disputes about what is or isn’t a “law” aren’t to the point. Call them Lewis-laws, if you like, stipulatively defined via the best system analysis. If Lewis-laws, Lewis-miracles and the rest successfully analyze closeness, then (assuming that the rest of Lewis’s account goes through) we can analyze closeness ultimately in terms of the Humean supervenience base.

So I think Lewis (and other Humeans) wouldn’t find much to move them in Barker’s account. And—to repeat the dialectical point from earlier—if we’re not Humeans, it’s not clear what the motivation is for trying to analyze closeness in the ultra-sparse way Lewis favours.

But the points just made aren’t entirely local to Lewis and the Lewisians. Suppose we bought into a strong thesis that laws must explain their instances, and on this basis both rejected Humeanism and agreed with Barker that the hedged generalizations aren’t laws in W. Lewis’s analysis of counterfactual closeness now looks problematic, for exactly the reasons Barker articulates. One reaction, however, is simply to make a uniform substitution in the analysis. Wherever Lewis wrote “law”, we now write “Lewis-law”. Lewis-laws aren’t always laws, we’ll now say, since they don’t explain their instances. Indeed, the explanation for why we have the regularities constitutive of Lewis-laws in the actual world is that the real laws are there, pushing and pulling things into order. But Lewis-laws, rather than genuine laws, are what’s needed to analyze closeness, as the Barker cases show us.

To conclude. We should distinguish the general family of worlds-semantics from the particular analytic project that Lewis was engaged in. Among the family are many different projects, of varying ambition. Barker’s arguments are particular to Lewis’s own proposal. And once we’re dealing with Lewis himself, Barker’s “fallback” position of minimal mutilation laws turns out to be predicted by Lewis’s own Humeanism—and Barker’s objections are simply question-begging against the Humean. Finally, since even a non-Humean can avail themselves of “Lewis-laws” for theoretical purposes whenever they prove useful, there is an obvious variant of the Lewis analysis that, for all that’s been said, we can all agree to. Worlds accounts of counterfactuals, in all their glorious variety, are alive and well.

Subject-relative safety and nested safety

The paper I was going to post took off from very interesting recent work by John Hawthorne and Maria Lasonen that creates trouble from the interaction of safety constraints and a plausible-looking principle about chance and close possibilities. The major moving part is a principle that tells you (roughly) that whenever a proposition is high-chance at w,t, then some world compatible with the proposition is a member of the “safety set” relevant to any subject’s knowledge at w,t (the HCCP principle).
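
Stated schematically (my formulation; H&L’s official version differs in detail), where Ch_{w,t} is the chance function at w and t, and S_{w,t}(s) is the safety set relevant to subject s’s knowledge at w and t:

  HCCP: if Ch_{w,t}(p) is high, then for any subject s there is some w' \in S_{w,t}(s) at which p is true

“High” is left as vague here as in the rough statement above.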

It’s definitely worth checking out Williamson’s reply to H&L. There’s lots of good stuff in it. Two relevant considerations: he formulates a version of safety in the paper that is subject-relativized (one of the “outs” in the argument that H&L identify), and defends this against the criticisms they offer. And he rejects the HCCP principle. The basic idea is this: take some high-but-not-1-chance proposition that’s intuitively known, e.g. that the ball is about to hit the floor. And consider a world in which this scenario is duplicated many times—enough so that the generalization “some ball fails to hit the floor” is high-chance (though false). Each individual conjunct seems epistemically on a par with the original. But by HCCP, there’s some failure-to-hit world in the safety set, which means at least one of the conjuncts is unsafe and so not known.

Rejecting HCCP is certainly sufficient to get around the argument as stated. But H&L explicitly mention subject-relativization of safety sets as a different kind of response, *compatible* with retaining HCCP. The idea, I take it, is that if safety sets (at a given time) can vary, *different* “some ball failing to hit the floor” possibilities could be added to the different safety sets, satisfying HCCP but not necessarily destroying any of the distributed knowledge claims.

I see the formal idea, which is kind of neat. The trouble I have with this is that I’ve got very little grip on *how* subject-relativization would get us out of the H&L trouble. How can particular facts about subjects change what’s in the safety set?

I’m going to assume the safety set (for a subject, at a given time and place) is always a Lewisian similarity sphere—that is, for some formal similarity ordering of worlds, the safety sphere is closed downwards under “similarity to actuality”. I’ll also assume that *similarity* isn’t subject-relative, though for all I’ll say it could vary e.g. with time. The assumptions are met by Lewis’s account of counterfactual similarity—in fact, for him similarity isn’t time-relative either—but many other theories can also agree with this.

The assumption that the safety set is always a similarity sphere (in the minimal sense) seems a pretty reasonable requirement, if we’re to justify the gloss of a safety set as a set of the “sufficiently close worlds”.

But just given the minimal gloss, we can get some strong results: in particular, that safety sets for different subjects at a single time will be nested in one another (think of them as “spheres around actuality”–given minimal formal constraints, which Lewis articulates, the “spheres” are nested, as the name suggests).

Suppose we have n subjects in an H&L putative “distributed knowledge” case as described earlier. Now take the minimal safety set M among those n subjects. This exists and is a subset of the safety sets of all the others, by nesting. And by HCCP, it has to include a failure-to-hit possibility within it. Say the possibility that’s included in M features ball k failing to hit. But this means that that possibility is also in the safety set relevant to the kth person’s belief that their ball *will* hit the ground, and so their actual belief is unsafe and can’t count as knowledge—exactly the situation that relativizing to subjects was supposed to save us from!
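
Here is a toy model of that derivation in Python. Everything about it (the worlds, the numerical ordering, the radii) is invented purely for illustration; the structural point is just that downward-closed sets under a single shared ordering are nested, so whatever failure world HCCP forces into the minimal set lands in some particular subject’s set.

  # World 0 is actuality; world i (for i = 1..3) is a world where ball i fails to hit.
  # Similarity is just numerical distance from 0, and a "safety sphere" is any set of
  # worlds closed downwards under that ordering.
  n_subjects = 3
  worlds = list(range(n_subjects + 1))

  def sphere(radius):
      """All worlds at least as close to actuality as `radius` -- downward closed."""
      return {w for w in worlds if w <= radius}

  # Subject-relative safety sets, each a similarity sphere (with different radii).
  safety = {1: sphere(1), 2: sphere(2), 3: sphere(3)}

  # Spheres defined from one shared ordering are nested: the smallest sits inside all the others.
  minimal_subject = min(safety, key=lambda s: len(safety[s]))
  M = safety[minimal_subject]
  assert all(M <= safety[s] for s in safety)

  # HCCP forces some failure-to-hit world into M; here it is the world where ball k fails.
  k = max(M - {0})

  # By nesting, that very world also sits in subject k's own safety set, so subject k's
  # belief that their ball will hit is unsafe after all.
  assert k in safety[k]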

The trouble is, the sort of rescue of distributed knowledge sketched earlier relies on the thought that safety sets for subjects at a time might be “petal shaped”—overlapping, but not nested in one another. But thinking of them as “similarity spheres”, where similarity is not subject relative, simply doesn’t allow this.

Now, this doesn’t close off this line of inquiry. Perhaps we *should* make similarity itself relative to subjects or locations (if so, then we definitely can’t use Lewis’s “Time’s arrow” sense of similarity). Or maybe we could relax the formal restrictions on similarity that allow us to derive nesting (if worlds can be incomparable in terms of closeness to actuality, we get failures of nesting—weakening Lewis’s formal assumptions in this way weakens the associated logic of counterfactuals to Pollock’s SS). But I do think that it’s interesting that the kind of subject-relativity of closeness that might be motivated by e.g. interest-relative invariantism about knowledge (the idea that how “close” a world has to be to get into the safety set depends on the interests etc. of the knower) simply doesn’t do enough to get us out of the H&L worries. We need a much more thorough-going relativization if we’re going to make progress here.

Safety and lawbreaking

One upshot of taking the line on the scattered match case I discussed below is the following: if @ is deterministic, then legal worlds (aside from @) are really far away, on grounds of utterly flunking the “perfect match” criterion. If perfect match, as I suggested, means “perfect match over a temporal segment of the world”, then legal worlds just never score on this ground at all.

Here’s one implication of this. Take a probability distribution compatible with determinism—like the chances of statistical mechanics. I’m thinking of this as a measure over some kind of configuration space—the space of nomically possible worlds. So subsets of this space correspond to propositions that (if we choose them right) have high probability, given the macro-state of the world at the present time. And we can equally consider the conditional probability of those on x pushing the nuclear button. For many choices of P which have high probability conditional on button-pressing, “button-pressing>~P” will be true. For the closest worlds where the button-pressing happens are going to be law-breaking worlds, not legal worlds, so any proposition true only at legal worlds will not obtain, given the counterfactual. But sets of such worlds can of course get high conditional probability.

There’s an analogue of this result that connects to recent work on safety by Hawthorne and Lasonen-Aarnio. First, presume that the safety set at w,t (roughly, the set of worlds at which we mustn’t believe falsely that p, if we are to have knowledge that p) is a similarity sphere in Lewis’s sense. That is: any world counterfactually as close as a world in the set must be in the set. If any legal world is in the set, all worlds with at least some perfect match will also be in that set, by the conditions for closeness previously mentioned. But that would be crazy—e.g. there are worlds where I falsely believe that I’m sitting in front of my computer, on the same basis as I do now, which have *some* perfect match with actuality in the far distant past (we can set up mad scientists etc. to achieve this with only a small departure from actuality a few hundred years ago). So if the safety set is a similarity sphere, and the perfect match constraint is taken as I urged, then there had better not be any legal worlds in the safety set.

What this means is that a fairly plausible principle has to go: that if, at w and t, P is high probability, then there must be at least one P-world in the safety set at w and t. For as noted earlier, law-entailing propositions can be high-probability. But massive scepticism results if worlds verifying them are included in the safety set. (I should note that Hawthorne and Lasonen don’t endorse this principle, but only the analogous one where the “probabilities” are fundamental objective chances in an indeterministic world—but it’s hard to see what could motivate acceptance of that and non-acceptance of the above).

What to give up? Lewis’s lawbreaking account of closeness? The safety set as a similarity sphere? The probability-safety connection? The safety constraint on knowledge? Or some kind of reformulation of one of the above to make them all play nicely together? I’m presently undecided….

Counterfactuals and the scattered match case

One version of Lewis’s worlds-semantics for counterfactuals can be put like this: “If it were that A, then it would be that B” is true at @ iff all the most similar A-worlds to @ are B-worlds. But what notion of similarity is in play? Not all-in overall approximate similarity; otherwise (as Fine pointed out) a world in which Nixon pressed the button, but it was quickly covered up, and things at the macro-level approximately resemble actuality from then on, would count as more similar to @ than worlds where he pressed the button and events took their expected course: international crisis, bombings, etc. Feed that into the clause for conditionals and you get false counterfactuals coming out true: e.g. “If Nixon had pressed the button, everything would be pretty much the way it actually is”.

In “Time’s arrow”, Lewis proposed a system of weightings for the “standard ordering” of counterfactual closeness. They’re intended to apply only in cases where the laws of nature of @ are deterministic. Roughly stated, worlds are ordered around @ by the following principles:

  1. It is of the first importance to avoid big, widespread violations of @’s laws
  2. It is of the second importance to maximize the region of exact intrinsic match to @ in matters of particular fact
  3. It is of the third importance to avoid even small violations of @’s laws
  4. It is of little or no importance to maximize approximate similarity to @

These, he argued, gave the right verdict on the Nixon counterfactuals. For the Fine-style cover-up worlds secure only approximate match with @’s future, and approximate match counts for little or nothing. The most similar button-pushing worlds by the above lights, said Lewis, would be worlds that perfectly matched @ up to a time shortly before the button-pressing, diverged by a small law-violation, and then let events run on wherever the laws of nature took them—presumably to international crisis, nuclear war, or whatever. Such worlds are optimal as regards (1), ok as regards (2) (because of the past match), and ok as regards (3) (only one violation of law needed). (Let’s suppose that approximate convergence has no weight—it’ll make life easier). Pick one such world and call it NIX.

If this is to work, it better be that no “approximate future convergence” world does better by this system of weights than NIX. It’d be pretty easy to beat NIX on ground (3)—just choose any nomically possible world and you get this. But the key issue is (2), which trumps such considerations. Are there approximate future convergence worlds that match or beat NIX on this front?

Lewis thought there wouldn’t be. NIX already secures perfect match up until the 70’s. So what we’d need is perfect convergence in the future (after the button-pressing). But Lewis thought that to do this, we’d have to invoke many, many violations of law, to wipe out the traces of the button-pushing (set the lightwaves back on their original course, as it were). We’d need a big and diverse miracle to get perfect future match. But such worlds are worse than NIX by point (1), which is of overriding importance.
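
Here is a crude Python sketch of the lexicographic comparison at work in this verdict. The numerical scores are invented for illustration only; Lewis’s analysis itself doesn’t tell us how to compute them, and the point is just how the four criteria are ranked.

  from dataclasses import dataclass

  @dataclass
  class WorldScore:
      """Illustrative scores for a candidate world, relative to @ (all numbers made up)."""
      big_miracles: int      # big, widespread violations of @'s laws (criterion 1)
      perfect_match: float   # extent of exact intrinsic match with @ (criterion 2)
      small_miracles: int    # small, local violations of @'s laws (criterion 3)
      approx_match: float    # approximate similarity to @ (criterion 4)

  def closeness_key(w: WorldScore):
      """Lexicographic comparison: earlier criteria trump later ones; smaller key = closer to @.
      Criterion 4 is given no weight, as suggested in the text."""
      return (w.big_miracles, -w.perfect_match, w.small_miracles)

  # NIX: perfect match up to the 1970s, one small divergence miracle, lawful thereafter.
  NIX = WorldScore(big_miracles=0, perfect_match=1.0, small_miracles=1, approx_match=0.2)

  # A perfect-reconvergence world: matches @'s past and (after reconverging) its future too,
  # but needs a big, diverse miracle to wipe out the traces of the button-pressing.
  RECONVERGE = WorldScore(big_miracles=1, perfect_match=2.0, small_miracles=1, approx_match=0.9)

  # Criterion 1 trumps the extra perfect match, so NIX comes out closer to @.
  assert closeness_key(NIX) < closeness_key(RECONVERGE)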

Now *some* miracle would be needed if we’re to get perfect match over some future time-segment. Here’s the intuitive thought. Suppose A is a button-pushing world that perfectly matches @ at some future time T. Run the laws of nature backwards from T. If the laws are deterministic, you’ll get exact match at all times prior to T until you hit some violation of law. But the button-pushing happens in A and not in @, so they can’t be duplicates throughout. So there must be some miracle that happens between the button-pressing and T.

First thought. This doesn’t yet make the case that, for the reconvergence to happen, we need lots of violations all over the place. Why couldn’t there be worlds where a tiny miracle at a suitable “pressure point” effects global reconvergence?

Rejoinder: one trouble with this idea is that presumably (as Lewis notes) the knock-on effects of the first divergence spread quickly. In the few moments it takes to get Nixon to press the button, the divergences from actuality are presumably covering a good distance (consider those light-waves!). So how could a single *local* miracle possibly undo this effect? If a beam of light that wouldn’t otherwise be there is racing away from the first event, then changes resulting from the second (small, local) miracle aren’t going to catch up with it. There are probably some major assumptions about locality of causation etc. packed in here. But it does seem like Lewis is pretty well justified in the claim that it’d take a big, widespread miracle to reconverge.

Second thought. Consider a world that, like NIX, diverges from actuality just at the button-pressing moment. Let it never perfectly match @ again, and let it contain no more miracles. In that case, it looks like (so far as we’ve said) it *exactly ties* with NIX for closeness. But now: couldn’t one such world have approximate match to @ in the future? That would require some *deterministic* progress from button-pushing to (somehow) the nuclear launch not happening, and a lot of (deterministic) coverup. A big ask. But to say that there is just no world meeting this description seems an equally big commitment.

Rejoinder. I’m not sure how Lewis should respond to this one. He mentions very plausible cases where slight differences would add up: slight changes of tone in the biography, influencing readers differently, changing their lives, etc. It’s very very plausible that such stuff happens. But is it *nomically impossible* that approximate similarity be maintained? I just don’t see the case here.

(A note on what’s at stake here. Unlike perfect reconvergence, if Lewis allowed such approximate reconvergence worlds, you wouldn’t get “If Nixon had pressed the button, things would be approximately the same” coming out true. For the most we’d get is that these approximate coverup worlds are as close as NIX. NIX ensures that counterfactuals like the above are false—approximate similarity wouldn’t ensue at all the most similar button-pushing worlds. But the approximate convergence world would equally ensure the falsity of ordinary counterfactuals, e.g. “If Nixon had pressed the button, things would be very different”. More generally, the presence of such approximate reconvergence worlds would make lots of ordinary counterfactuals false.)

Third thought. Lewis raises the possibility of entirely legal worlds that resemble @ in the 1970’s, but feature Nixon pressing the button. As Lewis emphasizes, such worlds can’t perfectly match temporal slices of @ at any time, if they involve no violation of deterministic law. Lewis really has two things to say about such worlds. First, he says there’s “no guarantee” that any such world will even approximately resemble @ in the far distant future or past. He says: “it is hard to imagine how two deterministic worlds anything like ours could possibly remain only a little bit different for very long. There are altogether too many opportunities for little differences to give rise to big differences”. But second, given the four-part analysis given above, these worlds aren’t going to be good contenders for similarity, since e.g. they’ll never perfectly match @ at any time.

Let’s suppose Lewis is wrong on the first point: that there are nomic possibilities approximately like ours throughout history, except for the button-pushing. I’m not sure what exactly the case against these worlds being close is, on the four-part analysis. Sure, NIX has perfect match throughout the whole of history up till the 1970’s, and the worlds just discussed don’t have that. But condition (2) just says that we have to maximize the region of perfect match—and maybe there are other ways to do that.

One idea is that worlds like these could earn credit by the lights of (2), by having large but scattered match with @. Suppose there’s a legal button-pushing world W, approximately matching @ before the button-pushing, and such that post-pressing, there are infinitely many centimetre-cubed-by-1-second regions of space-time at which the trajectories and properties of particles *within that region* exactly match those in the corresponding region of @. You might well think that in a putative case of approximate match (including approximate match of futures) there’d be lots of opportunities for this kind of short-lived, spatially limited coincidence.

So how does (2) handle these cases? It’s just not clear—it depends on what “maximizing the region of perfect match” means. Maybe we’re supposed to look at the sheer volume of the regions where there is perfect fit. But that’ll do no good if the volumes are each infinite. In a world with infinite past and infinite future, exact match from the 1970’s back “all the way” doesn’t have a greater volume than the sum of infinitely many scattered regions, if both volumes are infinite. In a world with finite past but infinite future, continued sparse scattered future match could have *infinite* volume, as opposed to the finite volume of perfect match secured for NIX.

This causes problems even without the reconvergence. We want “button pressing” worlds not to diverge too early. Divergence in the 1950’s, with things being very different from then on, ultimately ending with a Soviet stooge Nixon pressing the button, is not the kind of most-similar world we want. (2) is naturally thought to help us out—maximizing perfect match is supposed to pressure us to put the divergence event as late as possible. But if we look only at the relative volumes of perfect match, in cases of an infinite past, the volumes of perfect match will be the same. This suggests we look, not at volumes, but at subregionhood. w will be closer to @ than u (all else equal) if the region through which w perfectly matches @ is a proper superregion of that through which u perfectly matches @. But this won’t promote NIX over scattered perfect match worlds—since in neither case do the regions of perfect match completely overlap the other’s.

Perhaps there are more options. One thought is to look at something like the ratio of the volume of regions of perfect match to the volume of regions of non-perfect match at each time. Scattered match clearly goes with a low density of perfect match at times, in this sense—whereas in NIX the density at any pre-divergence time will be 1. How to work this into a proposal for understanding the imperative “maximize perfect match!” I don’t know.

Unless we say *something* to rule out scattered perfect match worlds, prima facie they could match the extent of match in NIX. But then, because they never violate the laws, while NIX does (albeit once), they beat out NIX on (3). So in this case (unlike the approximate future match case above) we’re back to a situation where there’s a danger of declaring the “future similarity” counterfactual true, as well as the ordinary counterfactuals false.

Let’s review the three cases. First, there was the possibility of getting exact reconvergence to @ at future time T, via a single miracle. Second, there was the possibility of approximate future similarity without any perfect similarity. Third, there was the possibility of approximate overall match throughout time, with local, scattered, perfect match.

In effect, Lewis in “Time’s Arrow” doubts whether there are possibilities matching any of these descriptions. I thought that we could give some prima facie substance to that doubt in the first case. In the other two, I can’t see what the principled position is other than agnosticism, as yet. Lewis says, for example, about the third kind of case, that it’s “hard to imagine” how two worlds could approximately resemble each other in this way, and that there’s “no guarantee” that they’ll be like this. But is this good enough? Lots of things about nomic space are hard to imagine. Have we any positive reason to doubt that possibilities of type 3 exist? Personally, in the absence of evidence, I’ll go 50/50 on whether they exist. But that’s to go 50/50 on whether Lewis’s favoured account makes most ordinary counterfactuals false. Not a good result.

I do have one positive suggestion that’ll fix up the third case. Again, it comes down to what we’re trying to maximize in maximizing regions of perfect fit. The proposal is that we insist on complete temporal slices perfectly matching @, before we count them towards closeness as outlined in (2). That is, (2) should be understood as saying: maximize the *temporal segment* in which you have perfect fit. Now we can appeal to determinism to show that legal worlds will *never* perfectly match @ at any time—and so *automatically* flunk (2) to the highest possible degree.

So the state of play seems to me this. There are plausible grounds for having low credence in the first worry with the account. And precisifying “perfect match” in the way just suggested deals with the third one. That only leaves the second worry—perfect past match + small violation + approximate future match.

I do want to emphasize one thing here. It is significant that the remaining problem, unlike the others, doesn’t make the offending “future similarity” counterfactual *true*. Those objections, had they been successful, would have promised the result that *all* the most similar worlds have futures like ours, rather than like NIX’s. But all we get with the residual objection, if it’s successful, is that *some* of the most similar worlds are of the offending type—for all we’ve said, *most* of the most similar worlds would be like NIX.

This brings into play other tweaks to the setting. Some (like Bennett) want, for independent reasons, to change Lewis’s truth-conditions from “B is true at all the closest A worlds” to “B is true at most/the vast majority of the closest A worlds”. One could make this move against the current worry, but not against the other two.

I’m not a particular fan of the revisions to the logic of counterfactuals this suggestion would induce. There’s another thought I’m more sympathetic to. That’s to go Stalnakerian on the truth conditions, viewing what Lewis thinks of as “ties for closeness” as cases of indeterminacy in a total ordering. If so, what we’d get from the above is at most that counterfactuals like “If Nixon had pressed the button, things would have been very different” are indeterminate (because false on at least one precisification of the ordering).

It’s not clear to me that this is a bad result. It depends very much on the “cognitive role of indeterminacy” that I’ve talked about ad nauseam before on this blog. If one can perfectly rationally be arbitrarily highly confident of indeterminate propositions, then no revision to our ordinary credences in ordinary counterfactuals need be induced by admitting them to be indeterminate. If, on the other hand, you take a “rejectionist” view of indeterminacy where it acts a bit like presupposition failure, this option is no more comfortable than admitting that most counterfactuals are false.

Anyway, just to emphasize: if these options are even going to be runners, we’re going to have to do something about the scattered match case.

Conditionals in Budapest

This event looks fabulous—over a week on conditionals in the company of Stanley, Loewer, Edgington, Hajek, Kratzer, and Stalnaker.

I’m actually due to be in Australia during July, but if I were in Europe, I’d be there.

Chancy counterfactuals—three options

I was chatting to Rich Woodward earlier today about Jonathan Bennett’s attitude to counterfactuals about chancy events. I thought I’d put down some of the thoughts I had arising from that conversation.

The basic thought is this. Suppose that, were A to happen, it would be overwhelmingly likely that B—but not probability 1 that B would occur. Take some cup I’m holding—if I were to drop it out the window, it’s overwhelmingly likely that it would fall to the floor and break, rather than shoot off sideways or quantum tunnel through the ground. But (we can suppose) there’s a non-zero—albeit minuscule—chance that the latter things would happen. (You don’t need to go all quantum to get this result—as Adam Elga and Barry Loewer have emphasized recently, if we have counterfactuals about macroevents, the probabilities involved in statistical mechanics also attribute tiny but nonzero probability to similarly odd things happening).

The question is, how should we evaluate the counterfactual “Drop>Break” taking into account the fact that given that Drop, there’d be a non-zero but tiny chance that ~Break?

Let’s take as our starting point a Lewisian account of the counterfactual—“A>B” is to be true (at w) iff B is true at all the closest A-worlds to w. Then the worry many people have is that though the vast majority of closest possible Drop-worlds will be Break worlds, there’ll be a residual tiny minority of worlds where it won’t break—where quantum tunnelling or freaky statistical mechanical possibilities are realized. But since Lewis’s truth-conditions require that Break be true at *all* the closest Drop-worlds, even that tiny minority suffices to make the counterfactual “Drop>Break” false.

As goes “Drop>Break”, so goes almost every ordinary counterfactual you can think of. Almost every counterfactual would be false, if the sketch just given is right. Some people think that’s the right result. We’ll come back to it below.

Lewis’s own response is to deny that the freaky worlds are among the closest worlds. His idea is that freakiness (or as he calls it, the presence of “quasi-miracles”) itself is one of the factors that pushes worlds further away from actuality. That’s been recently criticised by John Hawthorne among others. I’m about to be in print defending a generally Lewisian line on these matters—though the details are different from Lewis’s and (I hope) less susceptible to counterexample.

But if you didn’t take that line, what should you say about the case? A tempting line of thought is to alter Lewis’s clause—requiring not truth at all the closest worlds but truth at most, or the overwhelming majority of them. (Of course, this idea presumes it makes sense to talk of relative proportions of worlds—let’s spot ourselves that).

This has a marked effect on the logic of counterfactuals—in particular, the agglomeration rule (A>B, A>C, therefore A>B&C) would have to go (Hawthorne points this out in his discussion, IIRC). To see how this could happen, suppose that there are 3 closest A-worlds, and X needs to be true at 2 of them in order for “A>X” to be true. Then let the worlds respectively be B&C, ~B&C, and B&~C worlds. This produces a countermodel to agglomeration.
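
A quick check of that countermodel (a toy encoding of my own):

  # Three closest A-worlds; on the "near miss" proposal, A>X is true iff X holds at
  # at least 2 of the 3.
  closest_A_worlds = [
      {"B": True,  "C": True},    # a B&C world
      {"B": False, "C": True},    # a ~B&C world
      {"B": True,  "C": False},   # a B&~C world
  ]

  def near_miss_true(consequent):
      return sum(1 for w in closest_A_worlds if consequent(w)) >= 2

  print(near_miss_true(lambda w: w["B"]))              # A>B   : True  (2 of 3)
  print(near_miss_true(lambda w: w["C"]))              # A>C   : True  (2 of 3)
  print(near_miss_true(lambda w: w["B"] and w["C"]))   # A>B&C : False (only 1 of 3)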

Agglomeration strikes me as a bad thing to give up. I’m not sure I have hugely compelling reasons for this, but it seems to me that a big part of the utility of counterfactuals lies in our being able to reason under a counterfactual supposition. Given agglomeration you can start by listing a bunch of counterfactual consequences (X, Y, Z), reason in standard ways (e.g. perhaps X, Y, Z entail Q) and then conclude that, under that counterfactual supposition, Q. This is essentially an inference of the following form:

  1. A>X
  2. A>Y
  3. A>Z
  4. X,Y,Z\models Q

Therefore: A>Q.

And I think this should be generalized to arbitrarily many premises. If we have that, counterfactual reasoning seems secure.

But agglomeration is just a special case of this, where Q=X&Y&Z (more generally, the conjunction of the various consequents). So if you want to vindicate counterfactual reasoning of the style just mentioned, it seems agglomeration is going to be at the heart of it. I think giving some vindication of this pattern is non-negotiable. To be honest though, it’s not absolutely clear that making it logically valid is obviously required. You might instead try to break this apart into a fairly reliable but ampliative inference from A>X, A>Y, A>Z to A>X&Y&Z, and then appeal to this and the premise X&Y&Z\models Q to reason logically to A>Q. So it’s far from a knock-down argument, but I still reckon it’s on to something. For example, anyone who wants to base a fictionalism on counterfactuals (were the fiction to be true then…) better take an interest in this sort of thing, since on it turns whether we can rely on multi-premise reasoning to preserve truth-according-to-the-fiction.

Jonathan Bennett is one who considers altering the truth clauses in the way just sketched (he calls it the “near miss” proposal–and points out a few tweaks that are needed to ensure e.g. that we don’t get failures of modus ponens). But he advances a second non-Lewisian way of dealing with the above cases.

The idea is to abandon evaluations of counterfactuals being true or false, and simply assign them degrees of goodness. The degree of goodness of a counterfactual “A>B” is equal to the proportion of the closest A worlds that are B worlds.
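
In the notation used earlier (f(A, w) for the set of closest A-worlds to w), and granting that talk of proportions of worlds makes sense, the proposal is:

  goodness(A>B, w) = |\{w' \in f(A, w) : B is true at w'\}| / |f(A, w)|

In the infinite case some measure over f(A, w) would be needed in place of counting.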

There are at least two readings of this. One is that we ditch the idea of truth-evaluation of counterfactual conditionals altogether, much as some have suggested we ditch truth-evaluation of indicatives. I take it that Edgington favours something like this, but it’s unclear whether that’s Bennett’s idea. The alternative is that we allow “strict truth” talk for counterfactuals, defined by a strict clause—truth at all the closest worlds—but then think that this strict requirement is never met, and so it’d be pointless to actually evaluate counterfactual utterances by reference to this strict requirement. Rather, we should evaluate them on the sliding scale given by the proportions. Really, this is a kind of error theory—but one supplemented by a substantive and interesting-looking account of the assertibility conditions.

Both seem problematic to me. The main issue I have with the idea that we drop truth-talk altogether is the same issue I have with indicative conditionals—I don’t see how to deal with the great variety of embedded contexts in which we find the conditionals—conjunctions, other conditionals, attitude contexts, etc. etc. That’s not going to impress someone who already believes in a probabilistic account of indicative conditionals, I guess, since they’ll have ready to hand a bunch of excuses, paraphrases, and tendencies to bite selected bullets. Really, I just don’t think this will wash—but, anyway, we know this debate.

The other thought is to stick with an unaltered Lewisian account, and accept an error theory. At first, that looks like an advance over the previous proposal, since there’s no problem in generalizing the truth-conditional story to embedded contexts—we just take over the Lewis account wholesale. And this is something of an advance over a brute error-theory, since we’ve got some positive guidance about the assertibility conditions for simple counterfactuals—they’re good to the extent that B is true in a high proportion of the closest A-worlds. And that will make paradigmatic ordinary counterfactuals like “Drop>Break” overwhelmingly good.

But really I’m not sure this is much of an advance over the Edgington-style picture. Because even though we’ve got a compositional story about truth-conditions, we don’t as yet have an idea about how to plausibly extend the idea of “degrees of goodness” beyond simple counterfactuals.

As an illustration, consider “If I were to own a china cup, then if I were to drop it out the window, it’d break”. Following simple-mindedly the original recipe in the context of this embedded conditional, we’d look for the proportion of closest owning worlds where the counterfactual “Drop>Break” is true. But because of the error-theoretic nature of the current proposal, at none (or incredibly few) of those worlds would the counterfactual be true. But that’s the wrong result—the conditional is highly assertible. So the simple-minded application of the original account goes wrong in this case.

Of course, what you might try to do is to identify the assertibility conditions of “Own>(Drop>Break)” with e.g. those of “(Own&Drop)>Break”—so reducing the problem of assertibility for this kind of embedding, by way of paraphrase, to one where the recipe gives plausible results. But that’s to adopt the same kind of paraphrase-to-easy-cases strategy that Edgington likes, and if we’re going to have to do that all the time (including in hard cases, like attitude contexts and quantifiers) then I don’t see that a great deal of advance is made by allowing the truth-talk—and I’m just as sceptical as in the Edgington-style case that we’ll actually be able to get enough paraphrases to cover all the data.

There are other, systematic and speculative, approaches you might try. Maybe we should think of non-conditionals as having “degrees of goodness” of 1 or 0, and then quite generally think of the degree of goodness of “A>B” as the expected degree of goodness of B among the closest A-worlds—that is, we look at the closest A-worlds and the degree of goodness of B at each of these, and “average out” to get a single number we can associate with “A>B”. That’d help in the “Own>(Drop>Break)” case—in a sense, instead of looking at the expected truth value of “Drop>Break” among closest Own-worlds, we’d be looking at the expected goodness-value of “Drop>Break” among Own-worlds. (We’d also need to think about how degrees of goodness combine in the case of truth functional compounds of conditionals—and that’s not totally obvious. Jeffrey and Stalnaker have a paper on “Conditionals as Random Variables” which incorporates a proposal something like the above. IIRC, they develop it primarily in connection with indicatives to preserve the equation of conditional probability with the probability of the conditional. That last bit is no part of the ambition here, but in a sense, there’s a similar methodology in play. We’ve got an independent fix for associating degrees with simple conditionals—not the conditional subjective probability as in the indicative case—rather, the degree is fixed by the proportion of closest antecedent worlds where the (non-conditional) consequent holds. In any case, that’s where I’d start looking if I wanted to pursue this line).
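
One way to make the “averaging out” idea precise is a recursive clause (a sketch only, ignoring measure-theoretic niceties and leaving truth-functional compounds open):

  goodness(\varphi, w) = 1 if \varphi is a non-conditional true at w, and 0 if it is a non-conditional false at w

  goodness(A>B, w) = (1 / |f(A, w)|) \sum_{w' \in f(A, w)} goodness(B, w')

Applied to “Own>(Drop>Break)”, the value at @ is the average, across the closest Own-worlds, of the degree of goodness of “Drop>Break” at each of them, rather than its strict truth value there.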

Is this sort of idea best combined with the Edgington style “drop truth” line or the error-theoretic evaluation of conditionals? Neither, it seems to me. Just as previously, the compositional semantics based on “truth” seems to do no work at all—the truth value of compounds of conditionals will be simply irrelevant to their degrees of goodness. So it seems like a wheel spinning idly to postulate truth-values as well as these “Degrees of goodness”. But also, it doesn’t seem to me that the proposal fits very well with the spirit of Edgington’s “drop truth” line. For while we’re not running a compositional semantics on truth and falsity, we are running something that looks for all the world like a compositional semantics on degrees of goodness. Indeed, it’s pretty tempting to think of these “degrees of goodness” as degrees of truth—and think that what we’ve really done is replace binary truth-evaluation of counterfactuals with a certain style of degree-theoretic evaluation of them.

So I reckon that there are three reasonably stable approaches. (1) The Lewis-style approach where freaky worlds are further away than they’d otherwise be on account of their freakiness—where the Lewis-logic is maintained and ordinary counterfactuals are true in the familiar sense. (2) The “near miss” approach, where the logic is revised but ordinary counterfactuals are still true in the familiar sense. (3) Then there’s the “degree of goodness” approach—which people might be tempted to think of in the guise of an error theory, or as an extension of the Adams/Edgington-style “no truth value” treatment of indicatives—but which I think will have to end up being something like a degree-theoretic semantics for conditionals, albeit of a somewhat unfamiliar sort.

I suggested earlier that an advantage of the Lewis approach over the “near miss” approach was that agglomeration formed a central part of inferential practice with conditionals. I think this is also an advantage that the Lewis account has over the degree-theoretic approach. How exactly to make this case isn’t clear, since it isn’t altogether obvious what the *logic* of the degree-theoretic setting should be—but the crucial point is that “A>X1”, …, “A>Xn” can all be good to a very high degree, while “A>X1&…&Xn” is good to a very low degree. Unless we restrict ourselves to starting points which are good to degree 1, we’ll have to be wary of degradation of degree of goodness while reasoning under counterfactual suppositions, just as on the near miss proposal we’d have to be wary of degradation from truth to falsity. So the Lewisian approach I favour is, I think, the only one of the approaches currently on the table which makes classical reasoning under counterfactual suppositions fully secure.

Defending conditional excluded middle

So things have been a little quiet on this blog lately. This is a combination of (a) trips away, (b) doing administration-stuff for the Analysis Trust, and (c) the fact that I’m entering the “writing up” phase of my current research leave.

I’ve got a whole heap of papers in various stages of completion that I want to get finished up. As I post drafts online, the blogging should become more regular. So here’s the first installment—a new version of an older paper that discusses conditional excluded middle, and in particular a certain style of argument that Lewis deploys against it, and which Bennett endorses (in an interestingly varied form) in his survey book.

What I try to do in the present version—apart from setting out some reasons for being interested in conditional excluded middle for counterfactuals that I think deserve more attention—is to disentangle two elements of Bennett’s discussion. One element is a certain narrow-scope analysis of “might”-counterfactuals (roughly: “if it were that P, it might be that Q” has the form P \rightarrow \Diamond Q, where the modal expresses an idealized ignorance). The second is an interesting epistemic constraint on true counterfactuals I call “Bennett’s Hypothesis”.

One thing I argue is that Bennett’s Hypothesis all on its own conflicts with conditional excluded middle. And without Bennett’s Hypothesis, there’s really no argument from the narrow-scope analysis alone against conditional excluded middle. So really, if counterfactuals work the way Bennett thinks they do, we can forget about the fine details of analyzing epistemic modals when arguing against conditional excluded middle. All the action is with whether or not we’ve got grounds to endorse the epistemic constraint on counterfactual truth.

The second thing I argue is that there are reasons to be severely worried about Bennett’s Hypothesis—it threatens to lead us straight into an error theory about ordinary counterfactual judgements.

If people are interested, the current version of the paper is available here. Any thoughts gratefully received!

CEM journalism

The literature on the linguistics/philosophy interface on conditionals is full of excellent stuff. Here’s just one nice thing we get. (Directly drawn from a paper by von Fintel and Iatridou). Nothing here is due to me: but it’s something I want to put down so I don’t forget it, since it looks like it’ll be useful all over the place. Think of what follows as a bit of journalism.

Here’s a general puzzle for people who like “iffy” analyses of conditionals.

  • No student passes if they goof off.

The obvious first-pass regimentation is:

  • [No x: x is a student](if x goofs off, x passes)

But for a wide variety of accounts, this’ll give you the wrong truth-conditions. E.g. if you read “if” as a material conditional, the regimentation can only be true if every student goofs off and fails; so it comes out false as soon as a single student refrains from goofing off, however the goofing students fare, which seems the wrong result. What is wanted, as Higginbotham urges, is something with the effect:

  • [No x: x is a student](x goofs off and x passes)

This seems to suggest that under some embeddings “if” expresses conjunction! But that’s hardly what a believer in the iffness of if wants.

What the paper cited above notes is that so long as we’ve got CEM, we won’t go wrong. For [No x:Fx]Gx is equivalent to [All x:Fx]~Gx. And where G is the conditional “if x goofs off, x passes”, the negated conditional “not: if x goofs off, x passes” is equivalent to “if x goofs off, x doesn’t pass” if we have the relevant instance of conditional excluded middle. What we wind up with is an equivalence between the obvious first-pass regimentation and:

  • [All x: x is a student](if x goofs off, x won’t pass).

And this seems to get the right results. What it *doesn’t* automatically get us is an equivalence to the Higginbotham regimentation in terms of a conjunction (nor to the Kratzer restrictor analysis). And maybe when we look at the data more generally, we can get some traction on which of these theories best fits with usage.
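Schematically, the chain of equivalences behind this runs as follows (with “F” for “is a student”, “G” for “goofs off”, “H” for “passes”, and “>” for the conditional):

  • [No x: Fx](Gx > Hx)
  • [Every x: Fx] ~(Gx > Hx)
  • [Every x: Fx](Gx > ~Hx)

The first step is just the duality of “no” and “every”; the second is the CEM-driven equivalence between the negated conditional and the conditional with negated consequent.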

Suppose we’re convinced by this that we need the relevant instances of CEM. There remains a question of *how* to secure these instances. The suggestion in the paper is that rules governing legitimate contexts for conditionals give us the result (paired with a contextually shifty strict conditional account of conditionals). An obvious alternative is to hard-wire CEM into the semantics, as Stalnaker does. So unless you’re prepared (with von Fintel, Gillies et al.) to defend in detail the fine-tuned shiftiness of the contexts in which conditionals can be uttered, it looks like you should smile upon the Stalnaker analysis.

[Update: It’s interesting to think how this would look as an argument for (instances of) CEM.

Premise 1: The following are equivalent:
A. No student will pass if she goofs off
B. Every student will fail to pass if she goofs off

Premise 2: A and B can be regimented respectively as follows:
A*. [No x: student x](if x goofs off, x passes)
B*. [Every x: student x](if x goofs off, ~x passes)

Premise 3: [No x: Fx]Gx is equivalent to [Every x: Fx]~Gx

Premise 4: if [Every x: Fx]Hx is equivalent to [Every x: Fx]Ix, then Hx is equivalent to Ix.

We argue as follows. By an instance of premise 3, A* is equivalent to:

C*. [Every x: student x] not(if x goofs off, x passes)

But C* is equivalent to A*, which is equivalent to A (premise 2) which is equivalent to B (premise 1) which is equivalent to B* (premise 2). So C* is equivalent to B*.

But this equivalence is of the form of the antecedent of premise 4, so we get:

(Neg/Cond instances) ~(if x goofs off, x passes) iff if x goofs off, ~x passes.

And we quickly get from the law of excluded middle and a bit of logic:

(CEM instances) (if x goofs off, x passes) or (if x goofs off, ~x passes). QED.
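(To fill in the “bit of logic”: by excluded middle, either (if x goofs off, x passes) or ~(if x goofs off, x passes); and by the left-to-right direction of the Neg/Cond instance, the second disjunct gives (if x goofs off, ~x passes). Either way, one disjunct of the CEM instance holds.)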

The present version is phrased in terms of indicative conditionals. But it looks like parallel arguments can be run for CEM for counterfactuals (Thanks to Richard Woodward for asking about this). For one of the controversial cases, for example, the basic premise will be that the following are equivalent:

D. No coin would have landed heads, if it had been flipped.
E. Every coin would have landed tails, if it had been flipped.

This looks pretty good, so the argument can run just as before.]

Must, Might and Moore.

I’ve just been enjoying reading a paper by Thony Gillies. One thing that’s very striking is the dilemma he poses—quite generally—for “iffy” accounts of “if” (i.e. accounts that see English “if” as expressing a sentential connective, pace Kratzer’s restrictor account).

The dilemma is constructed around finding a story that handles the interaction between modals and conditionals. The prima facie data is that the following pairs are equivalent:

  • If p, it must be that q
  • If p, q

and

  • If p, it might be that q
  • It might be that (p&q)

The dilemma proceeds by first looking at whether you want to say that the modals scope over the conditional or vice versa, and then (on the view where the modal is wide-scoped) looking into the details of how the “if” is supposed to work and showing that one or other of the pairs comes out inequivalent. The suggestion in the paper is that if we have the right theory of context-shiftiness, and narrow-scope the modals, then we can be faithful to the data. I don’t want to take issue with that positive proposal. I’m just a bit worried about the alleged data itself.

It’s a really familiar tactic, when presented with a putative equivalence that causes trouble for your favourite theory, to say that the pairs aren’t equivalent at all, but can be “reasonably inferred” from each other (think of various ways of explaining away “or-to-if” inferences). But taken cold such pragmatic explanations can look a bit ad hoc.

So it’d be nice if we could find independent motivation for the inequivalence we need. In a related setting, Bob Stalnaker uses the acceptability of Moorean-patterns to do this job. To me, the Stalnaker point seems to bear directly on the Gillies dilemma above.

Before we even consider conditionals, notice that “p but it might be that not p” sounds terrible. Attractive story: this is because you shouldn’t assert something unless you know it to be true; and to say that p might not be the case is (inter alia) to deny you know it. One way of bringing out the pretty obviously pragmatic nature of the tension in uttering the conjunction here is to note that asserting the following sort of thing looks much much better:

  • it might be that not p; but I believe that p

(“I might miss the train; but I believe I’ll just make it”). The point is that whereas asserting “p” is appropriate only if you know that p, asserting “I believe that p” (arguably) is appropriate even if you know you don’t know it. So looking at these conjunctions and figuring out whether they sound “Moorean” seems like a nice way of filtering out some of the noise generated by knowledge-rules for assertion.

(I can sometimes still hear a little tension in the example: what are you doing believing that you’ll catch the train if you know you might not? But for me this goes away if we replace “I believe that” with “I’m confident that” (which still, in vanilla cases, gives you Moorean phenomena). I think in the examples to be given below, residual tension can be eliminated in the same way. The folks who work on norms of assertion I’m sure have explored this sort of territory lots.)

That’s the prototypical case. Let’s move on to examples where there are more moving parts. David Lewis famously alleged that the following pair are equivalent:

  • it’s not the case that: if it were the case that p, it would have been that q
  • if it were that p, it might have been that ~q
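(In Lewis’s symbols, writing \Box\rightarrow for the would-counterfactual and \Diamond\rightarrow for the might-counterfactual, the alleged equivalence is \neg(P \Box\rightarrow Q) \equiv (P \Diamond\rightarrow \neg Q); it falls out of Lewis’s definition of the might-counterfactual as the dual of the would-counterfactual.)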

Stalnaker thinks that this is wrong, since instances of the following sound ok:

  • if it were that p, it might have been that not q; but I believe if it were that p it would have been that q.

Consider for example: “if I’d left only 5 mins to walk down the hill, (of course!) I might have missed the train; but I believe that, even if I’d only left 5 mins, I’d have caught it.” That sounds totally fine to me. There are a few decorations to that speech (“even”, “of course”, “only”). But I think the general pattern here is robust, once we fill in the background context. Stalnaker thinks this cuts against Lewis, since if mights and woulds were obvious contradictories, then the latter speech would be straightforwardly equivalent to something of the form “A and I don’t believe that A”. But things like that sound terrible, in a way that the speech above doesn’t.

We find pretty much the same cases for “must” and indicative “if”.

  • It’s not true that if p, then it must be that q; but I believe that if p, q.

(“it’s not true that if Gerry is at the party, Jill must be too—Jill sometimes gets called away unexpectedly by her work. But nevertheless I believe that if Gerry’s there, Jill’s there.”). Again, this sounds ok to me; but if the bare conditional and the must-conditional were straightforwardly equivalent, surely this should sound terrible.

These sorts of patterns make me very suspicious of claims that “if p, must q” and “if p, q” are equivalent, just as the analogous patterns make me suspicious of the Lewis idea that “if p, might ~q” and “if p, q” are contradictories when the “if” is subjunctive. So I’m thinking the horns of Gillies’ dilemma aren’t equal: denying the must conditional/bare conditional equivalence is independently motivated.

None of this is meant to undermine the positive theory that Thony Gillies is presenting in the paper: his way of accounting for lots of the data looks super-interesting, and I’ve got no reason to suppose his positive story won’t have a story about everything I’ve said here. I’m just wondering whether the dilemma that frames the debate should suck us in.

Edgington vs. Stalnaker

One of the things I’m thinking about at the moment is Stalnaker-esque treatments of indicative conditionals. Stalnaker’s story, roughly, is that indicative conditionals have almost exactly the same truth conditions as (on his theory) counterfactuals do. That is, A>B is true at w iff B is true at the nearest A-world to w. The difference comes only in the fine details about which worlds count as nearest. For counterfactuals, Stalnaker, like Lewis, thinks that some sort of similarity does the job. For indicatives, Stalnaker thinks that the nearness ordering is rooted in the same similarity metric, but distorted by the following overriding principle: if A and w are each consistent with what we collectively presuppose, then the nearest A-worlds will also be consistent with what we collectively presuppose. In the jargon, all worlds outside the “context set” are pushed further out than they would be on the counterfactual ordering.
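To fix ideas, here is one rough way of writing the proposal down (the notation is mine, and it elides Stalnaker’s refinements). Let f(A, w) be the selection function picking out the nearest A-world to w, and let C be the context set, i.e. the worlds compatible with what is collectively presupposed. Then:

  • A>B is true at w iff B is true at f(A, w);
  • for indicatives: if w is in C and A is true at some world in C, then f(A, w) is in C.

The second clause is what does the “pushing”: worlds outside the context set get selected only if they have to be.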

I’m interested in this sort of “push worlds” modal account of indicatives. (Others in a similar family include Daniel Nolan’s theory, whereby it’s knowledge that does the pushing rather than collective presuppositions.) Lots of criticisms of Stalnaker’s theory don’t engage with the fine details of what he says about the closeness ordering, but with more general aspects (e.g. its inability to sustain Adams’ thesis that the conditional probability is the probability of the conditional; its handling of Gibbard cases; its sensitivity to fine factors of conversational context). An exception, however, is an argument that Dorothy Edgington puts forward in her SEP survey article (which, by the way, I very much recommend!).

Here’s the case. Let’s suppose that Jill is uncertain how much fuel is in Jane’s car. The tank has a capacity of 100 miles’ worth, but Jill has no knowledge of what level it is at. Jane is going to drive it until it runs out of fuel. For Jill, the probability of the car being driven for n miles, given that it’s driven for no more than fifty, is 1/50 (for n < 51).

Suppose that in fact the tank is full. The most similar worlds to actuality in which the car is driven for no more than 50 miles are, arguably, those where the tank is 50 per cent full, and so where Jane drives exactly 50 miles. The same goes for any world where the tank is more than 50 per cent full. So, if nearness of worlds is determined by similarity, the conditional is true as uttered at each of the worlds where the tank is more than 50 per cent full. So without knowing the details of the level of the tank, we should be at least 50 per cent confident that if it goes for no more than 50 miles, it’ll go for exactly 50 miles. But this seems all wrong. Varying the numbers we can make the case even worse: we should be almost sure of “If it goes for no more than 3 miles, it’ll go for exactly 3 miles”, even though we regard 3, 2 and 1 as equiprobable fuel levels.
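(To put some illustrative numbers on this, numbers which are mine rather than Edgington’s: suppose Jill spreads her credence evenly over integer fuel levels from 1 to 100 miles’ worth, and suppose, as in the reasoning above, that similarity goes by closeness of mileage. Then “if it goes for no more than 50 miles, it’ll go for exactly 50 miles” comes out true at every world where the level is 50 or above, i.e. at 51 of the 100 levels, so Jill’s credence in it should be around 0.51. The 3-mile version comes out true at the 98 levels of 3 or above, giving credence around 0.98, even though she assigns just 1/100 each to the levels 3, 2 and 1.)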

Of course, that’s only to take into account the comparative similarity of worlds in determining the ordering, and Stalnaker and Nolan have the distorting factor to appeal to: worlds that are incompatible with something we presuppose/know to be true can be pushed further out. But it doesn’t seem in this case that anything relevant is being presupposed/known.

I don’t think this objection works. To see that something is going wrong, notice that the argument, if successful, would work against other theories too. Consider, for example, Stalnaker’s theory of the counterfactual conditional. Take the case as before, but suppose we’re a day later and Jill doesn’t know how far Jane drove. Consider the counterfactual “Had it stopped after no more than 50 miles, it’d have gone for exactly 50 miles”. By the previous reasoning, the most similar antecedent-worlds to the over-50 worlds are the exactly-50 worlds; so we should be half confident of the truth of that conditional. Varying the numbers, we should be almost sure that “If it had gone no more than 3, it’d have gone exactly 3”, despite regarding fuel levels of 3, 2 and 1 as equally likely. But these all seem like bizarre results.

Moral: the counterfactual ordering of worlds isn’t fixed by the kind of similarity that Edgington appeals to: the sort of similarity whereby a world in which the car stops after 53 miles is more similar to one in which the car stops after 50 miles than to one in which the car stops after 3 miles. Of course, in some sense (perhaps an “overall” sense) those similarity judgements are just right. But we know from the Fine/Bennett cases that the sense of similarity that supports the right counterfactual verdicts can’t be this sort of all-in similarity (those cases concern counterfactuals starting “if Nixon had pushed the nuclear button in the 70’s…”; all-in similarity arguably says that the closest such worlds are ones where no missiles are released, leading to the wrong results).

Spelling out what the right notion of similarity is is tricky. Lewis gave us one recipe. In effect, we look for a little miracle that’ll suffice to let the counterfactual world diverge from actual history to bring about the antecedent. Then we let events run on according to actual laws, and see what happens. So in worlds where the tank is full, say, let’s look for the little miracle required to make it run for no more than 50 miles, and run things on. What are the plausible candidates? Perhaps Jane decides to take an extra journey yesterday, or forgets to fill up her car two days ago. Small miracles could suffice to get us into those sorts of worlds. But those sorts of divergences don’t really suggest that she’ll end up with exactly 50 miles’ worth of fuel in the tank, and so this approach undermines the case for “had it gone for at most 50 miles, it would have gone for exactly 50” being true in antecedent-false worlds. (Which is a good thing!)

If that’s the right thing to say in the counterfactual case, the indicative case too will be sorted. For it’s designed to be a case where presuppositions/knowledge don’t have a relevant distorting effect. And so, once more, the case for “If the car goes for at most 50, then it’ll go for exactly 50” doesn’t work.

I think that the basic interest of push-worlds theories of indicatives like Stalnaker’s and Nolan’s is to connect up the counterfactual and indicative orderings: whether there’s anything informative to say about the counterfactual ordering of worlds itself is an entirely different matter. So if the glosses of the position lead to problems, it’s best to figure out whether the problems lie with the gloss of the counterfactual ordering (which should then be assessed in connection with that familiar and worked-through literature) or with the push-worlds maneuver itself (which has, I think, been less fully examined). I think Edgington’s objection is really connected with the first facet, and I’ve tried to say why I think a more detailed theory will make the problem dissolve. But even if it did turn out to be a problem, the push-worlds thesis itself is still standing.

(Incidentally, I do think Edgington’s setup (which she attributes to a student, James Studd) has wider interest. It looks to me like Jackson’s modal theory of counterfactuals, and Davis’ modal theory of indicatives, both deliver the wrong results in this case.)

[Actually, now I’ve written this out, it strikes me that maybe the anti-Stalnaker argument is fixable. The trick would be to specify the background state of the world so as to make the result for counterfactual probabilities seem plausible, but such that (given Jill’s ignorance of the background conditions) the indicative probabilities still seem wrong. So maybe the example is at least a recipe for a counterexample to Stalnaker, even if the original case is resistible as described.]