Barker on worlds-semantics for counterfactuals

In a forthcoming paper in Nous, Stephen Barker argues that embedding phenomena mean we should give up the ambition to give an account of the truth-conditions of counterfactuals in terms of possible worlds. Barker identifies an interesting puzzle, but it is obscured by the overly strong claims he makes for it.

Let’s suppose that we have a shuttered window, and Jimmy with a bunch of stones, which (wisely) he leaves unthrown. The counterfactual “Had Jimmy thrown the stone at the window, it would have hit the shutters” is true. But it isn’t necessarily true. For example, if the shutters had been opened a few seconds ago, then had Jimmy thrown the stone, it would have sailed through. Let’s write the first counterfactual as (THROW>SHUTTERS), and the claim we’ve just made as (OPEN>(THROW>SAILEDTHROUGH)). This is a true counterfactual with a counterfactual as consequent, capturing one way in which the latter is contingent on circumstance.

Here is Barker’s puzzle, as it impacts David Lewis’s account of counterfactuals. On Lewis’s account (roughly) a counterfactual is true iff all the closest worlds making the antecedent true also make the consequent true. So for the double counterfactual above to be true, (THROW>SAILEDTHROUGH) must be true at all the closest OPEN-worlds.
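In symbols: writing f(A, w) for the set of closest A-worlds to w (my notation, suppressing Lewis’s refinements for the case where there are no closest worlds), the rough clause and its application to the double counterfactual are:

\[
A > C \text{ is true at } w \iff \forall w' \in f(A, w):\ C \text{ is true at } w'
\]
\[
\mathrm{OPEN} > (\mathrm{THROW} > \mathrm{SAILEDTHROUGH}) \text{ is true at } @ \iff \forall w \in f(\mathrm{OPEN}, @):\ \mathrm{THROW} > \mathrm{SAILEDTHROUGH} \text{ is true at } w
\]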

But Lewis also told us something about what makes for closeness of worlds. Cutting a long story short, for the “forward tracking” cases of counterfactuals, we can expect the closest OPEN worlds to exactly match the actual world up to a few seconds ago, whereupon some localized event occurs which is inconsistent with the laws of the actual world, and which (deterministically) leads to OPEN obtaining. After this “small miracle”, the evolution of the world continues in accordance with the actual laws. Let’s pick a representative such world, W. (For later reference, note that both THROW and SAILEDTHROUGH are false in W—it’s just a world where the shutters are opened, but Jimmy’s stone never moves.)

Barker asks a very good question: is THROW>SAILEDTHROUGH true at W, on Lewis’s view? For this to be so, we need to look at the closest THROW worlds to W. What would such a world be? Well, as before, we look for a world that exactly matches W up until shortly before THROW is supposed to occur—which then diverges from W by a small miracle (relative to the laws of nature in W) in such a way as to bring about THROW, and which from then on evolves in accordance with W’s laws.

But what are W’s laws? Not the actual laws—W violates those. Maybe it has no laws? But then all sorts of crazy evolutions will be treated as “consistent with the laws of W”, and so we’d have no right to assume that in all such evolutions SAILEDTHROUGH would be true. Maybe it has all of the actual laws except the specific one that we needed to violate? But still, if e.g. the actual law tying gravitational force to the masses and the inverse square of their separation is violated at W, then removing this from the books allows in all sorts of crazy trajectories correlated with tailored gravitational forces—and again we have no right to think that on all legal evolutions SAILEDTHROUGH will be true. Net result, suggests Barker: we should assume, absent further argument, that THROW>SAILEDTHROUGH is false at W; and hence OPEN>(THROW>SAILEDTHROUGH) is false. The same recipe can be used to argue for the falsity of all sorts of intuitively true doubly-embedded counterfactuals.

I’ve presented Lewis’s theory very loosely. But the problem survives translation into his more precise framework. The underlying trouble that Barker has identified is this: Lewis’s account of closeness relies on two respects of similarity to the relevant “base” world: matching the distribution of properties in the base world, and fitting the laws of the base world. But because the “counterfactually selected” worlds violate the actual laws, there’s a real question mark over whether the second respect of similarity has any teeth when taken with respect to the counterfactually selected world, instead of actuality.

This is a nice puzzle for one very specific theory. But does it really show that the worlds approach is doomed? I think that’s very far from the case. Note first that giving a closest-worlds semantics is separable from giving an analysis of closeness in terms of similarity, let alone the kind of analysis that Lewis favoured. So Barker’s problem simply won’t arise for many worlds-theorists. Thus, I think, Barker’s point is best presented as a problem for one who buys the whole Lewisian package. My second observation is that Lewis himself has the resources to resist Barker’s argument, thanks to his Humean theory of laws. And if we’ve gone so far as to buy the whole Lewisian package on counterfactuals, it won’t be surprising if we’ve also committed ourselves to some of the other Lewisian views.

The simple observation that lies behind the first point is that the bare closeness-semantics for counterfactuals does not get us anywhere near talk of miracles, violations of law and the rest. Indeed, to generate the logic of counterfactuals (and thus get predictions about the coherence of various combinations of beliefs with counterfactual content) we do not even need to appeal to the notion of an “intended interpretation” of closeness—we could treat it purely algebraically. If we do believe that one interpretation gives the actual truth conditions of counterfactuals, there’s nothing in principle to stop us treating closeness as primitive (Stalnaker, for example, seems to adopt this methodological stance), or even giving an explicit definition of closeness in counterfactual terms. Insofar as you wanted a reduction of counterfactuality to something else, this’d be disappointing. But whether such a reduction is even possible is contentious; and even if it is, it’s not clear that we should expect to read it off our semantics.

So the algebraists, the primitivists, and the reverse analyzers can all buy into worlds-semantics for counterfactuals without endorsing anything so controversial as Lewis’s talk of miracles. Likewise, it’s really not clear that anyone going in for a reduction of closeness to something else needs to follow Lewis. Lewis’s project is constrained by all sorts of extrinsic factors. For example, it’s designed to avoid appeal to de jure temporal asymmetries; it’s designed to avoid mention of causation; and so forth. Connectedly, the laws Lewis considers are micro-laws, allowing him to focus on the case of determinism as the paradigm. But what about the (indeterministic) laws of statistical mechanics? If fit with the probabilistic laws of statistical mechanics, as well as fit with deterministic laws, plays a role in determining closeness, the game changes quite markedly. So there are all sorts of resources for the friend of illuminating reductions of closeness; it’d be a positive surprise if they ended up using only the same sparse resources Lewis felt forced to.

Can we get a version of the dilemma up and running even without Lewis’s particular views? Well, here’s a general thought. Supposing that the actual micro-laws are deterministic, every non-duplicate of actuality is either a universal-past-changer (compared to actuality) or a law-breaker (with respect to actuality). For if the laws are deterministic, then worlds that match on any temporal segment match simpliciter. Now consider the “closest OPEN worlds to actuality” as above. We can see a priori that this set contains either past-changers or law-breakers (or possibly both). Lewis, of course, set things up so that the latter possibility is realized, and this is what led to Barker’s worries. But if universal-past-changers can’t be among the closest OPEN-worlds, then we’ll be forced to some sort of law-breaking conception.

What’s wrong with past-changing, then? Well, there are some famous puzzle cases if the past is wildly different from the present. Bennett’s “logical links” worry, for example, concerns counterfactuals like the following: “If I’d’ve been fitter, I’d have made it to the top of the hill where the Romans built their fort many years ago”. Here, the intuitively true counterfactual involves a consequent containing a definite description, and the description singles out an individual via a past-property. If we couldn’t presuppose that, in the counterfactually nearest world, individuals now around retained their actual past properties, these kinds of counterfactuals would be dodgy. But they’re totally smooth. And it’s pretty easy to see that the pattern generalizes. Given a true counterfactual “If p, then a is G”, and some truth about the past Q, we construct a definite description for a that alludes to Q (e.g. “the hill which is located a mile from where, 1000 years ago, two atoms collided on such-and-such trajectories”). The logical links pattern seems to me the strongest case we have against past-changing. If so, then independently of a Lewisian analysis of closeness, we can see that the closest worlds will be law-breakers, not past-changers.

Two things to note here. First, universal-past-changers need not be macro-past-changers. For all we’ve said, the past-changing worlds in the closest OPEN-set might coincide with actuality when it comes to the distribution of thermodynamic properties over small regions of spacetime (varying only on the exact locations etc. of the microparticles). For all we’ve said, there may be worlds in the closest set with the same macro-past as actuality that are entirely legal. If we have to give up logical-links style cases, but only where the descriptions involve microphysics, then that’s a surprise but perhaps not too big a cost—ordinary counterfactuals (including those with logical links to the macro-past) would come out ok. I’m not sure I’d like to assert that there are such past-changing macro-past duplicators around to play this role; but I don’t see that philosophers are in a position to tell from the armchair that there aren’t any—which is bad news for an aprioristic argument that the worlds-semantics is hopeless. Second, even if we could establish that law-breaking is the only way to go, that doesn’t yet suffice to generate Barker’s argument. Once we have that W (our representative closest OPEN world) is a law-breaker, the argument proceeds by drawing out the consequences of this for the closest THROW worlds to W. And that step of the argument just can’t be reconstructed, as far as I can see, without appeal to Lewis’s analysis itself. The primitivists, reverse-analysts, alternative-analysts and the like are all still in the game at this point.

So I think that Barker’s argument really can’t plausibly be seen as targeting the worlds-semantics as such. Indeed, I can’t see that it has any dialectical force against Stalnaker, who, to say the least, looms large in this literature.

But what if one insisted that the target of the criticism is not these unspecified clouds of worlds-theorists, but Lewis himself, or at least those who (perhaps unreflectively) go along with Lewis? To narrow the focus in this way would mean cutting out many of the methodological morals that Barker attempts to draw from his case. But it’s still an interesting conclusion—after all, many people do hand-wave towards the Lewisian account, and if it’s in trouble, perhaps philosophers will awaken from their dogmatic slumbers and realize the true virtues of the pragmatic metalinguistic account, or whatever else Barker wishes to sell us.
I think this really is where we should be having the debate; the case does seem particularly difficult for Lewis. What I want to argue, however, is that Barker’s arguments against the fallback options, though they may convince others, have little force against Lewis and the Lewisians.

The first fallback option that Barker considers is minimal law-modification. This is where we postulate that W has laws that look very similar to those of the actual world—except that they allow for a one-off, specified exception. Suppose that the only violation of @’s laws in W occurs localized in a tiny region R. Then if some universal generalization (for all x, y, z: P) is a law of @, the corresponding W-law will be: (for all x, y, z that aren’t in region R: P). If we want to get more specific, we could add a conjunct saying what *should* happen in region R.
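Schematically (my regimentation, writing \(\mathrm{In}_R\) for being located in the miracle region R):

\[
\text{@-law:}\ \forall x\, \forall y\, \forall z\, P \qquad \leadsto \qquad \text{W-law:}\ \forall x\, \forall y\, \forall z\, \big( \neg \mathrm{In}_R(x,y,z) \to P \big)
\]

with, on the more specific option, an extra conjunct \(Q_R\) recording what goes on inside R.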

Barker rightly reminds us of Lewis’s Humeanism about laws—what it is to be a law in a world is to be a theorem of the best (optimally simple/informative) system of truths in that world. In the case at hand, it seems that the sort of hedged generalizations just given will be part of such a best system—what’s the alternative? The original non-hedged generalization isn’t even true, so on the official account it isn’t in the running to be a law. The hedged generalization is admittedly a little less simple than one might like—though not by much—and leaving it out gives up so much information that it’s hard to see how a system of axioms without the hedged generalization could end up beating a system including it for “best system”. Whether to go for the version that is silent about the behaviour in R, or the one that spells it out, turns on delicate issues, about whether the increased strength is worth the increased complexity, that I don’t think we have to deal with here (if I had to bet, I’d go for silence over specification). Lewis’s account, I think, predicts that minimal mutilation laws are the laws of W. If this is right, then Barker’s argument against this fallback option is inter alia an objection to Lewis’s Humean account of laws. So what is his objection?

Barker’s objection to such minimal mutilation laws is that they don’t fit with “the basic explanatory norms of physical science”. Either we have no explanation for why the regularity has a kink at region R, or we have an explanation, but it alludes to region R itself. But if there’s “no explanation”, then our law isn’t a real law; and if we have to allude to region R, we are assigning physical (explanatory) significance to particularity, which goes against practice in physical science.

As a Humean, Lewis is up to his ears in complaints from the legal realists that his laws don’t “explain” their instances. So complaints that the laws of W aren’t “real” because they’re not “explanatory” should send up danger signals. Now, on the option where we build what happens at R into the W-laws, we at least have a minimal kind of explanation—provability from a digested summary of what actually happens. That’s not a terribly interesting form of explanation, but it’s all we have even in the best case. If the W-laws simply leave a gap, we don’t even have that (though of course, we do have the minimal “explanation” for every other region of space). But who’s worried? It’s no part of Lewis’s conception of laws that they explain things in this sense. Indeed, if you look through his overall programme of vindicating Humean supervenience (exactly the programme, by the way, that motivates the severe constraints on the analysis of counterfactuals), you’ll see that the theoretical role for laws is essentially exhausted by fixing the truth-conditions of standard counterfactuals—of course, counterfactuals are used all over the place (not least in articulating Lewis’s favoured view of what it is for one proposition to explain another). And the W-laws seem to do that job pretty well. Another way to put this: for Lewis’s purposes of analyzing closeness, verbal disputes about what is or isn’t a “law” aren’t to the point. Call them Lewis-laws, if you like, stipulatively defined via the best system analysis. If Lewis-laws, Lewis-miracles and the rest successfully analyze closeness, then (assuming that the rest of Lewis’s account goes through) we can analyze closeness ultimately in terms of the Humean supervenience base.

So I think Lewis (and other Humeans) wouldn’t find much to move them in Barker’s discussion. And—to repeat the dialectical point from earlier—if we’re not Humeans, it’s not clear what the motivation is for trying to analyze closeness in the ultra-sparse way Lewis favours.

But the points just made aren’t entirely local to Lewis and the Lewisians. Suppose we bought into a strong thesis that laws must explain their instances, and on this basis both rejected Humeanism and agreed with Barker that the hedged generalizations aren’t laws in W. Lewis’s analysis of counterfactual closeness now looks problematic, for exactly the reasons Barker articulates. One reaction, however, is simply to make a uniform substitution in the analysis. Wherever Lewis wrote “law”, we now write “Lewis-law”. Lewis-laws aren’t always laws, we’ll now say, since they don’t explain their instances. Indeed, the explanation for why we have the regularities constitutive of Lewis-laws in the actual world is that the real laws are there, pushing and pulling things into order. But Lewis-laws, rather than genuine laws, are what’s needed to analyze closeness, as the Barker cases show us.

To conclude. We should distinguish the general family of worlds-semantics from the particular analytic project that Lewis was engaged in. Among the family are many different projects, of varying ambition. Barker’s arguments are particular to Lewis’s own proposal. And once we’re dealing with Lewis himself, Barker’s “fallback” position of minimal mutilation laws turns out to be predicted by Lewis’s own Humeanism—and Barker’s objections are simply question-begging against the Humean. Finally, since even a non-Humean can avail themselves of “Lewis-laws” for theoretical purposes whenever they prove useful, there is an obvious variant of the Lewis analysis that, for all that’s been said, we can all agree to. Worlds accounts of counterfactuals, in all their glorious variety, are alive and well.

Events and processes

I’ve been reading up on Lewis on causation, and in particular on the account of events he uses. The big thing that his metaphysics of events delivers is a way of getting rid of spurious causal dependence. I say hello, in the course of saying hello abruptly and loudly. I go for a walk, in the course of myself and my girlfriend going for a walk. These patterns of events arise not because one event causes the other, but because one event is part of the other. Pairs of events like these can nevertheless stand in relations of counterfactual dependence. Suppose I say hello abruptly and loudly. Had I not said hello, then I wouldn’t have said hello abruptly and loudly. So Lewis says: causal dependence between events is not just a matter of counterfactual dependence; it’s counterfactual dependence between distinct events (events that don’t share a part).

When you dig into his metaphysics of events, you see that two notions of parthood are in play. One is broadly logical: event E is a logical part (an “l-part”) of event F if E’s occurring in region R entails that F occurs in region R.

A second notion of parthood he uses is spatio-temporal: event E is an st-part of event F if, necessarily, if F occurs in region R, then E occurs in some subregion of R. Saying hello abruptly is an l-part of saying hello; my walking is an st-part of myself and my girlfriend walking.

But this still doesn’t cover all cases. Consider Trafalgar and the Napoleonic war. Intuitively, that battle is part of the war (and not caused by the war, though caused by its earlier parts). But it’s not an l-part, since the region in which the war occurs is more extensive than the region in which the battle occurs. And it’s not an st-part, since the war could have been completed before Trafalgar happened. So Lewis defines up an “accidental” variant of spatio-temporal parthood between occurrent events: E is an a-part of F iff E and F are occurrent, and there’s an occurrent l-part x of E, which is an st-part of some occurrent y which is an l-part of F. I take it the idea is that as well as the Napoleonic war, there’s another event, the Napoleonic-war-as-it-happened, that is an l-part of the Napoleonic war; and there’s also Trafalgar-as-it-happened, which is an l-part of Trafalgar. And the latter is an st-part of the former; hence, derivatively, Trafalgar is an a-part of the war.
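To have the three notions side by side (my regimentation of the glosses above):

\[
\begin{aligned}
E \text{ is an l-part of } F &\iff \text{for any region } R\text{: } E\text{’s occurring in } R \text{ entails that } F \text{ occurs in } R\\
E \text{ is an st-part of } F &\iff \text{necessarily: if } F \text{ occurs in } R\text{, then } E \text{ occurs in some subregion of } R\\
E \text{ is an a-part of } F &\iff E, F \text{ are occurrent, and for some occurrent } x, y\text{: } x \text{ is an l-part of } E\text{, an st-part of } y\text{, and } y \text{ is an l-part of } F
\end{aligned}
\]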

(Some notes on the interrelation of these notions: if E and F are occurrent, and E is an l-part of F, then it’s an a-part of F (take x=E, y=E). And if E is an st-part of F, then it’s an a-part of F (take x=E, y=F). Rather weirdly, note that when E is an l-part of F, then wherever E occurs, F occurs in an (improper) subregion. Hence F is an st-part of E. And so by the above, if they’re occurrent, we’ll have F an a-part of E. That is, when E is an l-part of F, and both are occurrent, then E and F are a-parts of each other (though of course they may still be distinct).)

Lewis’s requirement that events be “distinct” in order to be candidates for causing one another is that they don’t share a common part in any of these senses.

Lewis notes several times that this would be way too strong a constraint if we allowed events with very rich essences—I’m interested in what this tells us about what sorts of events we can think are hanging around.

Ok: so here is my puzzle. Here’s a first shot—an objection which is plausible but mistaken. Right now, a ball drops, and hits the floor. Consider the conjunctive event or “process” of the ball dropping and hitting the floor. Now (here comes the fallacy) doesn’t this event imply that the ball drops? And so doesn’t that mean the process is an l-part of the ball dropping, and likewise of the ball hitting the floor? But if so, then these two events wouldn’t be distinct, and so couldn’t stand in causal relations. It would be impossible to have a conjunctive process, whose constituents were causally interrelated.

That worried me for a bit, but I reckon it’s not a problem. Necessarily, the region *in* which the dropping-and-hitting-the-floor occurs is a region *within* which the dropping occurs; but it’s not a region *in* which the dropping occurs. “In” works like exact location; an event then occurs *within* any region that contains a region it occurs *in*. But it’s only when every region in which the first occurs is a region *in* which the second occurs that we have implication, or l-parthood. What we have here is just st-parthood, running in the direction you’d have imagined—from constituents to process rather than vice versa.

So that exact puzzle isn’t an objection to Lewis; but I suspect he’s escaped on a technicality, and the underlying trouble with processes will rearise if we tweak the example. Lewis allows for colocated events—and allows that they may stand in causal relations. He contemplates a battle of invisible goblins having causal influence on the progress of the AAP conference with which it’s colocated. More seriously, he thinks the presence of an electron in an electric field might cause its acceleration. But the location of the electron and its acceleration are colocated events. And in examples of this kind, we really are in trouble if we allow for the conjunctive “process”—the electron-being-so-located-and-accelerating. For necessarily, wherever we have that process in a given region, we have the acceleration *in that region*. So the process is an l-part of the acceleration. Likewise for the locatedness of the electron. But then the two events share a part, and are not distinct—so they couldn’t cause one another!

The trouble for Lewis will arise if we both allow (i) cause and effect to be located in the same region; and (ii) the existence of a “process” encompassing both cause and effect. Lewis says he wants to allow (i); and denying the existence of conjunctive events/processes in (ii) looks unprincipled if we allow them in parallel cases (where the ball drops to the floor). So I conclude there’s pressure on Lewis to rule out conjunctive events/processes across the board.

Loewer on laws

In “Laws and Natural Properties” (Philosophical Topics 2007—I can’t find an online copy to link to) Barry Loewer argues we should divorce Lewis’s Humean account of laws from its appeal to natural properties.

The basic Lewisian idea is something like this. Take all the truths about world w describable in a language NL whose basic predicates pick out perfectly natural properties. There are various deductive systems with true theorems, formulated in this language. Some are simpler than others, some are more informative. The best system optimizes simplicity and strength. The laws are the generalizations, equations, or whatever, entailed by this best system. (This is the basic case—his distinctive treatment of chance requires some tweaks to the setup).

Why the focus on NL? Why not look at any old system in whatever language you like, and pick the simplest/most informative? Lewis worries that the account would then trivialize. Consider the language with a basic predicate F that is interpreted as “being such that T is true”. The single axiom “(Ex)Fx” is then, thinks Lewis, maximally simple, and since its entailments are the same as T’s, it’s just as informative as T. So simplicity would be no constraint at all, with an appropriate choice of language. What NL does is provide a level playing field: we force the theories to be presented in a common base language, which allows us to compare their complexity fairly.

Loewer notes that the above argument seems pretty questionable. Sure, “informativeness” might be understood just in terms of the modal entailments of the theory—roughly, a theory is more informative the smaller the region of logical space at which it is true. But is that the right way to understand informativeness? After all, a sensible-seeming physical theory can be applied to some description of a physical situation and produce specific predictions—we can extract a whole range of syntactic consequences of the deductive system relevant to individual situations. Isn’t something like this what we’re after?

Loewer thinks that the right way to extend the Humean project is to take Lewis’s “simplicity and strength” as placeholders for whatever those virtues are that the scientific tradition does in fact value. So he thinks that minimally, if we’re evaluating theories for informativeness, “the information in a theory needs to be extractable in a way that connects with the problems and matters that are of scientific interest”.

I’m not quite sure I understand the next move in the paper. Loewer moves on to say: “Lewis’s argument does show that [Humeanism about laws] requires a preferred language”. That’s a bit surprising, given the above! He goes on to identify the language as scientific English, SL, or its proper successors, SL+. Now, one way to read this is that Loewer is here restricting the languages in which the competing theories can be formulated—not to NL, as Lewis did, but to SL or any of the SL+. If we took this line, we could stick with Lewis’s original modal understanding of informativeness, I guess; trivialization is ruled out by the same basic Lewisian strategy.

There’s a different way of understanding what’s going on, though (and maybe it’s what Loewer intends). This is to think that the way we should evaluate the informativeness of T is in terms of “truths” that are extractable (logically entailed, for example) from T—the truths that constitute the answers to “problems and matters of scientific interest”. But these truths have to be formulated in a particular language—that’s the cost of the shift from modal characterizations of informativeness to broadly linguistic ones. So as well as the question of what language the theory is in, there’s also the question of the language for presenting the data against which the theory’s virtues are evaluated. There’s nothing that requires the two languages to coincide, and we could insist on a particular formulation of the data-language, while leaving open the theory-language (of course, if the data is to be extractable from the theory in a syntactic sense, then we probably need to add a bunch of coordinative definitions to the theory to link the two vocabularies).

One nice thing about the second way of going is that we don’t have to build in the assumption that the One True system of laws is humanly understandable, or that scientific English or its successors will be adequate to formulate the laws. The first way (where laws are to be formulated in SL+) requires a certain kind of optimism about the cognitive tractability of the underlying explanatory patterns in a world. Lewis’s original theory didn’t require this optimism—NL immediately picks out the fundamental structure of whatever world we’re concerned with, whether or not inhabitants of that world are in a position to figure out what those fundamentals are. Maybe we feel entitled to be optimistic about the actual world—but the Humean account is supposed to apply to arbitrary possible worlds, and surely there are some possible situations out there where SL+ won’t cut it, and some other vocabulary would be called for.

So I prefer the second interpretation of Loewer’s proposal, on which SL+ is the data-language, but the language of theory could be quite different. This suffices, I think, to rebut Lewis’s worry about trivialization. But it allows that in some scenarios, the best system explaining homely facts is itself quite alien.

A halfway-house between this version of Humeanism and Lewis’s would have the data-language be NL rather than SL+, but allow the language of the final theory to vary. The obvious advantage of this is that it removes the dependence on the contingencies of our scientific language in fixing the laws of arbitrary worlds—strange alien possibilities filled with protoplasm or whatever just might not have a very interesting description in the terms of a language developed in response to our actual situation. Appealing to NL for the data-language tailors informativeness to a description of the world appropriate to the basic features of that world, rather than using one developed in response to the world we happen to find ourselves in.

Let’s consider an example. Suppose that the natural properties are Fieldian, rather than Lewisian. The fundamental features of the world are relations like congruence and betweenness (and similar) that fix the spatio-temporal structure of the world and the mass distribution across it. Now, Field’s “nominalized physics” aims to articulate versions of the standard Newtonian equations in this setting—without appeal to standard resources such as the relation of “having mass of x kg” which brings in appeal to abstracta. Field thinks this “synthetic” formulation should appeal even to those who do not share his qualms about the existence of numbers. Let’s suppose we take his proposal in this spirit, so whatever other problems there may be with the mathematized physics, the worry isn’t that it’s false.

Are the usual mathematized Lagrangian formulations of Newtonian mechanics laws in this Fieldian world? On the original Lewisian proposal about laws, the best system should be formulated in perfectly natural terms—which here means the Fieldian synthetic relations. The natural thought is that the Fieldian nominalistic formulation wins this competition, and its deductive consequences won’t include the usual mathematized equations. So, presumably, the mathematized Lagrangian equation won’t be a law. On the other hand, if we go for either of the tweaked versions above, our candidates for “best theory” needn’t be given in this metaphysically privileged vocabulary. Given appropriate coordinating links between the vocabularies, the standard mathematized formulations will entail all the data about mass-congruence and the rest, and so count as informative about the Fieldian data (whether that data is formulated in the Fieldian NL or in SL). And (you might argue) going this way enables gains in simplicity, making it the winner in the fight for best theory. So the usual, mathematically laden Lagrangian may yet be a law. Likewise, a Hamiltonian formulation of mechanics could still be the winner in the race for best theory, and the Hamiltonian equation a law, without us having to claim that it is the simplest around when formulated in the perfectly natural, synthetic terms. More generally, we’re liberated to argue that the basic principles of statistical mechanics should feature in the winning theory, even if their terms are a long way from perfectly natural—so long as they add enough information about (for example) the synthetic perfectly natural truths to justify the extra complexity of adding them in.

Some of the use that the Lewisian account of laws is put to goes over more smoothly, I think, if the data-language is NL rather than SL. Lewis famously wanted to use the Humean framework to help understand chance. His underlying metaphysics had no primitive chances—simply a distribution of particular outcomes (e.g. there’s an atom at one location, the results of its decay at the next, and a particular statistical distribution among events of this type across space-time, but no primitive “propensity” relating the tokens). On the original account, Lewis liberalized his requirements for the vocabulary of candidate theories, allowing an initially uninterpreted chance operator. Given an appropriate understanding of the “fit” between a chancy theory and a non-chancy world, he thought that chancy theories would win the battle of simplicity and informativeness, grounding chancy laws and thereby the truth of chance talk.

It becomes somewhat tricky to replicate this idea if the data-language is construed to be SL+, as Loewer suggests. Take a world that’s set up with GRW quantum mechanics, with primitive chancy evolution of the wave function. Now, presumably SL+ contains chance talk, and so the data against which theories are to be measured for informativeness includes truths about chance. The original idea was that we could characterize, non-circularly, what made a chance-invoking scientific theory “selected”. But now it turns out that one of the ingredients of selection—informativeness—requires appeal to chance. If the data-language in question were NL rather than SL, we wouldn’t face this obstacle.

Overall, I’m not attracted to the version of Humeanism where competitors for best theory must be formulated in SL or SL+—it seems excessively optimistic to think that the laws of a wide enough range of worlds can be formulated in these terms. The version where we appeal to SL+ only in evaluating theories for informativeness looks much more promising. Even so, I’m not sure what we gain from appealing to SL+ rather than NL in the evaluation. Sure, if you were sceptical about appeal to the perfectly natural in the first place, you might be attracted to this as a decent fallback. But otherwise I don’t see what speaks in favour of it.

Psychology without semantics or psychologically loaded semantics?

Here’s a naive view of classical semantics, but one worth investigating. According to this view, semantics is a theory that describes a function from sentences to semantic properties (truth and falsity), relative to given possible circumstances (“worlds”). Let’s suppose it does this via a two-step method. First, it assigns to each sentence a proposition. Second, it assigns to each proposition a function from worlds to {True, False} (“truth conditions”).
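Schematically (my regimentation of the two steps):

\[
\llbracket \cdot \rrbracket : \text{Sentences} \to \text{Propositions}, \qquad
\mathrm{tc} : \text{Propositions} \to \big( \text{Worlds} \to \{\text{True}, \text{False}\} \big)
\]

with the truth value of sentence S at world w being \(\mathrm{tc}(\llbracket S \rrbracket)(w)\).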

Let’s focus on the bit where propositions (at a world) are assigned truth-values. One thing that leaps out is that the “truth values” assigned have significance beyond semantics. For propositions, we may assume, are the objects of attitudes like belief. It’s natural to think that in some sense, one should believe what’s true, and reject what’s false. So the statuses attributed to propositions as part of the semantic theory (the part that describes the truth-conditions of propositions) are psychologically loaded, in that propositions that have one of the statuses are “to be believed” and those that have the other are “to be rejected”. The sort of normativity involved here is extremely externalistic, of course—it’s not a very interesting criticism of me that I happen to believe p, just on the basis that p is false, if my evidence suggested p overwhelmingly. But the idea of an external ought here is familiar and popular. It is often reported, somewhat metaphorically, as the idea that belief aims at truth (for discussion, see e.g. Ralph Wedgwood on the aim of belief).

Suppose we are interested in one of the host of non-classical semantic theories that are thrown around  when discussing vagueness. Let’s pick a three-valued Kleene theory, for example. On this view, we have three different semantic statuses that propositions (relative to a circumstance) are mapped to. Call them neutrally A, B and C (much of the semantic theory is then spent telling us how these abstract “statuses” are distributed around the propositions, or sentences which express the propositions). But what, if any, attitude is it appropriate to take to a proposition that has one of these statuses? If we have an answer to this question, we can say that the semantic theory is psychologically loaded (just as the familiar classical setting was).

Rarely do non-classical theorists tell us explicitly what the psychological loading of the various statuses is. But you might think an answer is implicit in the names they are given. Suppose that status A is called “truth”, and status C “falsity”. Then, surely, propositions that have A are to be believed, and propositions with C are to be rejected. But what of the “gaps”, the propositions that have status B, the ones that are neither true nor false? It’s rather unclear what to say; and without explicit guidance about what the theorist intends, we’re left searching for a principled generalization. One thought is that they’re at least untrue, and so are intended to have the normative role that all untrue propositions had in the classical setting—they’re to be rejected. But of course, we could equally have reasoned that, as propositions that are not false, they’re intended to have the status that all unfalse propositions had in the classical setting—they’re to be believed. Or perhaps they’re to have some intermediate status—maybe a proposition that has B is to be half-believed (and we’d need some further details about what half-belief amounts to). One might even think (as Maudlin has recently explicitly urged) that in leaving a gap between truth and falsity, the theorist intends the gappy propositions to be devoid of psychological loading—that there’s nothing general to say about what attitude is appropriate to the gappy cases.

But notice that these kinds of questions are, at heart, exegetical—that we face them at all just reflects the fact that the theorist hasn’t told us enough to fix which theory is intended. The real insight here is to recognize that differences in psychological loading give rise to very different theories (at least as regards what attitudes to take to propositions), each of which should be considered on its own merits.

Now, Stephen Schiffer has argued for some distinctive views about what the psychology of borderline cases should be like. As John Macfarlane and Nick Smith have recently urged, there’s a natural way of using Schiffer’s descriptions to fill out in detail one fully “psychologically loaded” degree-theoretic semantics. To recap, Schiffer distinguishes between “standard” partial beliefs (SPBs) which we can assume behave in familiar (probabilistic) ways and have their familiar functional role when there’s no vagueness or indeterminacy at issue. But then we also have special “vagueness-related” partial beliefs (VPBs) which come into play for borderline cases. Intermediate standard partial beliefs allow for uncertainty, but are “unambivalent” in the sense that when we are 50/50 over the result of a fair coin flip, we have no temptation to all-out judge that the coin will land heads. By contrast, VPBs exclude uncertainty, but generate ambivalence: when we say that Harry is smack-bang borderline bald, we are pulled to judge that he is bald, but also (conflictingly) pulled to judge that he is not bald.

Let’s suppose this gives us enough for an initial fix on the two kinds of state. The next issue is to associate them with the numbers a degree-theoretic semantics assigns to propositions (with Edgington, let’s call these numbers “verities”). Here is the proposal: a verity of 1 for p is ideally associated with (standard) certainty that p—an SPB of 1. A verity of 0 for p is ideally associated with (standard) utter rejection of p—an SPB of 0. Intermediate verities are associated with VPBs. Generally, a verity of k for p is associated with a VPB of degree k in p. [Probably, we should say for each verity, both what the ideal VPB and SPB are. This is easy enough: one should have VPBs of zero when the verity is 1 or 0; and SPB of zero for any verity other than 1.]
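In one place, then, the ideal-attitude assignment just described (my compact restatement, writing v(p) for the verity of p):

\[
\mathrm{SPB}_{\mathrm{ideal}}(p) = \begin{cases} 1 & \text{if } v(p) = 1\\ 0 & \text{otherwise} \end{cases}
\qquad
\mathrm{VPB}_{\mathrm{ideal}}(p) = \begin{cases} v(p) & \text{if } 0 < v(p) < 1\\ 0 & \text{if } v(p) \in \{0, 1\} \end{cases}
\]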

Now, Schiffer’s own theory doesn’t make play with all these “verities” and “ideal psychological states”. He does use various counterfactual idealizations to describe a range of “VPB*s”—so that e.g. relative to a given circumstance, we can talk about which VPB an idealized agent would take to a given proposition (though it shouldn’t be assumed that the idealization gives definitive verdicts in any but a small range of paradigmatic cases). But his main focus is not on the norms that the world imposes on psychological attitudes, but on norms that concern what combinations of attitudes we may properly adopt—requirements of “formal coherence” on partial belief.

How might a degree theory psychologically loaded with Schifferian attitudes relate to formal coherence requirements? Macfarlane and Smith, in effect, observe that something approximating Schiffer’s coherence constraints arises if we insist that the total partial belief in p (SPB+VPB) is always representable as an expectation of verity (relative to a classical credence distribution over possible situations). We might also observe that the component corresponding to Schifferian SPB within this is always representable as the probability, relative to the same credence distribution, that p has verity 1. That’s suggestive, but it doesn’t do much to explain the connection between the external norms that we fed into the psychological loading, and the formal coherence norms that we’re now getting out. And what’s the “underlying credence over worlds” doing? If all the psychological loading of the semantics does is enable a neat description of the coherence norms, that may have some interest, but it’s not terribly exciting—what we’d like is some kind of explanation of the norms from facts about psychological loading.
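Spelled out, as I read the observation (my notation: Cr is the classical credence over situations, \(v_w(p)\) the verity of p at w):

\[
\mathrm{SPB}(p) + \mathrm{VPB}(p) = \sum_w \mathrm{Cr}(w)\, v_w(p), \qquad
\mathrm{SPB}(p) = \sum_{w\,:\,v_w(p) = 1} \mathrm{Cr}(w)
\]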

There’s a much more profound way of making the connection: a way of deriving coherence norms from psychologically loaded semantics. Start with the classical case. Truth (truth value 1) is associated with certainty (credence 1). Falsity (truth value 0) is associated with utter rejection (credence 0). Think of inaccuracy as a way of measuring how far a given partial belief is from the actual truth value; and interpret the “external norm” as telling you to minimize overall inaccuracy in this sense.

If we make suitable (but elegant and arguably well-motivated) assumptions about how “accuracy” is to be measured, then it turns out that probabilistic belief states emerge as a special class in this setting. Every improbabilistic belief state can be shown to be accuracy-dominated by a probabilistic one—there’s some particular probabilistic state that’ll be necessarily more accurate than the improbabilistic one you started with. No probabilistic belief state is dominated in this sense.
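For definiteness, the standard example of such a measure (Joyce’s own result covers a broader class) is the quadratic “Brier” measure, where \(v_w(p) \in \{0, 1\}\) is the truth value of p at w:

\[
I(b, w) = \sum_p \big( b(p) - v_w(p) \big)^2
\]

The dominance result then has this shape: if b is not a probability function, there is a probability function \(b^*\) with \(I(b^*, w) < I(b, w)\) for every world w; and no probability function is dominated in this way.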

Any violation of the formal coherence norms thus turns out to be needlessly far from the ideal aim. And this moral generalizes. Taking the same accuracy measures, but applying them to verities as the ideals, we can prove exactly the same theorem. Anything other than the Smith-Macfarlane belief states will be needlessly distant from the ideal aim. (This is generated by an adaptation of Joyce’s 1998 work on accuracy and classical probabilism—see here for the generalization.)

There’s an awful lot of philosophy to be done to spell out the connection in the classical case, let alone its non-classical generalization. But I think even the above sketch gives a view on how we might not only psychologically load a non-classical semantics, but also use that loading to give a semantically-driven rationale for requirements of formal coherence on belief states—and with the Schiffer loading, we get the Macfarlane-Smith approximation to Schifferian coherence constraints.

Suppose we endorsed the psychologically-loaded, semantically-driven theory just sketched. Compare our stance to that of a theorist who endorses the psychology without the semantics—that is, who endorses the same formal coherence constraints, but disclaims commitment to verities and their accompanying ideal states. They thus give up on the prospect of giving the explanation of the coherence constraints sketched above. But we and they would agree on what kinds of psychological states are rational to hold together—including what kind of VPB one could rationally take to p when you judge p to be borderline. So both sides could agree on the doxastic role of the concept of “borderlineness”, and in that sense give a psychological specification of the concept of indeterminacy. We and they would be allied against rival approaches—say, the claims of the epistemicists (thinking that borderlineness generates uncertainty) and Field (thinking that borderlineness merits nothing more than straight rejection). The fan of psychology-without-semantics might worry about the metaphysical commitments of his friend’s postulate of a vast range of fine-grained verities (attaching to propositions in circumstances)—metasemantic explanatory demands and higher-order-vagueness puzzles are two familiar ways in which this disquiet might be made manifest. In turn, the fan of the psychologically loaded, semantically driven theory might question his friend’s refusal to give any underlying explanation of the source of the requirements of formal coherence he postulates. Can explanatory bedrock really be certain formal patterns amongst attitudes? Don’t we owe an illuminating explanation of why those patterns are sensible ones to adopt? (Kolodny mocks this kind of attitude, in recent work, as picturing coherence norms as a mere “fetish for a certain kind of mental neatness”.) That explanation needn’t take a semantically-driven form—but it feels like we need something.

To repeat the basic moral. Classical semantics, as traditionally conceived, is already psychologically loaded. If we go in for non-classical semantics at all (with more than instrumentalist ambitions in mind), we underspecify the theory until we’re told what the psychological loading of the new semantic values is to be. That’s one kind of complaint against non-classical semantics. It’s always possible to kick away the ladder—to take the formal coherence constraints motivated by a particular elaboration of this semantics, and endorse only those, without giving a semantically-driven explanation of why these constraints in particular are in force. Repeating this stance, we can find pairs of views that, while distinct, are importantly allied on many fronts. I think in particular this casts doubt on the kind of argument that Schiffer often sounds like he’s giving—i.e. arguing from facts about appropriate psychological attitudes to borderline cases, to the desirability of a “psychology without semantics” view.

Intuitionism and truth value gaps

I spent some time last year reading through Dummett on non-classical logics. One aim was to figure out what sorts of arguments there might be against combining a truth-value gap view with intuitionistic logic. The question is whether in an intuitionist setting it might be ok to endorse ~T(A)&~T(~A). (The characteristic intuitionistic feature, hereabouts, is a refusal to assert T(A)vT(~A)—which is certainly weaker than asserting its negation. Indeed, when it comes to the law of excluded middle, the intuitionist refuses to assert Av~A in general, but ~(Av~A) is an intuitionistic inconsistency.)
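To see that last claim, here is the standard derivation, using only intuitionistically acceptable moves:

\[
\begin{array}{lll}
1. & \neg(A \lor \neg A) & \text{assumption, for reductio}\\
2. & \quad A & \text{supposition}\\
3. & \quad A \lor \neg A & \lor\text{-introduction, 2}\\
4. & \quad \bot & 1, 3\\
5. & \neg A & \neg\text{-introduction, 2–4}\\
6. & A \lor \neg A & \lor\text{-introduction, 5}\\
7. & \bot & 1, 6
\end{array}
\]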

On the motivational side: it is striking that in Kripke tree semantics for intuitionistic logic, there are sentences such that neither they nor their negation are “forced”. And if we think of forcing in a Kripke tree as an analogue of truth, that looks like we’re modelling truth value gaps.

A familiar objection to the very idea of truth-value gaps (which appears early on in Dummett—though I can’t find the reference right now) is that asserting the existence of truth-value gaps (i.e. endorsing ~T(A)&~T(~A)) is inconsistent with the T-scheme. For if we have “T(A) iff A”, then contraposing and applying modus ponens, we derive from the above ~A and ~~A—contradiction. However, this does require the T-scheme, and you might let the reductio fall on that rather than on the denial of bivalence. (Interestingly, Dummett in his discussion of many-valued logics talks about them in terms of truth-value gaps without appealing to the above sort of argument—so I’m not sure he’d rest all that much on it.)
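Spelled out, the derivation runs:

\[
\begin{array}{lll}
1. & \neg T(A) \,\&\, \neg T(\neg A) & \text{the gap claim}\\
2. & A \to T(A) & \text{T-scheme, right-to-left}\\
3. & \neg A & \text{1, 2, contraposition and modus ponens}\\
4. & \neg A \to T(\neg A) & \text{T-scheme for } \neg A\\
5. & \neg\neg A & \text{1, 4, contraposition and modus ponens}\\
6. & \bot & 3, 5
\end{array}
\]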

Another idea I’ve come across is that an intuitionistic (Heyting-style) reading of what “~T(A)” says will allow us to infer from it that ~A (this is based around the thought that intuitionistic negation says “any proof of A can be transformed into a proof of absurdity”). That suffices to reduce a denial of bivalence to absurdity. There are a few places to resist this argument too (and it’s not quite clear to me how to set it up rigorously in the first place) but I won’t go into it here.

Here’s one line of thought I was having. Suppose that we could argue that Av~A entailed the corresponding instance of bivalence: T(A)vT(~A). It’s clear that the latter entails ~(~T(A)&~T(~A))—i.e. given the claim above, the law of excluded middle for A will entail that A is not gappy.

So now suppose we assert that A is gappy. For reductio, suppose Av~A. By the above, this entails that A is not gappy. Contradiction. Hence ~(Av~A). But we know that this is itself an intuitionistic inconsistency. Hence we have derived absurdity from the premise that A is gappy.
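In compact form:

\[
\begin{array}{lll}
1. & \neg T(A) \,\&\, \neg T(\neg A) & \text{premise: A is gappy}\\
2. & \quad A \lor \neg A & \text{supposition, for reductio}\\
3. & \quad T(A) \lor T(\neg A) & \text{2, LEM-entails-bivalence}\\
4. & \quad \bot & \text{1, 3, by cases}\\
5. & \neg(A \lor \neg A) & \text{reductio, 2–4}\\
6. & \bot & \text{5 is intuitionistically inconsistent (as above)}
\end{array}
\]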

So it seems that to argue against gaps, we just need the minimal claim that LEM entails bivalence. Now, it’s a decent question what grounds we might give for this entailment claim; but it strikes me as sufficiently “conceptually central” to the intuitionistic idea about what’s going on that it’s illuminating to have this argument around.

I guess the last thing to point out is that the T-scheme argument may be a lot more impressive in an intuitionistic context in any case. A standard maneuver when denying the T-scheme is to keep the T-rules: to say that A entails T(A), for example (this is consistent with rejecting the T-scheme if you drop conditional proof, as supervaluational and many-valued logicians often do). But in an intuitionistic context, the T-rule contraposes (again, a metarule that’s not good in supervaluational and many-valued settings) to give an entailment from ~T(A) to ~A, which is sufficient to reduce the denial of bivalence to absurdity. This perhaps explains why Dummett is prepared to deny bivalence in non-classical settings in general, but seems wary of this in an intuitionistic setting.
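Schematically (contraposition of a rule—from A ⊢ B to ¬B ⊢ ¬A—being intuitionistically unproblematic):

\[
\begin{array}{lll}
A \vdash T(A) & \Rightarrow & \neg T(A) \vdash \neg A\\
\neg A \vdash T(\neg A) & \Rightarrow & \neg T(\neg A) \vdash \neg\neg A
\end{array}
\]

So from ~T(A)&~T(~A) we get both ~A and ~~A: absurdity.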

The two cleanest starting points for arguing against gaps for the intuitionist, it seems to me, are the T-rule “A entails T(A)”, and the claim “Av~A entails T(A)vT(~A)”. Clearly the first allows you to derive the second. I can’t see at the moment an argument that the second entails the first (if someone can point to one, I’d be very interested), so perhaps basing the argument against gaps on the second is the optimal strategy. (It does leave me with a puzzle—what is “forcing” in a Kripke tree supposed to model, since that notion seems clearly gappy?)

Motivating material conditionals

Soul Physics has a post that raises a vexed issue: how to say something to motivate the truth-table account of the material conditional, for people first encountering it.

They give a version of one popular strategy: argue by elimination for the familiar material truth table. The broad outline they suggest (which I think is a nice way to divide matters up) goes like this.

(1) Argue that “if A, B” is true when both are true, and false if A is true and B is false. This leaves two remaining cases to be considered—the cases where A is false.

(2) Argue that none of the three remaining rivals to the material conditional truth table works.

I won’t say much about (1), since the issues that arise aren’t that different from what you anyway have to deal with for motivating truth tables for disjunction, say.

(2) is the problem case. The way Soul Physics suggests presenting this is as following from two minimal observations about the conditional: (i) it isn’t trivial (i.e. it doesn’t just have the same truth values as one of the component sentences); and (ii) it isn’t symmetric—“if A then B” and “if B then A” can come apart.

In fact, all four of the options that remain at this stage can be informatively described. There’s a truth-function equivalent to B (this is the trivial one); the conjunction A&B; the biconditional between A and B (these last two are both symmetric); and finally the material conditional itself.
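With the top two rows fixed by step (1), the four candidates look like this (the material conditional written \(\supset\)):

\[
\begin{array}{cc|cccc}
A & B & B & A \,\&\, B & A \leftrightarrow B & A \supset B\\
\hline
T & T & T & T & T & T\\
T & F & F & F & F & F\\
F & T & T & F & F & T\\
F & F & F & F & T & T
\end{array}
\]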

But there’s something structurally odd about this sort of motivation. We argue by elimination of three options, leaving the material conditional account the winner. But the danger is, of course, that we find something that looks equally bad or worse with the remaining option, leaving us back where we started, with no truth table better motivated than the others.

And the trouble, notoriously, is that this is fairly likely to happen the moment people get wind of the paradoxes of material implication. It’s pretty hard to explain why we put so much weight on the failure of symmetry, while (to our students) seeming to ignore the fact that the account counts silly things like “If I’m in the US, I’m in the UK” as true.

One thing that’s missing is a justification for the whole truth-table approach—if there’s something wrong with every option, shouldn’t we be questioning our starting points? And of course, if someone raises these sort of questions, we’re a little stuck, since many of us think that the truth table account really is misguided as a way to treat the English indicative. But intro logic is perhaps not the place to get into that too much!

So I’m a bit stuck at this point—at least in intro logic. Of course, you can emphasize the badness of the alternatives, and just try to avoid getting into the paradoxes of material implication—but that seems like smoke and mirrors to me, and I’m not very good at carrying it off. So if I haven’t got time to go into the details, I’m back to saying things like: it’s not that there’s a perfect candidate, but this one happens to work better than the others—trust me—so let’s go with it. When I was taught this stuff, I was told about Grice at this point, and I remember that pretty much washing over my head. And it’s a bit odd to defend the widespread practice of using the material conditional by pointing to one possible defence of it as an interpretation of the English indicative that most of us think is wrong anyway. I wish I had a more principled fallback.

When I’ve got more time—and once the students are more familiar with basic logical reasoning and so on—I take a different approach, one that seems to me far more satisfactory. The general strategy, which replaces at least step (2) above, is to argue directly for the equivalence of the conditional “if A, B” with the corresponding disjunction ~AvB. And if you want to give a truth table for the former, you just read it off the latter.

Now, there are various ways of doing this—say by pointing to inferences that “sound good”, like the one from “A or B” to “if not A, then B”. The trouble is that we’re in a similar situation to the one earlier—there are inferences that sound bad just nearby. A salient one is the contrapositive: “It’s not true that if A, then B” doesn’t sound like it implies “A and ~B”. So there’s a bit of a stand-off between or-to-if and not-if-to-and-not.

My favourite starting point is therefore with inferences that don’t just sound good, but for which you see an obvious rationale—and here the obvious candidates are the classic “in” and “out” rules for the conditional: modus ponens and conditional proof. You can really see how the conditional is functioning if it obeys those rules—allowing you to capture good reasoning from assumptions, store it, and then release it when needed. It’s not just reasonable—it’s the sort of thing we’d want to invent if we didn’t have it!

Given these, there’s a straightforward little argument by conditional proof (using disjunctive syllogism, which is easy enough to read off the truth table for “or”) for the controversial direction of equivalence between the English conditional “if A, B” and ~AvB. Our premise is ~AvB. To show the conditional follows, we use conditional proof. Assume A. By disjunctive syllogism, B. So by conditional proof, if A then B.
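In Fitch style:

\[
\begin{array}{lll}
1. & \neg A \lor B & \text{premise}\\
2. & \quad A & \text{assumption, for conditional proof}\\
3. & \quad B & \text{disjunctive syllogism, 1, 2}\\
4. & \text{if } A \text{ then } B & \text{conditional proof, 2–3}
\end{array}
\]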

If you’ve already motivated the top two lines of the truth table for “if”, then this is enough to fill out the rest of the truth table—that ~AvB entails “if A then B” tells you how the bottom two lines should be filled out. Or you could argue (using modus ponens and reasoning by cases) for the converse entailment, getting the equivalence, at which point you really can read off the truth table.

An alternative is to start from scratch in motivating the truth table. We’ve argued that ~AvB entails “if A then B”. This forces the latter to be true whenever the former is. Hence the three “T” lines of the material conditional truth table—which are the controversial bits. For modus ponens to hold, we can’t have the conditional true when the antecedent is true and the consequent false, so the remaining entry in the truth table must be “F”. So between them, conditional proof (via the above argument) and modus ponens (directly) fix each line of the material truth table.
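In fact these two rules pin the table down uniquely. Here’s a brute-force check of that claim (a little sketch of my own, not anything from the literature):

```python
# Enumerate all 16 possible truth tables for a binary connective C and keep
# those on which (i) ~A v B entails C(A, B), i.e. C(A, B) is true on every
# row where ~A v B is, and (ii) modus ponens holds: A and C(A, B) true
# forces B true.
from itertools import product

rows = list(product([True, False], repeat=2))      # the four (A, B) rows

def survives(table):
    for a, b in rows:
        if (not a or b) and not table[(a, b)]:     # (i) fails on this row
            return False
        if a and table[(a, b)] and not b:          # (ii) modus ponens fails
            return False
    return True

tables = (dict(zip(rows, vals)) for vals in product([True, False], repeat=4))
survivors = [t for t in tables if survives(t)]
print(len(survivors))    # 1
print(survivors[0])      # the material conditional: F only at (A=True, B=False)
```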

Now I suspect that—for people who’ve already got the idea of a logical argument, assumptions, conclusions and so on—this sort of idea will seem pretty accessible. And the idea that conditionals are something to do with reasoning under suppositions is very easy to sell.

Most of all, though, what I like about this way of presenting things is that there’s something deeply *right* about it. It really does seem to me that the reason for bothering with a material conditional at all is its “inference ticket” behaviour, as expressed via conditional proof and modus ponens. So there’s something about this way of putting things that gets to the heart of the matter (to my mind).

But, further, this way of looking at things provides a nice comparison and contrast with other theories of the English indicative, since you can view famous options as essentially giving different ways of cashing out the relationship between conditionals and reasoning under a supposition. If we don’t like the conditional-proof idea about how they are related, an obvious next thing to reach for is the Ramsey test—which in a probabilistic version gets you ultimately into the Adams tradition. Stalnakerian treatments of conditionals can be given a similar gloss. Presented this way, I feel that the philosophical issues and the informal motivations are in sync.

I’d really like to hear about other strategies and ways of presenting this—particularly ideas for how to get it across at “first contact”.

Regrets? I’ve had a few…

Just a quick note about something that’s puzzling me.

Frank Arntzenius has a really nice paper (“No Regrets”) in which he gives an interesting argument for causal decision theory. The basic thought is this: if you know that you’ll (by rational means) come to desire something later, you should desire that thing now. (Obviously that formulation needs tightening—see the paper for details.) He imposes a “desire reflection principle”: your current level of desire in p should match your expected level of desire in p at future time t.

He points out the following. If the desirabilities of various propositions are described by evidential decision theory, then desire-reflection is violated. Suppose you think that in a Newcomb case you desire to 1-box, because desirability goes by EDT value. Suppose you know that before you’re given the money, the distribution of money in the boxes will be revealed. At the point of revelation, you will (by EDT lights) desire that you had two-boxed earlier—no matter what information you receive. So the current ordering of desirability of 1-boxing vs. 2-boxing is reversed when we look at expected future desirability. Desire-reflection rules such scenarios out. Arntzenius argues that CDT (which recommends 2-boxing from the start) won’t violate desire-reflection.
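To make the reversal concrete, here’s a quick calculation with made-up numbers (Arntzenius gives his own details; these figures are just for illustration):

```python
# Illustrative numbers only (not Arntzenius's): a 0.99-reliable predictor,
# $1,000,000 in the opaque box iff 1-boxing was predicted, $1,000 in the
# transparent box. "News value" of an act = expected payoff given the act.
ACC, BIG, SMALL = 0.99, 1_000_000, 1_000

# Before the reveal, the act is evidence about the prediction, so EDT
# ranks 1-boxing above 2-boxing:
edt_one_box = ACC * BIG                  # 990,000.0
edt_two_box = (1 - ACC) * BIG + SMALL    # roughly 11,000.0
print(edt_one_box > edt_two_box)         # True

# After the contents are revealed, whichever way the reveal goes,
# 2-boxing carries the better news:
for opaque in (BIG, 0):
    print(opaque, opaque + SMALL > opaque)    # True in both cases
```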

Why care about desire-reflection? Well, it sounds really compelling, to begin with; and if we’re already fans of van Fraassen’s belief-reflection principle, it’d be very natural to take both attitudes to behave in analogous ways in this regard. To motivate it, Arntzenius writes: “If your future self has more information than you do, surely you should listen to her advice, surely you should trust her assessment of the desirabilities of your possible actions.”

But the problem with this sort of motivation (for me) is that it overgenerates—really, for desirability we could substitute any pro-attitude, and we’d find something that sounds equally compelling. If my future self has more information, I should listen to her advice—for example, on what to desire, what to hope for, what to wish for, etc. etc.

Here’s my puzzle. There are surely some pro-attitudes that violate desire reflection. EDT surely *does* describe how much you’d like to receive this news rather than that (it’s described by some as “news value”, and that seems like a good name). Suppose I faced the Newcomb situation yesterday, and don’t know which way I acted. Caring only about money, the best news I can receive, given my current poor epistemic state, is that I one-boxed—for I expect to find more money in my bank given that information than given the alternative. That—let me assure you—is what I would hope I did (if I cared about being rational, maybe things’d be different—but I only care about money in the bank).

But say I’ll be told at breakfast what the distribution of money in fact was in the Newcomb situation I faced—before being told which way I acted. Once I’ve got that extra piece of info, then no matter which way it goes, I’ll be hoping that I two-boxed—for given the distribution information (whatever it is), the news value of two-boxing will be greater than that of one-boxing.

So this is basically just to repeat Arntzenius’s setup, and then to ask you to agree that for some pro-attitude—hope, in this case—we violate reflection. We might not like this, but I think it’s pretty pointless to deny it. (After all, it’s not as if EDT-values are ill-defined in some way, or as if there’s any reason to think it’s *impossible* to adopt propositional attitudes that behave in the way it describes—and, as a matter of fact, I think hoping does work this way.)

We needn’t deny there’s some pro-attitude—a different one—that CDT describes. Call that CDT-desire. (I believe David Etlin has a paper arguing that we genuinely have two attitudes hereabouts—I’m looking forward to reading it.) Hoping violates reflection; CDT-desiring satisfies it. Pro-attitudes—and the very notion of desirability—seem disanalogous to belief in this regard. For I take it that if we’re fans of reflection, we really don’t think there’s some kind of representational state, belief*, that is reflection-violating.

So we need some *discriminating* motivation—something that tells us that desirability *in the sense relevant to rationalizing action* should satisfy desire-reflection. If we had something like that, then we could rule out hope, and rule in CDT-desire, as the relevant notion. But I don’t see that we’ve got the tools as yet.

Despite these concerns, there seems to me something deeply illuminating in thinking about the EDT/CDT contrast in terms of desire reflection. The problem is, I can’t yet see its distinctive relevance to action. Is there some kind of diachronic coherence constraint on planning for action, specifically, that “wishful thinking” needn’t involve? Why would it matter?

Vagueness survey paper: V (rejecting excluded middle)

So this was the biggest selection problem I faced: there are so many many-valued systems out there, and so many ways to think about them; which to choose?

I would have liked to talk a bunch about the interpretation of “third truth values”. This often seems to me to be glossed over badly. In the vagueness literature, it’s often assumed that once we’ve got a third truth value, we might as well be degree theorists. But it seems to me that “gap” interpretations of the third truth value are super-different from the “half-true” interpretation. To make the case that this is more than a verbal dispute, though, I think we have to say a whole lot more about the cognitive role of indeterminacy, the role of logic, etc. etc. All good stuff (in fact, very close to what I’m working on right now). But I chose not to go that way.

Another thing I could have done is talk directly about degree theories. Nick Smith has a new book-length treatment of them, which makes some really nice moves both in motivating and defending the account. And of course they’re historically popular—and “fuzzy logic” is what you always hear talked about in non-philosophical treatments. In Williamson’s big vagueness book, degree theories are really the focus of the chapter corresponding to this section.

On the other hand, I felt it was really important to get a representative of the “logic first” view into the picture—someone who really treated semantics kind of instrumentally, and who saw the point of talking about vagueness in a rather different way from the way it’s often presented in the extant survey books. And the two that sprang to mind here were Hartry Field and Crispin Wright. Of these, Crispin’s intuitionism is harder to set up, and has fewer connections to other many-valued theories. And his theory of a quandary cognitive role, while really interesting, just takes longer to explain than Hartry’s rejectionist suggestion. Wright’s agnosticism is a bit hard to explain too—I take it the view is supposed to be that we’re poised between the Williamsonian-style picture and an option where you assert the negation of bivalence—and the first seems unbelievable and the second incoherent, so we remain agnostic. But if something is incoherent, how can we remain agnostic about it? (So, actually, I think the better way to present the view is as agnosticism between bivalence-endorsing views and Field-style rejectionist views, albeit carried out in an intuitionist rather than Kleene-based system. But if that’s the way to work things out, rejectionism is conceptually prior to agnosticism.)

So in the end I started with a minimal intro to the many-valued truth tables and a brief pointer in the direction of extensions and interpretations, and then concentrated on the elements of the Field view—the Quinean translate-and-deflate theory of language, the logical revisionism (and instrumentalism about model theory), and the cognitive role that flows from it.

Just as with the epistemicism section, there are famous objections I just didn’t have room for. The whole issue over the methodological ok-ness of revising logic, and the burdens that entails… nothing will remain of that but a few references.

Vagueness survey paper: section V

REVISIONARY LOGIC: MANY-VALUED SETTINGS

A distinctive feature of supervaluationism was that while it threw out bivalence (“Harry is bald” is either true or false), it preserved the corresponding instance of excluded middle (“Harry is either bald or not bald”). Revising the logic in a more thoroughgoing way would allow for a coherent picture where we can finally reject the claim “there is a single hair that makes the difference between bald and non-bald” without falling into paradox.

“Many valued” logics can be characterized by increasing the number of truth-values we work with—perhaps to three, perhaps infinitely many—and offering generalizations of the familiar stories of how logical constants behave to accommodate this tweak. There are all sorts of ways in which this can be developed, and even more choice points in extracting notions of “consequence” out of the mass of relations that then become available.

Here is a sample many-valued logic, for a propositional language with conjunctions, disjunctions and negations. To characterize the logic, we postulate three values (let’s call them, neutrally, “1”, “1/2” and “0”). For the propositional case, the idea is that each atomic sentence is assigned some one of the truth values, and the truth values then get assigned to complex sentences recursively. Thus a conjunction gets the minimum of the truth values of its conjuncts; a disjunction gets the maximum of the truth values of its disjuncts; and a negation gets assigned 1 minus the truth value of the claim negated. (You can easily check that, ignoring the value 1/2 altogether, we get back exactly the classical truth tables.)

A many-valued logic (the strong Kleene logic) is defined by looking at the class of those arguments that are “1-preserving”, i.e. such that whenever all the premises have value 1, the conclusion has value 1 too. It has some distinctive features: e.g. excluded middle “Av~A” is no longer a tautology, since it can have value 1/2 when A has value 1/2. “A&~A” is still treated as a logical contradiction, on the other hand (every sentence whatsoever follows from it), since it never attains value 1, no matter what value A is assigned.
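For concreteness, here are those clauses transcribed into a few lines of code (my gloss, with the three values represented as the numbers 1, 0.5 and 0):

```python
# The strong Kleene clauses described above.
from itertools import product

VALUES = (1.0, 0.5, 0.0)

conj = min                   # conjunction: minimum of the conjuncts' values
disj = max                   # disjunction: maximum of the disjuncts' values
neg = lambda v: 1 - v        # negation: 1 minus the value negated

# Excluded middle Av~A is no longer a tautology: it has value 0.5 when A does.
print([disj(a, neg(a)) for a in VALUES])    # [1.0, 0.5, 1.0]

# A&~A never attains value 1, so by the 1-preservation standard
# everything follows from it.
print([conj(a, neg(a)) for a in VALUES])    # [0.0, 0.5, 0.0]

# Restricted to the classical values, the clauses give back the
# classical truth tables.
for a, b in product((1.0, 0.0), repeat=2):
    print(a, b, conj(a, b), disj(a, b), neg(a))
```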

One option at this point is to take this model theory seriously—much as the classicist and supervaluationist do—and hypothesise that natural language has (or is modelled by?) some many-valued interpretation (or set of interpretations?). This view is a major player in the literature (cite cite cite).

For the remainder of this section, I focus on a different framework in which to view the proposal to revise logic. This begins with the rejection of the very idea of an “intended interpretation” of a language, or a semantic treatment of truth. Rather, one treats truth as a “device of disquotation”—perhaps introduced by means of the kind of disquotational principles mentioned earlier. (Such a device proves useful, argue its fans, in increasing our expressive power—allowing us to endorse or deny such claims as “everything the pope says is true”.) The disquotation principles capture all there is to be said about truth, and the notion doesn’t need any “model-theoretic foundation” in an “intended interpretation” to be in good standing.

In the first instance, such a truth-predicate is “local”—only carving the right distinctions in the very language for which it was introduced (via disquotation). To allow us to speak sensibly of true French sentences (for example), Field (cite), following Quine (cite), analyzes

“la neige est blanche” is true

as:

there’s a good translation of “la neige est blanche” into English, such that the translated version is disquotationally true.

Alongside this disquotationism is a distinctive attitude to logic. On Field’s view, logical consequence does not need to be “analyzed” in model-theoretic terms. Consequence is taken as primitive, and model theory seen as a useful instrument for characterizing the extension of this relation.

Such disquotationism elegantly avoids the sort of worries plaguing the classical, supervaluational, and traditional many-valued approaches—in particular, since there’s no explanatory role for an “intended interpretation”, we simply avoid worries about how such an intended interpretation might be settled on. Moreover, if model theory is a mere instrument, there’s no pressure to say anything about the nature of the “truth values” it uses.

So far, no appeal to revisionary logic has been made. The utility of hypothesizing a non-classical (Kleene-style) logic rather than a classical one comes in explaining the puzzles of vagueness. For Field, when indeterminacy surfaces, we should reject the relevant instance of excluded middle. Thus we should reject “either Harry is bald or he isn’t”—and consequently, also reject the claim that Harry is bald (from which the former follows, in any sensible logic). We then must reject “Harry is bald and the man with one hair less isn’t” (again because something we reject follows from it). So, from our rejection of excluded middle, we derive the core data behind the “little by little” worry—rejection of the horrible conjunctions (N).

So, like one form of supervaluationism, Field sees borderline cases of vague predicates as characterized by forced rejection. No wonder further inquiry into whether Harry is bald seems pointless. Again in parallel, Field accommodates rejection of (N) without accepting (N*)—and this is at least a start on explaining where our perplexity over the sorites comes from. Unlike the supervaluationist, he isn’t committed to the generalization “there is some bald guy next to a non-bald guy”—the Kleene logic (extended to handle quantifiers) enforces rejection of this claim.

One central concern about this account of vagueness (setting aside very general worries about the disquotational setting) is whether in weakening our logic we have thrown the baby out with the bathwater. Some argue that it is methodologically objectionable to revise logic without overwhelming reason to do so, given the way that classical assumptions are built into successful, progressive science even when vagueness is clearly in play (think of applications of physics, or the classical assumptions in probability and decision theory, for example). This is an important issue: but let’s set it aside for now.

More locally, the logic we’ve looked at so far seems excessively weak in expressive power. It’s not clear, for example, how one should capture platitudes like “if someone is balder than a bald person, that person too is bald” (translating the “if” here as a kind of disjunction or negated conjunction, as is standard in the classical case, we get something entailing instances of excluded middle that we want to reject—we do not yet seem to have a suitable material conditional in the system). For another thing, we haven’t yet said anything about how the central notion “it is determinate whether” fits in. It seems to have interesting logical behaviour—for example, the key connection between excluded middle and indeterminacy would be nicely captured if from Av~A one could infer “it is determinate whether A”. Much of Field’s positive project involves extending the basic Kleene logic to accommodate a suitable conditional and determinacy operators, in particular to capture thoroughly “higher-order” kinds of vagueness (borderline cases of borderline cases, and so on).

Vagueness survey IV: supervaluations

Ok, part IV of the survey article. This includes the second of the three sample theories, and I’ve chosen to talk about what’s often called “the most popular account of vagueness”: supervaluationism.

I’m personally pretty convinced there are (at least) two different *types* of theses/theories that get the label “supervaluationism” (roughly, the formal logic-and-semantics stuff, and the semantic-indecision story about the source of indeterminacy). And even once you have both on board, I reckon there are at least three *very different* ways of understanding what the theory is saying (that’s separate from, but related to, the various subtle issues that Varzi picks out in his recent Mind paper).

But what I want to present is one way of working out this stuff, so I’m keeping fairly close to the Fine-Keefe (with a bit of Lewis) axis that I think is most people’s reference point around here. I try to strip out the more specialist bits of Fine—which comes at a cost, since I don’t mention “penumbral connection” here. But again, I wanted to keep the focus on the basics of the theory, and the application to the central puzzles, so stuff had to come out.

One thing I’m fairly unapologetic about is presenting supervaluationism as a theory where truth=supertruth. It seems terminologically bizarre to do otherwise—given that “supervaluations” have their origins in a logico-semantic technique for defining truth, when you have multiple truth-on-i to play with. I’m happy to think that “semantic indecision” views can be paired with a classical semantics, and the semantics for definiteness given in terms of delineations/sharpenings (as indeed can the epistemicist D-operator). But, as a matter of terminology, I don’t see these as “supervaluational”. “Ambiguity” views like McGee and McLaughlin’s are a mixed case, but I don’t have room to fit them in. In any case: bivalence failure is such a common thought that it seemed worth giving it a run for its money, straightforwardly construed, and I do think it leads to interesting and distinctive positions on what’s going on in the sorites.

Actually, the stuff about confusing rejection with accepting-negation is again a bit of a choice-point. I could have talked some about the “confusion hypothesis” way of explaining the badness of the sorites (and work in particular by Fine, Keefe and Greenough on this—and a fabulous paper by Brian Weatherson that he’s never published, “Vagueness and pragmatics”). But when I tried this, it turned out to be a bit tricky to explain and to take quite a bit of thinking about—and I felt the rejection stuff (also a kind of “confusion hypothesis”) was more organically related to the way I was presenting things. I need to figure out some references for this. I’m sure there must be lots of people making the “since it’s untrue, we’re tempted to say ~p” move, which is essentially what’s involved. I’m wondering whether Greg Restall has some explicit discussion of the temptation of the sorites in his paper on denial, multiple conclusions and supervaluations…

One place below where a big chunk of stuff had to come out to get me near the word limit was the material about the inconsistency of the T-scheme with denials of bivalence. It fitted in very naturally, and I’d always planned for that stuff to go in (just because it’s such a neat and simple little argument). But there wasn’t room, and of all the things I was looking at, it seemed the most like a digression. So, sadly, all that remains is “cite cite cite” where I’m going to give the references.

One thing that does put in a brief appearance at the end of the section is the old bugbear: higher-order vagueness. I don’t discuss this very much in the piece, which again is a bit weird, but then it’s very hard to state simply what the issue is (especially as there seem to be at least three different things going by that name in the literature, the relations between them not being very obvious).

Another issue that occurred to me here is whether I should be strict in separating semantics and model theory. I do think the distinction (between set-theoretic interpretations, and axiomatic specifications of those interpretations) is important, and in the first instance what we get from supervaluations is a model theory, not a semantics. But in the end it seemed not to earn its place. Actually: does anyone know of somewhere where a supervaluational *axiomatic semantics* is given (as opposed to supervaluational models)? I’m guessing it’ll look something like: [“p”]=T on d iff, at d, p—i.e. we’ll carry the relativization through the axioms just as we do in the modal case.

The section itself:

Survey paper on vagueness: part IV

REVISIONARY SEMANTICS: SUPERVALUATIONS

A very common thought about borderline cases is that they’re neither true nor false. Given that one can only know what is true, this would explain our inevitable lack of knowledge in borderline cases. It’s often thought to be a rather plausible suggestion in itself.

Classical semantics builds in the principle that each meaningful claim is either true or false (bivalence). So if we’re to pursue the thought that borderline claims are truth-value gaps, we must revise our semantic framework to some extent. Indeed, we can know in advance that any semantic theory with truth-value gaps will diverge from classical semantics even on some of the most intuitively plausible (platitudinous-seeming) consequences: for it can be shown under very weak assumptions that truth-value gaps are incompatible with accepting disquotational principles such as: “Harry is bald” is true if and only if Harry is bald (see cite cite cite).
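In outline, the argument alluded to runs as follows (my compressed reconstruction; the draft leaves the details to the citations). Let p abbreviate “Harry is bald”:

```latex
\begin{align*}
&1.\; T(\text{`}p\text{'}) \leftrightarrow p
  && \text{disquotation}\\
&2.\; T(\text{`}\neg p\text{'}) \leftrightarrow \neg p
  && \text{disquotation}\\
&3.\; \neg T(\text{`}p\text{'}) \wedge \neg T(\text{`}\neg p\text{'})
  && \text{``$p$ is a truth-value gap''}\\
&4.\; \neg p
  && \text{from 1 and 3}\\
&5.\; \neg\neg p
  && \text{from 2 and 3, contradicting 4}
\end{align*}
```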

How will the alteration of the classical framework go? One suggestion goes under the heading “supervaluationism” (though, as we’ll see, the term is somewhat ambiguous).

As an account of the nature of vagueness, supervaluationism is a view on which borderlineness arises from what we might call “semantic indecision”. Think of the sort of things that might fix the meanings of words: conventions to apply the word “bald” to clear cases; conventions to apply “not bald” to clear non-cases; various conventions of a more complex sort—for example, that anyone with less hair than a bald person should count as bald. The idea is that when we list these and other principles constraining the correct interpretation of language, we’ll be able to narrow the space of acceptable (and entirely classical) interpretations of English down a lot—but not down to the single intended interpretation hypothesized by classical semantics. At best, what we’ll get is a cluster of candidates. Let’s call these the sharpenings for English. Each will assign to each vague predicate a sharp boundary. But very plausibly the location of such a boundary is something the different sharpenings will disagree about. A sentence is indeterminate (and if it involves a vague predicate, is a borderline case) just in case there’s a sharpening on which it comes out true, and another on which it comes out false.

As an account of the semantics of vague language, the core of the supervaluationist proposal is a generalization of an idea found in classical semantics: that for something to be true is for it to be true at the intended interpretation. Supervaluationism offers a replacement: it works with a set of “co-intended interpretations”, and says that for a sentence to be true, it must be true at all the co-intended interpretations (this is sometimes called “supertruth”). This dovetails nicely with the semantic indecision picture, since we can take the “co-intended interpretations” to be what we called above the sharpenings—and hence when a sentence is indeterminate (true on one sharpening and false on another) neither it nor its negation will be true: and hence we have a truth-value gap. (The core proposal for defining truth finds application in settings where the “semantic indecision” idea seems inappropriate: see for example Thomason’s treatment of the semantics of branching time in his (cite).)

The slight tweak to the classical picture leaves a lot unchanged. Consider the tautologies of classical logic, for example. Every classical interpretation makes them true, and so each sharpening is guaranteed to make them true. Any classical tautology will therefore be supertrue. So—at least at this level—classical logic is retained. (It’s a matter of dispute whether more subtle departures from classical logic are involved, and whether this matters: see (cite cite cite).)

So long as (super)truth is a constraint on knowledge, supervaluationists can explain why we can’t know whether borderline bald Harry is bald. On some developments of the position, they can go interestingly beyond this explanation of ignorance. One might argue that insofar as one should only invest credence in a claim to the extent one believes it true, obvious truth-value gaps are cases where we should utterly reject (invest no credence in) both the claim and its negation. This goes beyond mere lack of knowledge, for it means the information that such-and-such is borderline gives us a direct fix on what our degree of belief should be in such-and-such (by contrast, on the epistemicist picture, though we can’t gain knowledge, we’ve as yet no reason to think that inquiry couldn’t raise or lower the probability we assign to Harry’s being bald, making investigation of the point perfectly sensible, despite initial appearances).

What about the sorites? Every sharpening draws a line between bald and non-bald, so “there is a single hair that makes the difference between baldness and non-baldness” will be supertrue. However, no individual conjunction of form (N) will be true—many of them will instead be truth-value gaps, true on some sharpenings and false on others (this highlights one of the distinctive (disturbing?) features of supervaluationism—the ability of disjunctions and existential generalizations to be true even if no disjunct or instance is). As truth-value gaps, instances of (N*) will also fail to be true, so some of the premises needed for the sorites paradox are not granted.
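A toy model of my own devising (nothing like this appears in the draft) makes the pattern vivid; sharpenings are modelled as admissible cut-offs in a ten-man sorites series:

```python
# Sharpenings modelled as admissible cut-offs k in a 10-man sorites series:
# on sharpening k, man n (with n units of hair) counts as bald iff n < k.
N = 10
sharpenings = range(3, 7)    # four candidate places to draw the line

def supertrue(sentence):
    return all(sentence(k) for k in sharpenings)

def superfalse(sentence):
    return all(not sentence(k) for k in sharpenings)

# "There is a single hair that makes the difference": supertrue, since
# every sharpening draws a line somewhere.
exists_cutoff = lambda k: any(n < k and not (n + 1 < k) for n in range(N))
print(supertrue(exists_cutoff))                    # True

# But no individual conjunction (N) "man n is bald and man n+1 isn't" is
# supertrue; for n near the borderline it's a gap (neither supertrue nor
# superfalse).
for n in range(N):
    conj_n = lambda k, n=n: n < k and not (n + 1 < k)
    print(n, supertrue(conj_n), superfalse(conj_n))
```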

(There is one thing that some supervaluationists can point to in an attempt to explain the appeal of the paradoxical premises. Suppose that—as I think is plausible—we take as the primary data in the sorites the horribleness of the conjunctions (N). These are untrue, and so (for one kind of supervaluationist) should be utterly rejected. It’s tempting, though mistaken, to try to express that rejection by accepting a negated form of the same claim—that is the move that takes us from the rejection of each of (N) to the acceptance of each of (N*). This temptation is one possible source of the “seductiveness” of sorites reasoning.)

Two points to bear in mind about supervaluationism. First, the supervaluationist endorses the claim that “there is a cut-off”—a pair of men differing by only one hair, with the first bald and the second not. Insofar as one considered that first-order claim to be what was most incredible about (say) epistemicism, one won’t feel much advance has been made. The supervaluationist must try to persuade you that once one understands the sense in which “there’s no fact of the matter” where that cut-off is, the incredulity will dissipate. Second, many want to press the charge that the supervaluationist makes no progress over the classicist, for reasons of “higher-order vagueness”. The thought is that the task of explaining how a set of sharpenings gets selected by the meaning-fixing facts is no easier or harder than explaining how a single classical interpretation gets picked out. However, (a) the supervaluationist can reasonably argue that if she spells out the notion of “sharpening” in vague terms, she will regard the boundary between the sharpenings and non-sharpenings as itself vague (see Keefe (cite)); (b) even if epistemicist and supervaluationist were both in some sense “committed to sharp boundaries”, the accounts they give of the nature of vagueness are vastly different, and we can evaluate each positive claim on its own merits.

Vagueness survey paper III (epistemicism)

This is the third section of my first draft of a survey paper on vagueness, which I’m distributing in the hope of getting feedback, from the picky to the substantive!

In the first two sections, I introduced some puzzles, and said some general methodology things about accounting for vague language—in effect, some of the issues that come up in giving a theory in a vague metalanguage. The next three sections are the three sample accounts I look at. The first, flowing naturally on from “textbook semantic theories”, is the least revisionary: semantics in a classical setting.

Now, if I had lots of time, I’d talk about Delia Graff Fara’s contextualism, and the “sneaky classicism” of people like McGee and McLaughlin (and Dave Barnett, and Cian Dorr, and Elizabeth Barnes [joint with me in one place!]). But there’s only so much I can fit in, and Williamson’s epistemicism seems the natural representative theory here.

Then there’s the issue of how to present it. I’m always uncomfortable when people use “epistemicism” as a synonym for just the straightforward reading of classical semantics—sharp cut-offs and so on (somebody suggested the name “sharpism” for that view—I can’t remember who, though). The point about epistemicism—why people find it interesting—is surely not that bit of it. It’s that it seems to predict where others retrodict; and it seems principled where others seem ad hoc. Williamson takes formulations of parts of the basic toolkit of epistemology—safety principles—and uses these to explain borderlineness (and ultimately the higher-order structure of vagueness). That’s what’s so super-cool about it.

I’m a bit worried that in the current version I’ve downplayed the sharpist element so much. After all, that’s where a lot of the discussion has gone on. In part, that betrays my frustration with the debate—there are some fun little arguments around the details, but on the big issue I don’t see that much progress has been made. It feels to me like we’ve got a bit of a standoff. At minimum I’m going to have to add a bunch of references to this stuff, but I wonder what people think about the balance as it is here.

I have indulged a little bit in raising one of the features that always puzzles me about epistemicism: I see that Williamson has an elegant explanation of why we can’t ever identify a cut-off. But I just don’t see what the story is about why we find the existential itself so awful. The analogy to lottery cases seems helpful here. Anyway, on with the section:

Vagueness Survey Paper, Part III.

VAGUENESS IN A CLASSICAL SETTING: EPISTEMICISM

One way to try to explain the puzzles of vagueness is to look to resources outwith the philosophy of language. This is the direction pursued by epistemicists such as Timothy Williamson.

One distinctive feature of the epistemicist package is retaining classical logic and semantics. It’s a big advantage of this view that we can keep the textbook semantic clauses described earlier, as well as seemingly obvious truths such as that “Harry is bald” is true iff Harry is bald (revisionary semantic theorists have great trouble saving this apparently platitudinous claim). Another part of the package is a robust face-value reading of what’s involved in doing this. There really is a specific set that is the extension of “bald”—a particular cut-off in the sorites series for “bald”, and so on (some one of the horrible conjunctions given earlier is just true). Some other theorists say these things but try to sweeten the pill—to say that admitting all this is compatible with saying that in a strong sense there’s no fact of the matter where this cut-off is (see McGee and McLaughlin; Barnett; Dorr; Barnes). Williamson takes the medicine straight: incredible as it might sound, our words really do carve the world in a sharp, non-fuzzy way.

The hard-nosed endorsement of classical logic and semantics at a face-value reading is just scene-setting: the real task is to explain the puzzles that vagueness poses. If the attempt to make sense of “no fact of the matter” rhetoric is given up, what else can we appeal to?

As the name suggests, Williamson and his ilk appeal to epistemology to defuse the puzzle. Let us consider borderlineness first. Start again from the idea that we are ignorant of whether Harry is bald, when he is a borderline case. The puzzle was to explain why this was so, and why the unknowability was of such a strong and ineliminable sort.

Williamson’s proposal makes use of a general constraint on knowledge: the idea that in order to know that p, it cannot be a matter of luck that one’s belief that p is true. Williamson articulates this as the following “safety principle”:

For “S knows that p” to be true (in a situation s), “p” must be true in any marginally different situation s* (where one forms the same beliefs using the same methods) in which “S believes p” is true.

The idea is that the situations s* represent “easy possibilities”: falsity at an easy possibility makes a true belief too lucky to count as knowledge.

This first element of Williamson’s view is independently motivated epistemology. The second element is that the extensions of vague predicates, though sharp, are unstable. They depend on exact details of the patterns of use of vague predicates, and small shifts in the latter can induce small shifts in the (sharp) boundaries of vague predicates.

Given these two, we can explain our ignorance in borderline cases. A borderline case of “bald” is just one where the boundary of “bald” is close enough that a marginally different pattern of usage could induce a switch from (say) Harry being a member of the extension of “bald” to not being in that extension. If that’s the case, then even if one truly believes that Harry is bald, there will be an easy possibility where one forms the same beliefs for the same reasons, but that sentence is false. Applying the safety principle, the belief can’t count as knowledge.
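To fix ideas, here’s a toy rendering of the two elements together, with entirely made-up numbers (my construction, not Williamson’s): a sharp actual boundary, a range of boundaries that marginally different usage would have produced, and safety as truth across all of them:

```python
# Toy version of the safety story: the sharp boundary of "bald" depends on
# patterns of use, and marginally different usage would place it slightly
# differently. All numbers are illustrative.
actual_boundary = 3_000                   # bald iff fewer hairs than this
easy_possibilities = range(2_990, 3_011)  # boundaries under nearby usage patterns

def safely_known_bald(hairs):
    # The belief "x is bald" is true and couldn't easily have been false.
    return all(hairs < b for b in easy_possibilities)

def safely_known_not_bald(hairs):
    return all(hairs >= b for b in easy_possibilities)

for hairs in (100, 2_995, 3_005, 20_000):
    print(hairs, safely_known_bald(hairs), safely_known_not_bald(hairs))
# 100 and 20,000 are knowable either way; 2,995 and 3,005 are borderline:
# true beliefs about them are false at some easy possibility, so unsafe.
```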

Given that the source of ignorance resides in the sharp but unstable boundaries of vague predicates, one can see why gathering information about hair-distributions won’t overcome the relevant obstacle to knowledge. This is why the ignorance in borderline cases seems ineliminable.

What about the sorites? Williamson, of course, will say that one of the premises is false—there is a sharp boundary; we simply can’t know where it is. It’s unclear whether this is enough to “solve” the sorites paradox, however. As well as knowing which premise to reject, we’d like to know why we found the case paradoxical in the first place. Why do we find the idea of a sharp cut-off so incredible (especially since there’s a very simple, valid argument from obvious premises to this effect available)? Williamson can give an account of why we’d never feel able to accept any one of the individual conjunctions (man n is bald and man n+1 is not). But that doesn’t explain why we’re uneasy (to say the least) with the thought that some such conjunction is true—i.e. that there is a sharp cut-off. I’ll never know in advance which ticket will win a lottery; but I’m entirely comfortable with the thought that one will win. Why don’t we feel the same about the sorites?