
Justifying scoring rules

In connection to this paper, I’ve been thinking some more about what grounds we might have for saying substantive things about how “scoring rules” should behave.

Quick background. Scoring rules rank either credences in a single proposition, or whole credence functions (depending on your choice of poison) against the actual truth values. For now, let’s concentrate on the single-proposition case. In the context we’re interested in, they’re meant to measure “how (in)accurate” the credences are. I’ll assume that scoring rules take the form s(x,y), where x is the credence, and y the truth value of the salient proposition (1 for truth, 0 for falsity). You’d naturally expect a minimal constraint to be:

(Minimal 1)  s(1,1)=s(0,0)=1; s(0,1)=s(1,0)=0.

(Minimal 2) s is a monotone increasing function in x when y=1. s is a monotone decreasing function in x when y=0.

Basically, this just says that a credence matching the truth value (1 when true, 0 when false) is maximally accurate, that the opposite extreme is minimally accurate, and that you never decrease in accuracy by moving closer to the truth value.
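To make the two minimal constraints concrete, here is a small Python sketch that checks them on a grid of credences. The function names and the two candidate rules (accuracy-oriented versions of the quadratic and linear scores, rescaled so that 1 is best and 0 is worst) are illustrative assumptions, not anything fixed by the discussion above.

```python
def brier_accuracy(x, y):
    """Quadratic accuracy: 1 minus the squared distance from the truth value."""
    return 1.0 - (x - y) ** 2


def linear_accuracy(x, y):
    """Linear accuracy: 1 minus the absolute distance from the truth value."""
    return 1.0 - abs(x - y)


def satisfies_minimal(s, steps=100):
    """Check (Minimal 1) and (Minimal 2) for s on an evenly spaced grid."""
    grid = [i / steps for i in range(steps + 1)]
    # (Minimal 1): the four corner values are pinned down.
    corners = s(1, 1) == 1 and s(0, 0) == 1 and s(0, 1) == 0 and s(1, 0) == 0
    # (Minimal 2): weakly increasing in x when y=1, weakly decreasing when y=0.
    rising = all(s(a, 1) <= s(b, 1) for a, b in zip(grid, grid[1:]))
    falling = all(s(a, 0) >= s(b, 0) for a, b in zip(grid, grid[1:]))
    return corners and rising and falling


print(satisfies_minimal(brier_accuracy), satisfies_minimal(linear_accuracy))
```

Both candidates pass, which is the point of the next paragraph: the minimal constraints alone do not discriminate between rules that behave very differently in the arguments for probabilism.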

But to make arguments from scoring rules for probabilism run, we need a lot more structure. Where do we get it from?

There’s a prior question: what’s the nature of a scoring rule in the first place? There’re a couple of thoughts to have here. One is that scoring rules are *preferences* of agents. Different agents can have different scoring rules, and the relevant preference-ordering aims to capture the subjective value the agent attaches to having *accurate* credences.

Now, various hedges are needed at this point. Maybe having certain credences makes you feel warm and fuzzy, and you prefer to have those feelings no matter what. We need to distill that stuff out. Moreover, maybe you value having particular credences in certain situations because of their instrumental value, e.g. enabling you indirectly to get lots of warm fuzzy stuff. One strong thesis about scoring rules is that they give the *intrinsic* value that the agent attaches to a certain credence/truth value state of affairs: her preferences given that alethic accuracy is all she cares about. However tricky the details of this are to spell out, the general story about what the scoring rule aims to describe is pretty clear: part of the preferences of individual agents.

A different kind of view would have it that the scoring rule describes a more objective beast: facts about which credences are better than which others (as far as accuracy goes). Presumably, if there are such betterness facts, this’ll provide a standard for assessing people’s alethic preferences in the first sense.

On either view, the trick will be to justify the claim that the scoring rule has certain formal features X. Then one appeals to a formal argument that shows that for every incoherent credence c, there’s a coherent credence d which is more accurate (by the lights of the scoring rule) than c no matter what the actual truth values are, supposing only that the scoring rule has feature X. Being “accuracy dominated” in this way is supposed to be an epistemic flaw (at least a pro tanto one). [I’m going to leave discussion of how *that* goes for another time.]
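A toy instance of accuracy domination may help. The sketch below assumes the Brier measure of inaccuracy (sum of squared distances from the truth values, lower is better) and looks only at credences in a proposition p and its negation. The function names are hypothetical, and the choice of the nearest coherent point as the dominating credence is one convenient construction under that assumption, not a reconstruction of Joyce's proof.

```python
def brier_inaccuracy(credences, world):
    """Sum of squared distances between credences and truth values (lower is better)."""
    return sum((c - t) ** 2 for c, t in zip(credences, world))


# The two classically possible worlds for the pair (p, not-p).
WORLDS = [(1, 0), (0, 1)]


def nearest_coherent(credences):
    """Project (c(p), c(not-p)) onto the line where the two credences sum to 1."""
    a, b = credences
    shift = (1 - a - b) / 2
    return (a + shift, b + shift)


def dominates(d, c):
    """True if d is strictly less inaccurate than c in every world."""
    return all(brier_inaccuracy(d, w) < brier_inaccuracy(c, w) for w in WORLDS)


incoherent = (0.6, 0.6)  # credences in p and not-p that sum to more than 1
coherent = nearest_coherent(incoherent)
print(coherent, dominates(coherent, incoherent))
```

The incoherent pair loses to its coherent neighbour whichever of p and not-p turns out true, which is exactly the "no matter what the actual truth values are" clause.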

Ok. But how are we going to justify features of scoring, other than the minimal constraints above? Well, Joyce (1998) proceeds by drawing out what he regards as unpleasant consequences of denying a series of formal constraints on the scoring rule. Though it’s not *immediately obvious* that to be a “measure of accuracy” scoring rules need to do more than satisfy the *minimal* constraints, you may be convinced by the cases that Joyce makes. But what *kind* of case does he make? One thought is that it’s a kind of conceptual analysis. We have the notion of accuracy, and when we think carefully through what can happen if a measure doesn’t have feature X, we see that whatever its other merits, it wouldn’t be a decent way to measure anything deserving the name *accuracy*.

Whether or not Joyce’s considerations are meant to be taken this way (I rather suspect not), it’s at least a very clean project to engage in. Take scoring rules to be preferences. Then a set of preferences that didn’t have the formal features just wouldn’t be preferences solely about accuracy—as was the original intention. Or take an objective betterness ordering. If it’s evaluating credence/world pairs on grounds of accuracy, again (if the conceptual analysis of accuracy was successful) it better have the features X, otherwise it’s just not going to deserve the name.

But maybe we can’t get all the features we need through something like conceptual analysis. One of Joyce’s features, convexity, seems to be something like a principle of epistemic conservatism (that’s the way he has recently presented it). It doesn’t seem that people would be conceptually confused if their alethic preferences violated this principle. Where would this leave us?

If we’re thinking of the scoring rule as an objective betterness relation, then there seems plenty of room for thinking that the *real facts* about accuracy encode convexity, even if one can coherently doubt that this is so (ok, so I’m setting aside open-question arguments here, but I was never terribly impressed by them). And conceptual analysis is not the only route to justifying claims that the one true scoring rule has such a feature. Here’s one alternative. It turns out that a certain scoring rule—the Brier score—meets all Joyce’s conditions and more besides. And it’s a very simple, very well behaved scoring rule, that generalizes very nicely in all sorts of ways (Joyce (2009) talks about quite a few nice features of it in the section “homage to the Brier score”). It’s not crazy to think that, among parties agreed that there is some “objective accuracy” scoring rule out there to be described, considerations of simplicity, unity, integration and other holistic merits might support the view that the One True measure of (in)accuracy is given by the Brier score.

But this won’t sound terribly good if you think that scoring rules describe individual preferences, rather than an objective feature that norms those preferences. Why should theoretical unification and whatnot give us information about the idiosyncrasies of what people happen to prefer? If we give up on the line that it’s just conceptually impossible for there to be “alethic preferences” that fail to satisfy conditions X, then why can’t someone (call him Tommy) just happen to have X-violating alethic preferences? Tommy’s “scoring rule” then just can’t be used in a vindication of probabilism. I don’t see how the kind of holistic considerations just mentioned can be made relevant.

But maybe we could do something with this (inspired by some discussion in Gibbard (2008), though in a very different setting). Perhaps alethic preferences only need to satisfy the minimal constraints above, to deserve the name. But even if it’s *possible* to have alethic preferences with all sorts of formal properties, it might be unwise to do so. Maybe things go epistemically badly, e.g. if they’re not appropriately conservative because of their scoring rule (for an illustration, perhaps the scoring rule is just the linear one, which measures inaccuracy as the absolute difference of x and y. This scoring rule motivates extremism in credences: when c(p)>0.5, you minimize expected inaccuracy by moving your credence to 1. But someone who does that doesn’t seem to be functioning very well, epistemically speaking). Maybe things go prudentially badly, unless their alethic values have a certain form. So, without arguing that it’s analytic of “alethic preference”, we provide arguments that the wise will have alethic preferences that meet conditions X.
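The extremism point about the linear rule can be checked numerically. The sketch below is an illustration under stated assumptions: inaccuracy is measured either linearly (absolute difference) or quadratically (the Brier score), expectations are taken by the agent's own current credence p, and the minimizing credence is found by brute force over a grid. All function names are hypothetical.

```python
def linear_inaccuracy(x, y):
    """Absolute distance between credence x and truth value y."""
    return abs(x - y)


def brier_inaccuracy(x, y):
    """Squared distance between credence x and truth value y."""
    return (x - y) ** 2


def expected_inaccuracy(score, x, p):
    """Expected inaccuracy of adopting credence x, by the lights of credence p."""
    return p * score(x, 1) + (1 - p) * score(x, 0)


def best_credence(score, p, steps=100):
    """Grid search for the credence that minimizes expected inaccuracy."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda x: expected_inaccuracy(score, x, p))


p = 0.7
print(best_credence(linear_inaccuracy, p), best_credence(brier_inaccuracy, p))
```

With p above 0.5 the linear rule recommends jumping to credence 1, since its expected inaccuracy p + x(1 - 2p) falls as x rises. The quadratic rule recommends staying at p itself, which is the kind of conservative behaviour the paragraph above is asking for.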

If so, it looks to me like we’ve got an indirect route to probabilism. People with sensible alethic preferences will be subject to the Joycean argument: they’ll be epistemically irrational if they don’t conform to the axioms of probability. And while people with unwise alethic preferences aren’t irrational in failing to be probabilists, they’re in a bad situation anyway, and (prudentially or epistemically) you don’t want to be one of them. It’s not that we have a prudential justification of probabilism. It’s that there are (perhaps prudential) reasons to be the kind of person such that it’s then epistemically irrational to fail to be a probabilist.

Though on this strategy, prudential/pragmatic considerations are coming into play, they’re not obviously as problematic as in e.g. traditional formulations of Dutch book arguments. For there, the thought was that if you fail to be a probabilist, you’re guaranteed to lose money. So, if you like money, be a probabilist! Here the justification is of the form: your view about the value of truth and accuracy is such-and-such. But you’d be failing to live up to your own preferences unless you are a probabilist. And it’s at a “second order” level, where we explain why it’s sensible to value truth and accuracy in the kind of way that enables the argument to go through, that we appeal to prudential considerations.

Having said all that, I still feel that the case is cleanest for someone thinking of the scoring argument as based on objective betterness. Moreover, there’s a final kind of consideration that can be put forward there, which I can’t see how to replicate on the preference-based version. It turns on what we’re trying to provide in giving a “justification of probabilism”. Is the audience one of sympathetic folk, already willing to grant that violations of probability axioms are pro tanto bad, and simply wanting it explained why this is the case? (NB: the pragmatic nature of the Dutch Book argument makes it as unsatisfying for such folk as it is for anyone else.) Or is the audience one of hostile people, with their own favoured non-probabilistic norms (maybe people who believe in the Dempster-Shafer theory of evidence)? Or is the audience one of people who are suitably agnostic, initially?

This makes quite a big difference. For suppose the task was to explain to the sympathetic folk what grounds the normativity of the probability axioms. Then we can take as a starting point that one (pro tanto) ought not to violate the probability axioms. We can show how objective betterness, if it has the right form, would explain this. We can show that an elegant scoring rule like the Brier score would have the right form, and so provide the explanation. And absent competitors, it looks like we’ve got all the ingredients for a decent inference-to-the-best-explanation for the Brier Score seen as the best candidate for measuring objective (in)accuracy.

Of course, this would cut very little ice with the hostile crowd, who’d be more inclined to tollens away from the Brier score. But even they should appreciate the virtues of being presented with a package deal, with probabilism plus an accuracy/Brier based explanation of what kind of normative force the probability axioms have. If this genuinely enhances the theoretical appeal of probabilism (which I think it does) then the hostile crowd should feel a certain pressure to try to replicate the success—if only to try to win over the neutral.

Of course, the sense in which we have a “justification” of probabilism is very much less than if we could do all the work of underpinning a dominance argument by conceptual analysis, or even pointing to holistic virtues of the needed features. It’s more on the lines of explaining the probabilist point of view, than persuading others to adopt it. But that’s far from nothing.

And even if we only get this, we’ve got all we need for other projects  in which I, at least, am interested. For if, studying the classical case, we can justify Brier as a measure of objective accuracy, then when we turn to generalizations of classicism—non-classical semantics of the kind I’m talking about in the paper—then we run dominance arguments that presuppose the Brier measure of inaccuracy, to argue for analogues of probabilism in the non-classical setting. And I’d be happy if the net result of that paper was the conditional: to the extent that we should be probabilists in the classical setting, we should be analogue-probabilists (in the sense I spell out in the paper) in the non-classical setting. So the modest project isn’t mere self-congratulation on the part of probabilists—it arguably commits them to a range of non-obvious generalizations of probabilism in which plenty of people should be interested.

Of course, if a stronger, more suasive case for the features X can be made, so much the better!

In lieu of new blogposts, I thought I’d post up drafts of two papers I’m working on. They’re both in fairly early stages (in particular, the structure of each needs quite a bit of sorting out). But as they’re fairly techy, I think I’d really benefit from any trouble-shooting people were willing to do!

The first is “Degree supervaluational logic”. This is the kind of treatment of indeterminacy that Edgington has long argued for, and it also features in work from the 70’s by Lewis and Kamp. Weirdly, it isn’t that common, though I think there’s a lot going for it. But it’s arguably implicit in a lot of people’s thinking about supervaluationism. Plenty of people like the idea that the “proportion of sharpenings on which a sentence is true” tells us something pretty important about that sentence, maybe even serving to fix what degree of belief we should have in it. If proportions of sharpenings play this kind of “expert function” role for you, then you’re already a degree-supervaluationist in the sense I’m concerned with, whether or not you want to talk explicitly about “degrees of truth”.

One thing I haven’t seen done is to look systematically at its logic. Now, if we look at a determinacy-operator free object language, the headline news is that everything is classical, and that’s pretty robust under a number of ways of defining “validity”. But it’s familiar from standard supervaluationism that things can become tricky when we throw in determinacy operators. So I look at what happens when we add in things like “it is determinate to degree 0.5 that…” into our object-language. What happens now depends *very much* on how validity is defined. I think there’s a lot to be said for “degree of truth preservation” validity, i.e. the conclusion has to be at least as true as the premises. This is classical in the determinacy-free language. And it’s “supraclassical” even when those operators are present: every classically valid argument is still valid. But in terms of metarules, all hell breaks loose. We get failures of conjunction introduction, for example; and of structural rules such as Cut. Despite this, I think there’s a good deal to be said for the package.

The second paper “Gradational accuracy and non-classical semantics”  is on Joyce’s work on scoring functions. I look at what happens to his 1998 argument for probabilism, when we’ve got non-classical truth-value assignments in play. From what I can see, his argument generalizes very nicely. For each kind of truth-value assignment, we can characterize a set of “coherent” credences, and show that for any incoherent credence there is a single coherent credence which is more accurate than it, no matter what the truth-values turn out to be.

In certain cases, we can relate this to kinds of “belief functions” that are familiar. For example, the class of supervaluationally coherent credences I think can be shown to be Dempster-Shafer belief functions—at least if you define supervaluational “truth values” as I do in the paper.

As I mentioned, there are certainly some loose ends in this work, and I’d be really grateful for any thoughts! I’m going to be presenting something from the degree supervaluational paper at the AAP in July, and also on the agenda is to write up some ideas about the metaphysics of radical interpretation (as a kind of fictionalism about semantics) for the Fictionalism conference in Manchester this September.

[Update: I’ve added an extra section to the gradational accuracy paper, just showing that “coherent credences” for the various kinds of truth-value assignments I discuss satisfy the generalizations of classical probability theory suggested in Brian Weatherson’s 2003 NDJFL paper. The one exception is supervaluationism, where only a weakened version of the final axiom is satisfied—but in that case, we can show that the coherent credences must be Dempster-Shafer functions. So I think that gives us a pretty good handle on the behaviour of non-accuracy-dominated credences for the non-classical case.]

[Update 2: I’ve tightened up some of the initial material on non-classical semantics, and added something on intuitionism, which the generalization seems to cover quite nicely. I’m still thinking that kicking off the whole thing with lists of non-classical semantics ain’t the most digestible/helpful way of presenting the material, but at the moment I just want to make sure that the formal material works.]

Counting delineations

I presented my paper on indeterminacy and conditionals in Konstanz a few days ago. The basic question that paper poses is: if we are highly confident that a conditional is indeterminate, what sorts of confidence in the conditional itself are open to us?

Now, one treatment I’ve been interested in for a while is “degree supervaluationism”. The idea, from the point of view of the semantics, is to replace appeal to a single intended interpretation (with truth=truth at that interpretation) or set of “intended interpretations” (with truth=truth at all of them) with a measure over the set of interpretations (with truth to degree d = being true at exactly measure d of the interpretations). A natural suggestion, given that setting, is that if you know (/are certain) S is true to measure d, then your confidence in S should be d.

I’d been thinking of degree-supervaluationism in this sense, and the more standard set-of-intended-interpretations supervaluationism, as distinct options. But (thanks to Tim Williamson) I realize now that there may be an intermediate option.

Suppose that S = “the number 6 is bleh”. And we know that linguistic conventions settle that numbers <5 are bleh, and numbers >7 are not bleh. The available delineations of “bleh”, among the integers, are ones where the first non-bleh number is 5, 6, 7 or 8. These will count as the “intended interpretations” for a standard supervaluational treatment, so “6 is bleh” will be indeterminate: in this context, neither true nor false.

I’ve discussed in the past several things we could say about rational confidence in this supervaluational setting. But one (descriptive) option I haven’t thought much about is to say that you should proportion your confidence to the number of delineations on which “6 is bleh” comes out true. In the present case, “6 is bleh” is true on two of the four delineations (those where the first non-bleh number is 7 or 8), so our confidence that 6 is bleh should be 0.5. Likewise, our confidence that 5 is bleh should come out 0.75, and our confidence that 7 is bleh should come out 0.25.
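The counting recipe is easy to run by hand, but a short sketch makes the bookkeeping explicit. It assumes, as in the example, that each delineation is fixed by its first non-bleh number (5, 6, 7 or 8) and that "n is bleh" is true on a delineation just when n falls below that cutoff. The names are hypothetical.

```python
from fractions import Fraction

# Each delineation is identified by the first number that is NOT bleh.
CUTOFFS = [5, 6, 7, 8]


def credence_bleh(n, cutoffs=CUTOFFS):
    """Proportion of delineations on which 'n is bleh' comes out true."""
    true_on = sum(1 for cutoff in cutoffs if n < cutoff)
    return Fraction(true_on, len(cutoffs))


for n in (5, 6, 7):
    print(n, credence_bleh(n))
```

Counting this way, 5 is bleh on three of the four delineations, 6 on two, and 7 on one.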

Notice that this *isn’t* the same as degree-supervaluationism. For that just required some measure or other over the space of interpretations. And even if that measure was zero everywhere apart from the interpretations which place the first non-bleh number in 5-8, there are many options available. E.g. we might have a measure that assigns 0.9 to the interpretation which makes 5 the first non-bleh number, and distributes the remaining 0.1 evenly, 0.0333… each, among the other three. In other words, the degree-supervaluationist needn’t think that the measure is a measure *of the number of delineations*. I usually think of it (in the finite case), intuitively, as a measure of the “degree of intendedness” of each interpretation. In a sense, the degree-supervaluationists I was thinking of conceive of the measure as telling us to what extent usage and eligibility and other subvening facts favour one interpretation or another. But the kind of supervaluationists we’re now considering won’t buy into that at all.
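For contrast with simple counting, here is a sketch of how a degree-supervaluationist measure would be used. The particular weights are an assumption for illustration: 9/10 on the delineation whose first non-bleh number is 5, with the remaining 1/10 split evenly over the other three. The function name is hypothetical.

```python
from fractions import Fraction

# Hypothetical measure over the four delineations, keyed by the first
# non-bleh number. The weights are illustrative and must sum to 1.
measure = {
    5: Fraction(9, 10),
    6: Fraction(1, 30),
    7: Fraction(1, 30),
    8: Fraction(1, 30),
}


def degree_of_truth(n, measure):
    """Total measure of the delineations on which 'n is bleh' is true."""
    return sum(weight for cutoff, weight in measure.items() if n < cutoff)


for n in (5, 6, 7):
    print(n, degree_of_truth(n, measure))
```

On this measure "6 is bleh" is true only to degree 1/15, even though it is true on half of the delineations, so the two proposals can come far apart.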

I should mention that even if, descriptively, it’s clear what the proposal here is, it’s less clear how the count-the-delineations supervaluationists would go about justifying the rule for assigning credences that I’m suggesting for them. Maybe the idea is that we should seek some kind of compromise between the credences that would be rational if we took D to be the unique intended interpretation, for each D in our set of “intended interpretations” (see this really interesting discussion of compromise for a model of what we might say; the bits at the end on mushy credence are particularly relevant). And there’ll be some oddities that this kind of theorist will have to adopt, e.g. for a range of cases, they’ll be assigning significant credence to sentences of the form “S and S isn’t true”. I find that odd, but I don’t think it blows the proposal out of the water.

Where might this be useful? Well, suppose you believe in B-theoretic branching time, and are going to “supervaluate” over the various future-branches (so “there will be a sea-battle” will be a truth-value gap, since it is true on some but not all). (This approach originates with Thomason, and is still present, with tweaks, in recent relativistic semantics for branching time.) “Branches” play the role of “interpretations”, in this setting. I’ve argued in previous work that this kind of indeterminacy about branching futures leads to trouble on certain natural “rejectionist” readings of what our attitudes to known indeterminate p should be. But a count-the-branches proposal seems pretty promising here. The idea is that we should proportion our credences in p to the *number* of branches on which p is true.

Of course, there are complicated issues here. Maybe there are just two qualitative possibilities for the future, R and S. We know R has a 2/3 chance of obtaining, and S a 1/3 chance of obtaining. In the B-theoretic branching setting, an R-branch will exist, and an S-branch will exist. Now, one model of the metaphysics at this point is that we don’t allow qualitatively duplicate future branches: so there are just two future-branches in existence, the R one and the S one. On a count-the-branches recipe, we’ll get the result that we should have 1/2 credence that R will obtain. But that conflicts with what the instruction to proportion our credences to the known chances would give us. Maybe R is primitively attached to a “weight” of 2/3, but our count-the-branches recipe didn’t say anything about that.

An alternative is that we multiply indiscernible futures. Maybe there are two indiscernible R futures, and only one S future. Then apportioning the credences in the way mentioned won’t get us into trouble. And in general, if we think that whenever the chance (at moment m) that p is k, the proportion of p-futures among all the futures is k, then we’ll have a recipe that coheres nicely with the principal principle.
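The R/S case can be sketched as follows. The code assumes rational-valued chances and simply builds the smallest collection of branches whose proportions match them; it is a bookkeeping illustration with hypothetical names, not a claim about how the metaphysics settles the number of branches.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def branch_credence(branches, outcome):
    """Count-the-branches credence: the share of branches with this outcome."""
    return Fraction(branches.count(outcome), len(branches))


def branches_matching(chances):
    """Smallest list of branches whose proportions equal the given chances."""
    denominators = [chance.denominator for chance in chances.values()]
    # Least common multiple of the denominators gives the branch total.
    total = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    branches = []
    for outcome, chance in chances.items():
        branches.extend([outcome] * int(chance * total))
    return branches


chances = {"R": Fraction(2, 3), "S": Fraction(1, 3)}
no_duplicates = ["R", "S"]                   # one branch per qualitative possibility
with_duplicates = branches_matching(chances)  # indiscernible R futures allowed

print(branch_credence(no_duplicates, "R"), branch_credence(with_duplicates, "R"))
```

Without duplicates, counting gives 1/2 for R and clashes with the 2/3 chance. With two indiscernible R branches and one S branch, the count and the chance agree.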

Let me be clear that I’m not suggesting that we identify chances with numbers-of-branches. Nor am I suggesting that we’ve got some easy route here for justifying the principal principle. The only thing I want to say is that *if* we’ve got a certain match between chances and numbers of future branches, then two recipes for assigning credences won’t conflict.

(I emphasized earlier that count-the-precisifications supervaluationism had less flexibility than degree-supervaluationism where the relevant measure was unconstrained by counting considerations. In a sense, what the above little discussion highlights is that when we move from “interpretations” to “branches” as the locus of supervaluational indeterminacy, this difference in flexibility evaporates. For in the case where that role is played by actually existing futures, then there’s at least the possibility of multiplying qualitatively indiscernible futures. That sort of maneuver has little place in the original, intended-interpretations settings, since presumably we’ve got an independent fix on what the interpretations are, and we can’t simply postulate that the world gives us intended interpretations in proportions that exactly match the credences we independently want to assign to the cases.)

Indeterminate survival: in draft

So, finally, I’ve got another draft prepared. This is a paper focussing on Bernard Williams’ concerns about how to think and feel about indeterminacy in questions of one’s own survival.

Suppose that you know there’s an individual in the future who’s going to get harmed. Should you invest a small amount of money to alleviate the harm? Should you feel anxious about the harm?

Well, obviously if you care about the guy (or just have a modicum of humanity) you probably should. But if it was *you* that was going to suffer the harm, there’d be a particularly distinctive frisson. From a prudential point of view, you’d be compelled to invest minor funds for great benefit. And you really should have that distinctive first-personal phenomenology associated with anxiety on one’s own behalf. Both of these de se attitudes seem important features of our mental life and evaluations.

The puzzle I take from Williams is: are the distinctively first-personal feelings and expectations appropriate in a case where you know that it’s indeterminate whether you survive as the individual who’s going to suffer?

Williams thought that by reflecting on such questions, we could get an argument against accounts of personal identity that land us with indeterminate cases of survival. I’d like to play the case in a different direction. It seems to me pretty unavoidable that we’ll end up favouring accounts of personal identity that allow for indeterminate cases. So if, when you combine such cases with this or that theory of indeterminacy, you end up saying silly things, I want to take that as a blow to that account of indeterminacy.

It’s not knock-down (what is in philosophy?) but I do think that we can get leverage in this way against rejectionist treatments of indeterminacy, at least as applied to these kinds of cases. Rejectionist treatments include those folks who think that the characteristic attitudes to borderline cases include primarily a rejection of the law of excluded middle; and (probably) those folks who think that in such cases we should reject bivalence, even if LEM itself is retained.

In any case, this is definitely something I’m looking for feedback/comments on (particularly on the material on how to think about rational constraints on emotions, which is rather new territory for me). So thoughts very welcome!

I’m quite tempted by the view that “it is indeterminate that” might be one of those fundamental, brute bits of machinery that go into constructing the world. Imagine, for example, you’re tempted by the thought that in a strong sense the future is “open”, or “unfixed”. Now, maybe one could parlay that into something epistemic (lack of knowledge of what the future is to be), or semantic (indecision over which of the existing branching futures is “the future”) or maybe mere non-existence of the future would capture some of this unfixity thought. But I doubt it. (For discussion of what the openness of the future looks like from this perspective, see Ross and Elizabeth’s forthcoming Phil Studies piece).

The open future is far from the only case you might consider—I go through a range of possible arenas in which one might be friendly to a distinctively metaphysical kind of indeterminacy in this paper—and I think treating “indeterminacy” as a perfectly natural bit of kit is an attractive way to develop that. And, if you’re interested in some further elaboration and defence of this primitivist conception see this piece by Elizabeth and myself—and see also Dave Barnett’s rather different take on a similar idea in a forthcoming piece in AJP (watch out for the terminological clashes–Barnett wants to contrast his view with that of “indeterminists”. I think this is just a different way of deploying the terminology.)

I think everyone should pay more attention to primitivism. It’s a kind of “null” response to the request for an account of indeterminacy, and it’s always interesting to see why the null response is unavailable. I think we’ll learn a lot about what the compulsory questions are that a theory of indeterminacy must answer, from seeing what goes wrong when the theory of indeterminacy is as minimal as you can get.

But here I want to try to formulate a certain kind of objection to primitivism about indeterminacy. Something like this has been floating around in the literature—and in conversations!—for a while (Williamson and Field, in particular, are obvious sources for it). I also think the objection if properly formulated would get at something important that lies behind the reaction of people who claim *just not to understand* what a metaphysical conception of indeterminacy would be. (If people know of references where this kind of idea is dealt with explicitly, then I’d be really glad to know about them).

The starting assumption is: saying “it’s an indeterminate case” is a legitimate answer to the query “is that thing red?”. Contrast the following. If someone asks “is that thing red?” and I say “it’s contingent whether it’s red”, then I haven’t made a legitimate conversational move. The information I’ve given is simply irrelevant to its actual redness.

So it’s a datum that indeterminacy-answers are in some way relevant to redness (or whatever) questions. And it’s not just that “it is indeterminate whether it is red” has “it is red” buried within it – so does the contingency “answer”, but it is patently irrelevant.

So what sort of relevance does it have? Here’s a brief survey of some answers:

(1) Epistemicist. “It’s indeterminate whether p” has the sort of relevance that answering “I don’t know whether p” has. Obviously it’s not directly relevant to the question of whether p, but at least expresses the inability to give a definitive answer.

(2) Rejectionist (like truth-value gap-ers, inc. certain supervaluationists, and LEM-deniers like Field, intuitionists). Answering “it’s indeterminate” communicates information which, if accepted, should lead you to reject both p, and not-p. So it’s clearly relevant, since it tells the inquirer what their attitudes to p itself should be.

(3) Degree theorist (whether degree-supervaluationist like Lewis, Edgington, or degree-functional person like Smith, Machina, etc). Answering “it’s indeterminate” communicates something like the information that p is half-true. And, at least on suitable elaborations of degree theory, we’ll then know how to shape our credences in p itself: we should have credence 0.5 in p if we have credence 1 that p is half true.

(4) Clarification request. (maybe some contextualists?) “it’s indeterminate that p” conveys that somehow the question is ill-posed, or inappropriate. It’s a way of responding whereby we refuse to answer the question as posed, but invite a reformulation. So we’re asking the person who asked “is it red?” to refine their question to something like “is it scarlet?” or “is it reddish?” or “is it at least not blue?” or “does it have wavelength less than such-and-such?”.

(For a while, I think, it was assumed that every serious account of indeterminacy would say that if p was indeterminate, one couldn’t know p (think of the parallel discussion of “minimal” conceptions of vagueness; see Patrick Greenough’s Mind paper). If that was right then (1) would be available to everybody. But I don’t think that that’s at all obvious, and in particular, I don’t think it’s obvious the primitivist would endorse it, or, if they did, what grounds they would have for saying so).

There are two readings of the challenge we should pull apart. One is purely descriptive. What kind of relevance does indeterminacy have, on the primitivist view? The second is justificatory: why does it have that relevance? Both are relevant here, but the first is the most important. Consider the parallel case of chance. There we know what, descriptively, we want the relevance of “there’s a 20% chance that p” to be: someone learning this information should, ceteris paribus, fix their credence in p to 0.2. And there’s a real question about whether a metaphysical primitive account of chance can justify that story (that’s Lewis’s objection to a putative primitivist treatment of chance facts).

The justification challenge is important, and how exactly to formulate a reasonable challenge here will be a controversial matter. E.g. maybe route (4), above, might appeal to the primitivist. Fine—but why is that response the thing that indeterminacy-information should prompt? I can see the outlines of a story if e.g. we were contextualists. But I don’t see what the primitivist should say.

But the more pressing concern right now is that for the primitivist about indeterminacy, we don’t as yet have a helpful answer to the descriptive question. So we’re not even yet in a position to start engaging with the justificatory project. This is what I see as the source of some dissatisfaction with primitivism: the sense that as an account it somehow leaves something important unexplained. Until the theorist has told me something more, I’m at a loss about what to do with the information that p is indeterminate.

Furthermore, at least in certain applications, one’s options on the descriptive question are constrained. Suppose, for example, that you want to say that the future is indeterminate. But you want to allow that one can rationally have different credences for different future events. So I can be 50/50 on whether the sea battle is going to happen tomorrow, and almost certain I’m not about to quantum tunnel through the floor. Clearly, then, nothing like (2) or (3) is going on, where one can read off strong constraints on strength of belief in p from the information that p is indeterminate. (1) doesn’t look like a terribly good model either—especially if you think we can sometimes have knowledge of future facts.

So if you think that the future is primitively unfixed, indeterminate, etc—and friends of mine do—I think (a) you owe a response to the descriptive challenge; (b) then we can start asking about possible justifications for what you say; (c) your choices for (a) are very constrained.

I want to finish up by addressing one response to the kind of questions I’ve been pressing. I ask: what is the relevance of answering “it’s indeterminate” to first-order questions? How should I alter my beliefs in receipt of the information, what does it tell me about the world or the epistemic state of my informant?

You might be tempted to say that your informant communicates, minimally, that it’s at best indeterminate whether she knows that p. Or you might try claiming that in such circumstances it’s indeterminate whether you *should* believe p (i.e. there’s no fact of the matter as to how you should shape your credences on the question of whether p). Arguably, you can derive these from the determinate truth of certain principles (determinacy, truth as the norm of belief, etc) plus a bit of logic. Now, that sort of thing sounds like progress at first glance: even if it doesn’t lay down a recipe for shaping my beliefs, it does sound like it says something relevant to the question of what to do with the information. But I’m not sure that it really helps. After all, we could say exactly parallel things about the “contingency answer” to the redness question with which we began. Saying “it’s contingent that p” does entail that it’s contingent at best whether one knows that p, and contingent at best whether one should believe p. But that obviously doesn’t help vindicate contingency-answers to questions of whether p. So it seems that the kind of indeterminacy-involving elaborations just given, while they may be *true*, don’t really say all that much.

Chancy counterfactuals—three options

I was chatting to Rich Woodward earlier today about Jonathan Bennett’s attitude to counterfactuals about chancy events. I thought I’d put down some of the thoughts I had arising from that conversation.

The basic thought is this. Suppose that, were A to happen, it would be overwhelmingly likely that B, but not probability 1 that B would occur. Take some cup I’m holding: if I were to drop it out the window, it’s overwhelmingly likely that it would fall to the floor and break, rather than shoot off sideways or quantum tunnel through the ground. But (we can suppose) there’s a non-zero, albeit minuscule, chance that the latter things would happen. (You don’t need to go all quantum to get this result: as Adam Elga and Barry Loewer have emphasized recently, if we have counterfactuals about macroevents, the probabilities involved in statistical mechanics also attribute tiny but nonzero probability to similarly odd things happening).

The question is, how should we evaluate the counterfactual “Drop>Break” taking into account the fact that given that Drop, there’d be a non-zero but tiny chance that ~Break?

Let’s take as our starting point a Lewisian account of the counterfactual: “A>B” is to be true (at w) iff B is true at all the closest A-worlds to w. Then the worry many people have is that though the vast majority of closest possible Drop-worlds will be Break worlds, there’ll be a residual tiny minority of worlds where it won’t break, where quantum tunnelling or freaky statistical mechanical possibilities are realized. But since Lewis’s truth-conditions require that Break be true at *all* the closest Drop-worlds, even that tiny minority suffices to make the counterfactual “Drop>Break” false.

As goes “Drop>Break”, so goes almost every ordinary counterfactual you can think of. Almost every counterfactual would be false, if the sketch just given is right. Some people think that’s the right result. We’ll come back to it below.

Lewis’s own response is to deny that the freaky worlds are among the closest worlds. His idea is that freakiness (or as he calls it, the presence of “quasi-miracles”) itself is one of the factors that pushes worlds further away from actuality. That’s been recently criticised by John Hawthorne among others. I’m about to be in print defending a generally Lewisian line on these matters—though the details are different from Lewis’s and (I hope) less susceptible to counterexample.

But if you didn’t take that line, what should you say about the case? A tempting line of thought is to alter Lewis’s clause—requiring not truth at all the closest worlds but truth at most, or the overwhelming majority of them. (Of course, this idea presumes it makes sense to talk of relative proportions of worlds—let’s spot ourselves that).

This has a marked effect on the logic of counterfactuals—in particular, the agglomeration rule (A>B, A>C, therefore A>B&C) would have to go (Hawthorne points this out in his discussion, IIRC). To see how this could happen, suppose that there are 3 closest A-worlds, and X needs to be true at 2 of them in order for “A>X” to be true. Then let the worlds respectively be B&C, ~B&C, ~C&B-worlds. This produces a countermodel to agglomeration.
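The countermodel just described is small enough to check mechanically. Here is a minimal sketch in Python; the representation of worlds as dictionaries and the names `closest_a_worlds`, `THRESHOLD` and `near_miss_true` are my own illustrative assumptions, not anything drawn from Hawthorne or Bennett.

```python
# Hypothetical model: three equally close A-worlds, each listing which of B, C hold there.
closest_a_worlds = [
    {"B": True, "C": True},    # the B&C-world
    {"B": False, "C": True},   # the ~B&C-world
    {"B": True, "C": False},   # the ~C&B-world
]
THRESHOLD = 2  # "A>X" counts as true iff X holds at at least 2 of the 3 worlds


def near_miss_true(consequent):
    """Majority-style clause: true iff enough closest A-worlds satisfy the consequent."""
    hits = sum(1 for world in closest_a_worlds if consequent(world))
    return hits >= THRESHOLD


a_b = near_miss_true(lambda w: w["B"])                   # A>B
a_c = near_miss_true(lambda w: w["C"])                   # A>C
a_b_and_c = near_miss_true(lambda w: w["B"] and w["C"])  # A>(B&C)

print(a_b, a_c, a_b_and_c)  # True True False
```

Both premises of agglomeration come out true and the conclusion false, since B and C each hold at two of the three worlds but hold together at only one.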

Agglomeration strikes me as a bad thing to give up. I’m not sure I have hugely compelling reasons for this, but it seems to me that a big part of the utility of counterfactuals lies in our being able to reason under a counterfactual supposition. Given agglomeration you can start by listing a bunch of counterfactual consequences (X, Y, Z), reason in standard ways (e.g. perhaps X, Y, Z entail Q) and then conclude that, under that counterfactual supposition, Q. This is essentially an inference of the following form:

1. A>X
2. A>Y
3. A>Z
4. X,Y,Z$\models$ Q

Therefore: A>Q.

And in general I think this should be generalized to arbitrarily many premises. If we have that, counterfactual reasoning seems secure.

But agglomeration is just a special case of this, where Q=X&Y&Z (more generally, the conjunction of the various consequents). So if you want to vindicate counterfactual reasoning of the style just mentioned, it seems agglomeration is going to be at the heart of it. I think giving some vindication of this pattern is non-negotiable. To be honest though, it’s not absolutely clear that making it logically valid is obviously required. You might instead try to break this apart into a fairly reliable but ampliative inference from A>X, A>Y, A>Z to A>X&Y&Z, and then appeal to this and the premise X&Y&Z$\models$ Q to reason logically to A>Q. So it’s far from a knock-down argument, but I still reckon it’s on to something. For example, anyone who wants to base a fictionalism on counterfactuals (were the fiction to be true then…) better take an interest in this sort of thing, since on it turns whether we can rely on multi-premise reasoning to preserve truth-according-to-the-fiction.

Jonathan Bennett is one who considers altering the truth clauses in the way just sketched (he calls it the “near miss” proposal, and points out a few tweaks that are needed to ensure e.g. that we don’t get failures of modus ponens). But he advances a second non-Lewisian way of dealing with the above cases.

The idea is to abandon evaluations of counterfactuals being true or false, and simply assign them degrees of goodness. The degree of goodness of a counterfactual “A>B” is equal to the proportion of the closest A worlds that are B worlds.

There are at least two readings of this. One is that we ditch the idea of truth-evaluation of counterfactual conditionals altogether, much as some have suggested we ditch truth-evaluation of indicatives. I take it that Edgington favours something like this, but it’s unclear whether that’s Bennett’s idea. The alternative is that we allow “strict truth” talk for counterfactuals, defined by a strict clause (truth at all the closest worlds), but then think that this strict requirement is never met, and so it’d be pointless to actually evaluate counterfactual utterances by reference to this strict requirement. Rather, we should evaluate them on the sliding scale given by the proportions. Really, this is a kind of error theory, but one supplemented by a substantive and interesting looking account of the assertibility conditions.

Both seem problematic to me. The main issue I have with the idea that we drop truth-talk altogether is the same issue I have with indicative conditionals: I don’t see how to deal with the great variety of embedded contexts in which we find the conditionals (conjunctions, other conditionals, attitude contexts, etc). That’s not going to impress someone who already believes in a probabilistic account of indicative conditionals, I guess, since they’ll have ready to hand a bunch of excuses, paraphrases, and tendencies to bite selected bullets. Really, I just don’t think this will wash. But, anyway, we know this debate.

The other thought is to stick with an unaltered Lewisian account, and accept an error theory. At first, that looks like an advance over the previous proposal, since there’s no problem in generalizing the truth-conditional story about embedded contexts: we just take over the Lewis account wholesale. Now this is something of an advance over a brute error-theory, since we’ve got some positive guidance about the assertibility conditions for simple counterfactuals: they’re good to the extent that B is true in a high proportion of the closest A-worlds. And that will make paradigmatic ordinary counterfactuals like “Drop>Break” overwhelmingly good.

But really I’m not sure this is much of an advance over the Edgington-style picture. Because even though we’ve got a compositional story about truth-conditions, we don’t as yet have an idea about how to plausibly extend the idea of “degrees of goodness” beyond simple counterfactuals.

As an illustration, consider “If I were to own a china cup, then if I were to drop it out the window, it’d break”. Following simple-mindedly the original recipe in the context of this embedded conditional, we’d look for the proportion of closest owning worlds where the counterfactual “Drop>Break” is true. But because of the error-theoretic nature of the current proposal, at none (or incredibly few) of those worlds would the counterfactual be true. But that’s the wrong result: the conditional is highly assertible. So the simple-minded application of the original account goes wrong in this case.

Of course, what you might try to do is to identify the assertibility conditions of “Own>(Drop>Break)” with e.g. “(Own&Drop)>Break”, so reducing the problem of assertibility for this kind of embedding, by way of paraphrase, to one where the recipe gives plausible results. But that’s to adopt the same kind of paraphrase-to-easy-cases strategy that Edgington likes, and if we’re going to have to do that all the time (including in hard cases, like attitude contexts and quantifiers) then I don’t see that a great deal of advance is made by allowing the truth-talk. And I’m just as sceptical as in the Edgington-style case that we’ll actually be able to get enough paraphrases to cover all the data.

There are other, systematic and speculative, approaches you might try. Maybe we should think of non-conditionals as having “degrees of goodness” of 1 or 0, and then quite generally think of the degree of goodness of “A>B” as the expected degree of goodness of B among the closest A-worlds—that is, we look at the closest A-worlds and the degree of goodness of B at each of these, and “average out” to get a single number we can associate with “A>B”. That’d help in the “Own>(Drop>Break)” case—in a sense, instead of looking at the expected truth value of “Drop>Break” among closest Own-worlds, we’d be looking at the expected goodness-value of “Drop>Break” among Own-worlds. (We’d also need to think about how degrees of goodness combine in the case of truth functional compounds of conditionals—and that’s not totally obvious. Jeffrey and Stalnaker have a paper on “Conditionals as Random Variables” which incorporates a proposal something like the above. IIRC, they develop it primarily in connection with indicatives to preserve the equation of conditional probability with the probability of the conditional. That last bit is no part of the ambition here, but in a sense, there’s a similar methodology in play. We’ve got an independent fix for associating degrees with simple conditionals—not the conditional subjective probability as in the indicative case—rather, the degree is fixed by the proportion of closest antecedent worlds where the (non-conditional) consequent holds. In any case, that’s where I’d start looking if I wanted to pursue this line).
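To see how the averaging idea differs from the simple-minded recipe on “Own>(Drop>Break)”, here is a toy sketch in Python. Everything in it is an illustrative assumption of mine: the two Own-worlds, the particular proportions, and the equal weighting of worlds (which is just one way of cashing out “average out”). It is not Bennett’s or Jeffrey and Stalnaker’s formalism.

```python
from fractions import Fraction

# Hypothetical toy model: two equally close Own-worlds. Each records, for its
# own closest Drop-worlds, whether the cup breaks there (True) or not (False).
own_worlds = [
    [True] * 99 + [False],    # 99 of 100 closest Drop-worlds are Break-worlds
    [True] * 999 + [False],   # 999 of 1000 closest Drop-worlds are Break-worlds
]


def goodness_drop_break(drop_worlds):
    """Degree of goodness of "Drop>Break": proportion of Break-worlds."""
    return Fraction(sum(drop_worlds), len(drop_worlds))


def strictly_true(drop_worlds):
    """Strict Lewisian truth: Break holds at all the closest Drop-worlds."""
    return all(drop_worlds)


# Simple-minded recipe: proportion of Own-worlds where "Drop>Break" is strictly true.
simple_minded = Fraction(sum(strictly_true(w) for w in own_worlds), len(own_worlds))

# Averaging recipe: expected degree of goodness of "Drop>Break" across Own-worlds.
expected = sum(goodness_drop_break(w) for w in own_worlds) / len(own_worlds)

print(simple_minded)  # 0
print(expected)       # 1989/2000
```

On these assumptions the simple-minded recipe scores the embedded conditional at 0, while the averaging recipe scores it very high, which is the verdict we wanted.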

Is this sort of idea best combined with the Edgington style “drop truth” line or the error-theoretic evaluation of conditionals? Neither, it seems to me. Just as previously, the compositional semantics based on “truth” seems to do no work at all—the truth value of compounds of conditionals will be simply irrelevant to their degrees of goodness. So it seems like a wheel spinning idly to postulate truth-values as well as these “Degrees of goodness”. But also, it doesn’t seem to me that the proposal fits very well with the spirit of Edgington’s “drop truth” line. For while we’re not running a compositional semantics on truth and falsity, we are running something that looks for all the world like a compositional semantics on degrees of goodness. Indeed, it’s pretty tempting to think of these “degrees of goodness” as degrees of truth—and think that what we’ve really done is replace binary truth-evaluation of counterfactuals with a certain style of degree-theoretic evaluation of them.

So I reckon that there are three reasonably stable approaches. (1) The Lewis-style approach where freaky worlds are further away than they’d otherwise be on account of their freakiness, where the Lewis-logic is maintained and ordinary counterfactuals are true in the familiar sense. (2) The “near miss” approach where logic is revised and ordinary counterfactuals are true in the familiar sense. (3) Then there’s the “degree of goodness” approach, which people might be tempted to think of in the guise of an error theory, or as an extension of the Adams/Edgington-style “no truth value” treatment of indicatives, but which I think will have to end up being something like a degree-theoretic semantics for conditionals, albeit of a somewhat unfamiliar sort.

I suggested earlier that an advantage of the Lewis approach over the “near miss” approach was that agglomeration formed a central part of inferential practice with conditionals. I think this is also an advantage that the Lewis account has over the degree-theoretic approach. How exactly to make this case isn’t clear, since it isn’t altogether obvious what the *logic* of the degree theoretic setting should be. But the crucial point is that “A>X1”… “A>Xn” can all be good to a very high degree, while “A>X1&…&Xn” is good to a very low degree. Unless we restrict ourselves to starting points which are good to degree 1, then we’ll have to be wary of degradation of degree of goodness while reasoning under counterfactual suppositions, just as on the near miss proposal we’d have to be wary of degradation from truth to falsity. So the Lewisian approach I favour is, I think, the only one of the approaches currently on the table which makes classical reasoning under counterfactual suppositions fully secure.
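The degradation worry can be made vivid with a lottery-style toy model, sketched here in Python. The model is my own assumption for illustration: 100 equally weighted closest A-worlds, with each X_i read as “world i is not the one that obtains”.

```python
from fractions import Fraction

N = 100            # hypothetical number of closest A-worlds
worlds = range(N)


def goodness(consequent):
    """Degree of goodness of "A>X": proportion of closest A-worlds where X holds."""
    return Fraction(sum(1 for w in worlds if consequent(w)), N)


# X_i holds at every closest A-world except world i.
single_degrees = [goodness(lambda w, i=i: w != i) for i in range(N)]

# The conjunction X_1 & ... & X_N holds at no closest A-world at all.
conjunction_degree = goodness(lambda w: all(w != i for i in range(N)))

print(min(single_degrees))  # 99/100
print(conjunction_degree)   # 0
```

Each premise is good to degree 99/100, yet the agglomerated conclusion is good to degree 0. So unless every starting point has degree 1, each step of reasoning under the supposition can lose some goodness.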

Probabilities and indeterminacy

I’ve just learned that my paper “Vagueness, Conditionals and Probability” has been accepted for the first formal epistemology festival in Konstanz this summer. It looks like the perfect place for me to get feedback on, and generally learn more about, the issues raised in the paper. So I’m really looking forward to it.

I’m presently presenting some of this work as part of a series of talks at Arche in St Andrews. I’m learning lots here too! One thing that I’ve been thinking about today relates directly to the paper above.

One of the main things I’ve been thinking about is how credences, evidential probability and the like should dovetail with supervaluationism. I’ve written about this a couple of times in the past, so I’ll briefly set out one sort of approach that I’ve been interested in, and then sketch something that just occurred to me today.

The basic question is: what attitude should we take to p, if we are certain that p is indeterminate? Here’s one attractive line of thought. First of all, it’s a familiar thought that logic should impose some rationality constraints on belief. Let’s formulate this minimally as the constraint that, for the rational agent, probability (credence or evidential probability) can never decrease across a valid argument:

$A\models B \Rightarrow p(A)\leq p(B)$

Now take one of the things that supervaluational logics are often taken to imply, where ‘$D$’ is read as ‘it is determinate that’:

$A\models DA$

Then we note that this and the logicality constraint on probabilities entails that

$p(A)\leq p(DA)$

So in particular, if we fully reject A being determinate (e.g. if we fully accept that it’s indeterminate) then the probability of the RHS will be zero, and so by the inequality, the probability of the LHS is zero too. (The particular supervaluational consequence I’m appealing to is controversial, since it follows only in settings which seem inappropriate for modelling higher-order indeterminacy, but we can argue by adding a couple of extra assumptions for the same result in other ways. This’ll do us for now though).

The result is that if we’re fully confident that A is indeterminate, we should have probability zero in both A and in not-A. That’s interesting, since we’re clearly not in Kansas anymore—this result is incompatible with classical probability theory. Hartry Field has argued in the past for the virtues of this result as giving a fix on what indeterminacy is, and I’m inclined to think that it captures something at the heart of at least one way of conceiving of indeterminacy.

Rather than thinking about indeterminate propositions as having point-valued probabilities, one might instead favour a view whereby they get interval values. One version of this can be defined in this setting. For any A, let $u(A)$ be defined to be $1-p(\neg A)$. This quantity—how little one accepts the negation of a proposition—might be thought of as the upper bound of an interval whose lower bound is the probability of A itself. So rather than describe one’s doxastic attitudes to known indeterminate A as being “zero credence” in A, one might prefer the description of them as themselves indeterminate—in a range between zero and 1.

There’s a different way of thinking about supervaluational probabilities, though, which is in direct tension with the above. Start with the thought that at least for supervaluationism conceived as a theory of semantic indecision, there should be no problem with the idea of perfectly sharp classical probabilities defined over a space of possible worlds. The ways the world can be, for this supervaluationist, are each perfectly determinate, so there’s no grounds as yet for departing from orthodoxy.

But we also want to talk about the probabilities of what is expressed by sentences such as “that man is bald” where the terms involved are vague (pick your favourite example if this one won’t do). The supervaluationist thought is that this sentence picks out a sharp proposition only relative to a precisification. What shall we say of the probability of what this sentence expresses? Well, there’s no fact of the matter about what it expresses, but relative to each precisification, it expresses this or that sharp proposition—and in each case our underlying probability measure assigns it a probability.

Just as before, it looks like we have grounds for assigning to sentences, not point-like probability values, but range-like values. The range in question will be a subset of [0,1], and will consist of all the probability-values which some precisification of the claim acquires. Again, we might gloss this as saying that when A is indeterminate, it’s indeterminate what degree of belief we should have in A.

But the two recipes deliver totally utterly different results. Suppose, for example, I introduce a predicate into English, “Teads”, which has two precisifications: on one it applies to all and only coins which land Heads, on the other it applies to all and only coins that land Tails (or not Heads). Consider the claim that the fair coin I’ve just flipped will land Teads. Notice that we can be certain that this sentence will be indeterminate—whichever way the coin lands, Heads or Tails, the claim will be true on one precisification and false on the other.

What would the logic-based argument give us? Since we assign probability 1 to indeterminacy, it’ll say that we should assign probability 0, or a [0,1] interval, to the coin landing Teads.

What would the precisification-based argument give us? Think of the two propositions the claim might express: that the coin will land heads, or that the coin will land tails. Either way, it expresses a proposition that is probability 1/2. So the set of probability values associated with the sentence will be point-like, having value 1/2.

Of course, one might think that the point-like value stands in an interesting relationship to the [0,1] range—namely being its midpoint. But now consider cases where the coin is biased in one way. For example, if the coin is biased to degree 0.8 towards heads, then the story for the logic-based argument will remain the same. But for the precisification-based person the values will change to {0.8,0.2}. So we can’t just read off the values the precisificationist arrives at from what we get from the logic-based argument. Moral: in cases of indeterminacy, thinking of probabilities in the logic-based way wipes out all information other than that the claim in question is indeterminate.
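The contrast between the two recipes on the Teads case can be set out in a few lines of Python. This is only a sketch under the assumptions already in play (the agent is certain that the claim is indeterminate, and Teads has exactly the two precisifications described); the function names are hypothetical.

```python
from fractions import Fraction


def precisification_values(p_heads):
    """Probabilities that "the coin lands Teads" gets across the two precisifications:
    on one it says the coin lands Heads, on the other that it lands Tails."""
    return {p_heads, 1 - p_heads}


def logic_based_interval(p_heads):
    """Interval for an agent certain that the claim A is indeterminate.
    p(A) <= p(DA) = 0 and p(~A) <= p(D~A) = 0, so the lower bound is p(A) = 0 and
    the upper bound is u(A) = 1 - p(~A) = 1. The bias p_heads plays no role."""
    return (Fraction(0), Fraction(1))


fair, biased = Fraction(1, 2), Fraction(4, 5)

print(precisification_values(fair))    # a single value, 1/2
print(precisification_values(biased))  # two values, 4/5 and 1/5
print(logic_based_interval(fair) == logic_based_interval(biased))  # True
```

The precisification-based values track the bias of the coin, while the logic-based interval is the same whatever the bias, which is the sense in which it wipes out all the other information.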

This last observation can form the basis for criticism of supervaluationism in a range of circumstances in which we want to discriminate between attitudes towards equally indeterminate sentences. And *as an argument* I take it seriously. I do think there should be logical constraints on rational credence, and if the logic for supervaluationism is as it’s standardly taken to be, that enforces the result. If we don’t want the result, we need to argue for some other logic. Doing so isn’t cost free, I think: working within the supervaluational setting, bumps tend to arise elsewhere when one tries to do this. So the moral I’d like to draw from the above discussion is that there must be two very different ways of thinking about indeterminacy that both fall under the semantic indecision model. These two conceptions are manifest in the different attitudes towards indeterminacy described above. (This has convinced me, against my own previous prejudices, that there’s something more-than-merely terminological to the question of “whether truth is supertruth”).

But let’s set that aside for now. What I want to do is just note that *within* the supervaluational setting that goes with the logic-based argument and thinks that all indeterminate claims should be rejected, there shouldn’t be any objection to the underlying probability measure mentioned above, and given this, one shouldn’t object to introducing various object-language operators. In particular, let’s consider the following definition:

“P(S)=n” is true on i, w iff the measure of {u: “S” is true on u,i}=n

But it’s pretty clear to see that the (super)truths about this operator will reflect the precisification-based probabilities described earlier. So even if the logic-based argument means that our degree of belief in indeterminate A should be zero, still there will be object-language claims we could read as “P(the coin will land Teads)=1/2” that will be supertrue. (The appropriate moral from the perspective of the theorist in question would be that whatever this operator expresses, it isn’t a notion that can be identified with degree of belief).

If this is right, then arguments that I’m interested in using against certain applications of the “certainty of indeterminacy entails credence zero” position have to be handled with extreme care. So, for example, in the paper mentioned right at the beginning of this post, I appeal to empirical data about folk judgements about the probabilities of conditionals. I was assuming that I could take this data as information on what the folk view about credences of conditionals is.

But if, compatibly with taking the “indeterminacy entails zero credence” view of conditionals, one could have within a language a P-operator which behaves in the ways described above, this isn’t so clear anymore. Explicit probability reports might be reporting on the P-operator, rather than subjective credence. So everything becomes rather delicate and very confusing.

The last few posts have discussed non-classical approaches to indeterminacy.

One of the big stumbling blocks about “folklore” non-classicism, for me, is the suggestion that contradictions (A&~A) be “half true” where A is indeterminate.

Here’s a way of putting a constraint that appeals to me: I’m inclined to think that an ideal agent ought to fully reject such contradictions.

(Actually, I’m not quite as unsympathetic to contradictions as this makes it sound. I’m interested in the dialethic/paraconsistent package. But in that setting, the right thing to say isn’t that A&~A is half-true, but that it’s true (and probably also false). Attitudinally, the ideal agent ought to fully accept it.)

Now the no-interpretation non-classicist has the resources to satisfy this constraint. She can maintain that the ideal degree of belief in A&~A is always 0. Given that:

p(A)+p(B)=p(AvB)+p(A&B)

we have:

p(A)+p(~A)=p(Av~A)

And now, whenever we fail to fully accept Av~A, it will follow that our credences in A and ~A don’t sum to one. That’s the price we pay for continuing to utterly reject contradictions.

The *natural* view in this setting, it seems to me, is that accepting indeterminacy of A corresponds to rejecting Av~A. So someone fully aware that A is indeterminate should fully reject Av~A. (Here and in the above I’m following Field’s “No fact of the matter” presentation of the nonclassicist).

But now consider the folklore nonclassicist, who does take talk of indeterminate propositions being “half true” (or more generally, degree-of-truth talk) seriously. This is the sort of position that the Smith paper cited in the last post explores. The idea there is that indeterminacy corresponds to half-truth, and fully informed ideal agents should set their partial beliefs to match the degree-of-truth of a proposition (e.g. in a 3-valued setting, an indeterminate A should be believed to degree 0.5). [NB: obviously partial beliefs aren’t going to behave like a probability function if truth-functional degrees of truth are taken as an “expert function” for them.]

Given the usual min/max take on how these multiple truth values get settled over conjunction and negation, for the fully informed agent we’ll get p(Av~A) set equal to the degree of truth of Av~A, i.e. 0.5. And exactly the same value will be given to A&~A. So contradictions, far from being rejected, are appropriately given the same doxastic attitude as I assign to “this fair coin will land heads”.

Another way of putting this: the difference between our overall attitude to “the coin will land heads” and “Jim is bald and not bald” only comes out when we consider attitudes to contents in which these are embedded. For example, I fully disbelieve B&~B when B=the coin lands heads; but I half-accept it for B=A&~A. That doesn’t at all ameliorate the implausibility of the initial identification, for me, but it’s something to work with.

In short, the Field-like nonclassicist sets A&~A to 0; and that seems exactly right. Given this and one or two other principles, we get a picture where our confidence in Av~A can take any value—right down to 0; and as flagged before, the probabilities of A and ~A carve up this credence between them, so in the limit where Av~A has probability 0, they take probability 0 too.

But the folklore nonclassicist I’ve been considering, for whom degrees-of-truth are an expert function for degrees-of-belief, has 0.5 as a pivot. For the fully informed, Av~A always exceeds this by exactly the amount that A&~A falls below it—and where A is indeterminate, we assign them all probability 0.5.
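The “pivot” behaviour is easy to verify for the min/max clauses. Below is a minimal Python sketch, assuming the usual degree-theoretic clauses (negation as 1 minus the value, conjunction as min, disjunction as max) and assuming that the fully informed agent’s credence simply equals the degree of truth; the sample values are arbitrary.

```python
from fractions import Fraction


def neg(v):
    """Degree of truth of ~A, given the degree of truth v of A."""
    return 1 - v


def conj(v, w):
    """Degree of truth of a conjunction: the minimum of the conjuncts."""
    return min(v, w)


def disj(v, w):
    """Degree of truth of a disjunction: the maximum of the disjuncts."""
    return max(v, w)


half = Fraction(1, 2)
for a in (Fraction(0), Fraction(1, 4), half, Fraction(3, 4), Fraction(1)):
    excluded_middle = disj(a, neg(a))   # value of Av~A
    contradiction = conj(a, neg(a))     # value of A&~A
    # Av~A exceeds 0.5 by exactly the amount that A&~A falls below it.
    assert excluded_middle - half == half - contradiction
    print(a, excluded_middle, contradiction)
```

In the indeterminate case, where A has value 1/2, both Av~A and A&~A sit exactly on the pivot, so the fully informed agent of this sort gives the contradiction credence 0.5.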

As will be clear, I’m very much on the Fieldian side here (if I were to be a nonclassicist in the first place). It’d be interesting to know whether folklore nonclassicists do in general have a picture about partial beliefs that works as Smith describes. Consistently with taking semantics seriously, they might think of the probability of A as equal to the measure of the set of possibilities where A is perfectly true. And that will always make the probability of A&~A 0 (since it’s never perfectly true); and meet various other of the Fieldian descriptions of the case. What it does put pressure on is the assumption (more common in degree theorists than 3-value theorists perhaps) that we should describe degree-of-truth-0.5 as a way of being “half true”: why, in a situation where we know A is half true, would we be compelled to fully reject it? So it does seem to me that the rhetoric of folklore degree theorists fits a lot better with Smith’s suggestions about how partial beliefs work. And I think it’s objectionable on that account.

[Just a quick update. First observation. To get a fix on the “pivot” view, think of the constraint being that P(A)+P(~A)=1. Then we can see that P(Av~A)=1-P(A&~A), which summarizes the result. Second observation. I mentioned above that something that treated the degrees of truth as an expert function “won’t behave like a probability function”. One reflection of that is that the logic-probability link will be violated, given certain choices for the logic. E.g. suppose we require valid arguments to preserve perfect truth (e.g. we’re working with the K3 logic). Then A&~A will be inconsistent. And, for example, P(A&~A) can be 0.5, while for some unrelated B, P(B) is 0. But in the logic A&~A|-B, so probability has decreased over a valid argument. Likewise if we’re preserving non-perfect-falsity (e.g. we’re working with the LP system). Av~A will then be a validity, but P(Av~A) can be 0.5, yet P(B) be 1. These are for the 3-valued case, but clearly that point generalizes to the analogous definitions of validity in a degree valued setting. One of the tricky things about thinking about the area is that there are lots of choice-points around, and one is the definition of validity. So, for example, one might demand that valid arguments preserve both perfect truth and non-perfect falsity; and then the two arguments above drop away since neither |-Av~A nor A&~A|- on this logic. The generalization to this in the many-valued setting is to demand e-truth preservation for every e. Clearly these logics are far more constrained than the K3 or LP logics, and so there’s a better chance of avoiding violations of the logic-probability link. Whether one gets away with it is another matter.]

Aristotelian indeterminacy and partial beliefs

I’ve just finished a first draft of the second paper of my research leave—title the same as this post. There’s a few different ways to think about this material, but since I hadn’t posted for a while I thought I’d write up something about how it connects with/arises from some earlier concerns of mine.

The paper I’m working on ends up with arguments against standard “Aristotelian” accounts of the open future, and standard supervaluational accounts of vague survival. But one starting point was an abstract question in the philosophy of logic: in what sense is standard supervaluationism supposed to be revisionary? So let’s start there.

The basic result—allegedly—is that while all classical tautologies are supervaluational tautologies, certain classical rules of inference (such as reductio, proof by cases, conditional proof, etc) fail in the supervaluational setting.

Now I’ve argued previously that one might plausibly evade even this basic form of revisionism (while sticking to the “global” consequence relation, which preserves traditional connections between logical consequence and truth-preservation). But I don’t think it’s crazy to think that global supervaluational consequence is in this sense revisionary. I just think that it requires an often-unacknowledged premise about what should count as a logical constant (in particular, whether “Definitely” counts as one). So for now let’s suppose that there are genuine counterexamples to conditional proof and the rest.

The standard move at this point is to declare this revisionism a problem for supervaluationists. Conditional proof, argument by cases: all these are theoretical descriptions of widespread, sensible and entrenched modes of reasoning. It is objectionably revisionary to give them up.

Of course some philosophers quite like logical revisionism, and would want to face down directly the accusation that there’s anything wrong with such revisionism. But there’s a more subtle response available. One can admit that the letter of conditional proof etc. is given up, while pointing out that the pieces of reasoning we normally call “instances of conditional proof” are all covered by supervaluationally valid inference principles. So there’s no piece of inferential practice that’s thrown into doubt by the revisionism of supervaluational consequence: it seems that all that happens is that the theoretical representation of that practice has to take a slightly more subtle form than one might expect (but still quite a neat and elegant one).

One thing I mention in that earlier paper but don’t go into is a different way of drawing out consequences of logical revisionism. Forget inferential practice and the like. Another way in which logic connects with the rest of philosophy is in connection to probability (in the sense of rational credences, or Williamson’s epistemic probabilities, or whatever). As I sketched in a previous post, so long as you accept a basic probability-logic constraint, which says that the probability of a tautology should be 1, and the probability of a contradiction should be 0, then the revisionary supervaluational setting quickly forces you to a non-classical theory of probability: one that allows disjunctions to have probability 1 where each disjunct has probability 0. (Maybe we shouldn’t call such a thing “probability”: I take it that’s terminological).

Folk like Hartry Field have argued, completely independently of this connection to supervaluationism, that this is the right and necessary way to handle probabilities in the context of indeterminacy. I’ve heard others say, and argue, that we want something closer to classicism (maybe tweaked to allow sets of probability functions, etc). And there are Dutch Book arguments to consider in favour of the classical setting (though I think the responses to these from the perspective of non-classical probabilities are quite convincing).

I’ve got the feeling the debate is at a stand-off, at least at this level of generality. I’m particularly unmoved by people swapping intuitions about degrees of belief it is appropriate to have in borderline cases of vague predicates, and the like (NB: I don’t think that Field ever argues from intuition like this, but others do). Sometimes introspection suggests intriguing things (for example, Schiffer makes the interesting suggestion that one’s degree of belief in a conjunction of two vague propositions typically matches one’s degree of belief in the propositions themselves). But I can’t see any real dialectical force here. In my own case, I don’t have robust intuitions about these cases. And if I’m to go on testimonial evidence on others’ intuitions, it’s just too unclear what people are reporting on for me to feel comfortable taking their word for it. I’m worried, for example, they might just be reporting the phenomenological level of confidence they have in the proposition in question: surely that needn’t coincide with one’s degree of belief in the proposition (think of an exam you are highly nervous about, but are fairly certain you will pass… your behaviour may well manifest a high degree of belief, even in the absence of phenomenological trappings of confidence). In paradigm cases of indeterminacy, it’s hard to see how to do better than this.

However, I think in application to particular debates we might be able to make much more progress. Let us suppose that the topic for the day is the open future, construed, minimally, as the claim that while there are definite facts about the past and present, the future is indefinite.

Might we model this indefiniteness supervaluationally? Something like this idea (with possible futures playing the role of precisifications) is pretty widespread, perhaps orthodoxy (among friends of the open future). It’s a feature of MacFarlane’s relativistic take on the open future, for example. Even though he’s not a straightforward supervaluationist, he still has truth-value gaps, and he still treats them in a recognizably supervaluational-style way.

The link between supervaluational consequence and the revisionary behaviour of partial beliefs should now kick in. For if you know with certainty that some P is neither true nor false, we can argue that you should invest no credence at all in P (or in its negation). Likewise, in a framework of evidential probabilities, P gets no evidential probability at all (nor does its negation).

But think what this says in the context of the open future. It’s open which way this fair coin lands: it could be heads, it could be tails. On the “Aristotelian” truth-value conception of this openness, we can know that “the coin will land heads” is gappy. So we should have credence 0 in it, and none of our evidence supports it.

But that’s just silly. This is pretty much a paradigmatic case where we know what partial belief we have and should have in the coin landing heads: one half. And our evidence gives exactly that too. No amount of fancy footwork and messing around with the technicalities of Dempster-Shafer theory leads to a sensible story here, as far as I can see. It’s just plainly the wrong result. (One doesn’t improve matters very much by relaxing the assumptions, e.g. taking the degree of belief in a failure of bivalence in such cases to fall short of one: you can still argue for a clearly incorrect degree of belief in the heads-proposition).

Where does that leave us? Well, you might reject the logic-probability link (I think that’d be a bad idea). Or you might try to argue that supervaluational consequence isn’t revisionary in any sense (I sketched one line of thought in support of this in the paper cited). You might give up on it being indeterminate which way the coin will land—i.e. deny the open future, a reasonably popular option. My own favoured reaction, in moods when I’m feeling sympathetic to the open future, is to go for a treatment of metaphysical indeterminacy where bivalence can continue to hold—my colleague Elizabeth Barnes has been advocating such a framework for a while, and it’s taken a long time for me to come round.

All of these reactions will concede the broader point—that at least in this case, we’ve got an independent grip on what the probabilities should be, and that gives us traction against the Supervaluationist.

I think there are other cases where we can find similar grounds for rejecting the structure of partial beliefs/evidential probabilities that supervaluational logic forces upon us. One is simply a case where empirical data on folk judgements has been collected, in connection with indicative conditionals. I talk about this in some other work in progress here. Another, which I talk about in the current paper, and which I’m particularly interested in, concerns cases of indeterminate survival. The considerations here are much more involved than in the indeterminacy we find in connection to the open future or conditionals. But I think the case against the sort of partial beliefs supervaluationism induces can be made out.

All these results turn on very local issues. None, so far as I can see, generalizes to the case of paradigmatic borderline cases of baldness and the rest. I think that makes the arguments even more interesting: potentially, they can serve as a kind of diagnostic: this style of theory of indeterminacy is suitable over here; that theory over there. That’s a useful thing to have in one’s toolkit.

Degrees of belief and supervaluations

Suppose you’ve got an argument with one premise and one conclusion, and you think it’s valid. Call the premise p and the conclusion q. Plausibly, constraints on rational belief follow: in particular, you can’t rationally have a lesser degree of belief in q than you have in p.

The natural generalization of this to multi-premise cases is that if p1…pn|-q, then your degree of disbelief in q can’t rationally exceed the sum of your degrees of disbelief in the premises.
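To see the multi-premise constraint at work in the classical setting, here is a small sketch. It assumes credences are given by a classical probability function over truth-value assignments (“worlds”), and the particular weights are hypothetical numbers chosen for illustration; “degree of disbelief” is read as 1 minus the degree of belief.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("p1", "p2")
# Worlds are classical truth-value assignments to the atoms,
# generated in the order TT, TF, FT, FF.
WORLDS = [dict(zip(ATOMS, values)) for values in product((True, False), repeat=len(ATOMS))]

def valid(premises, conclusion):
    """Classical validity: no world makes every premise true and the conclusion false."""
    return all(conclusion(w) for w in WORLDS if all(p(w) for p in premises))

def probability(sentence, weights):
    """Total weight of the worlds at which the sentence is true."""
    return sum(weight for world, weight in zip(WORLDS, weights) if sentence(world))

def disbelief(sentence, weights):
    return 1 - probability(sentence, weights)

def respects_constraint(premises, conclusion, weights):
    """Disbelief in the conclusion does not exceed the summed disbelief in the premises."""
    return disbelief(conclusion, weights) <= sum(disbelief(p, weights) for p in premises)

def p1(w):
    return w["p1"]

def p2(w):
    return w["p2"]

def both(w):
    return w["p1"] and w["p2"]

# Hypothetical credences over the worlds TT, TF, FT, FF (they sum to 1).
weights = [Fraction(7, 10), Fraction(1, 10), Fraction(1, 10), Fraction(1, 10)]

print(valid([p1, p2], both))
print(disbelief(both, weights), disbelief(p1, weights) + disbelief(p2, weights))
print(respects_constraint([p1, p2], both, weights))
```

With these weights, the disbelief in the conjunction is 3/10, against a summed disbelief of 4/10 in the two premises, so the constraint is respected even though the conclusion is less credible than either premise taken alone.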

FWIW, there’s a natural generalization to the multi-conclusion case too (a multi-conclusion argument is valid, roughly, if the truth of all the premises secures the truth of at least one conclusion). If p1…pn|-q1…qm, then the sum of your degrees of belief in the conclusions can’t rationally fall short of 1 minus the sum of your degrees of disbelief in the premises. (With a single conclusion, this reduces to the constraint just given.)

What I’m interested in at the moment is to what extent this sort of connection can be extended to non-classical settings. In particular (and connected with the last post) I’m interested in what the supervaluationist should think about all this.

There’s a fundamental choice to be made at the get-go. Do we think that “degrees of belief” in sentences of a vague language can be represented by a standard classical probability function? Or do we need to be a bit more devious?

Let’s take a simple case. Construct the artificial predicate B(x), so that numbers less than 5 satisfy B, and numbers greater than 5 fail to satisfy it. We’ll suppose that it is indeterminate whether 5 itself is B, and that supervaluationism gives the right way to model this.

First observation. It’s generally accepted that for the standard supervaluationist

p &~Det(p)|-absurdity;

Given this and the constraints on rational credence mentioned earlier, we’d have to conclude that my credence in B(5)&~Det(B(5)) must be 0. I have credence 0 in absurdity; and the degree of disbelief in the conclusion of this valid argument (namely 1) must not exceed the degree of disbelief in its premise.

Let’s think that through. Notice that in this case, my credence in ~Det(B(5)) can be taken to be 1. So given minimal assumptions about the logic of credences, my credence in B(5) must be 0.

A parallel argument running from ~B(5)&~Det(~B(5))|-absurdity gives us that my credence in ~B(5) must be 0.

Moreover, supervaluationism entails all classical tautologies. So in particular we have the validity: |-B(5)v~B(5). The standard constraint in this case tells us that rational credence in this disjunction must be 1. And so, we have a disjunction in which we have credence 1, each disjunct of which we have credence 0 in. (Compare the standard observation that supervaluational disjunctions can be non-prime: the disjunction can be true when neither disjunct is).

This is a fairly direct argument that something non-classical has to be going on with the probability calculus. One move at this point is to consider Shafer functions (which I know little about: but see here). Now maybe that works out nicely, maybe it doesn’t. But I find it kinda interesting that the little constraint on validity and credences gets us so quickly into a position where something like this is needed if the constraint is to work. It also gives us a recipe for arguing against standard supervaluationism: argue against the Shafer-function-like behaviour in our degrees of belief, and you’ll ipso facto have an argument against supervaluationism. For this, the probabilistic constraint on validity is needed (as far as I can see): for it’s this that makes the distinctive features mandatory.
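For what it’s worth, the shape of the credences just described (1 in the disjunction, 0 in each disjunct) is easy to exhibit with a textbook Dempster-Shafer belief function. The sketch below is only that: it assumes a two-element frame whose members stand in for the two sharpenings of B, puts all the mass on the whole frame, and computes Bel in the usual way as the total mass of the focal sets contained in an event. The labels and the choice of mass function are illustrative assumptions, not a worked-out account of how the supervaluationist should model things.

```python
from fractions import Fraction

def belief(event, mass):
    """Bel(event): total mass of the non-empty focal sets contained in the event."""
    return sum(m for focal, m in mass.items() if focal and focal <= event)

# Hypothetical labels for the two ways of sharpening B.
B5 = frozenset({"sharpening_where_5_is_B"})
NOT_B5 = frozenset({"sharpening_where_5_is_not_B"})
FRAME = B5 | NOT_B5

# All the mass sits on the whole frame: nothing is committed to either sharpening.
gappy_mass = {FRAME: Fraction(1)}

# For contrast, a classical credence function splits the mass between the two points.
classical_mass = {B5: Fraction(1, 2), NOT_B5: Fraction(1, 2)}

for name, mass in (("gappy", gappy_mass), ("classical", classical_mass)):
    print(name, belief(B5, mass), belief(NOT_B5, mass), belief(FRAME, mass))
```

The first mass function gives belief 0 to each of B(5) and ~B(5) and belief 1 to their disjunction; the second gives the familiar 1/2, 1/2 and 1.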

I’d like to connect this to two other issues I’ve been working on. One is the paper on the logic of supervaluationism cited below. The key thing here is that it raises the prospect of p&~Dp|-absurdity not holding, even for your standard “truth=supertruth” supervaluationist. If that works, the key premise of the argument that forces you to have degree of belief 0 in both an indeterminate sentence ‘p’ and its negation goes missing.

Maybe we can replace it by some other argument. If you read “D” as “it is true that…” as the standard supervaluationist encourages you to, then “p&~Dp” should be read “p&it is not true that p”. And perhaps that sounds to you just like an analytic falsity (it sure sounded to me that way); and analytic falsities are the sorts of things one should paradigmatically have degree of belief 0 in.

But here’s another observation that might give you pause (I owe this point to discussions with Peter Simons and John Hawthorne). Suppose p is indeterminate. Then we have ~Dp&~D~p. And given supervaluationism’s conservativism, we also have pv~p. So by a bit of jiggery-pokery, we’ll get (p&~Dp v ~p&~D~p). But in moods where I’m hyped up thinking that “p&~Dp” is analytically false and terrible, I’m equally worried by this disjunction. But that suggests that the source of my intuitive repulsion here isn’t the sort of thing that the standard supervaluationist should be buying. Of course, the friend of Shafer functions could just say that this is another case where our credence in the disjunction is 1 while our credence in each disjunct is 0. That seems dialectically stable to me: after all, they’ll have *independent* reason for thinking that p&~Dp should have credence 0. All I want to insist is that the “it sounds really terrible” reason for assigning p&~Dp credence 0 looks like it overgeneralizes, and so should be distrusted.

I also think that if we set aside truth-talk, there’s some plausibility in the claim that “p&~Dp” should get non-zero credence. Suppose you’re initially in a mindset where you should be about half-confident of a borderline case. Well, one thing that you absolutely want to say about borderline cases is that they’re neither true nor false. So why shouldn’t you be at least half-confident in the combination of these?

And yet, and yet… there’s the fundamental implausibility of “p&it’s not true that p” (the standard supervaluationist’s reading of “p&~Dp”) having anything other than credence 0. But ex hypothesi, we’ve lost the standard positive argument for that claim. So we’re left, I think, with the bare intuition. But it’s a powerful one, and something needs to be said about it.

Two defensive maneuvers for the standard supervaluationist:

(1) Say that what you’re committed to is just “p& it’s not supertrue that p”. Deny that the ordinary concept of truth can be identified with supertruth (something that, as many have emphasized, is anyway quite plausible given the non-disquotational nature of supertruth). But crucially, don’t seek to replace this with some other gloss on supertruth: just say that supertruth, superfalsity and the gap between them are appropriate successor concepts, and that ordinary truth-talk is appropriate only when we’re ignoring the possibility of the third case. If we disclaim conceptual analysis in this way, then it won’t be appropriate to appeal to intuitions about the English word “true” to kick away independently motivated theoretical claims about supertruth. In particular, we can’t appeal to intuitions to argue that “p&~supertrue that p” should be assigned credence 0. (There’s a question of whether this should be seen as an error-theory about English “truth”-ascriptions. I don’t see it needs to be. It might be that the English word “true” latches on to supertruth because supertruth is what best fits the truth-role. On this model, “true” stands to supertruth as “de-phlogistonated air”, according to some, stands to oxygen. And so this is still a “truth=supertruth” standard supervaluationism.)

(2) The second maneuver is to appeal to supervaluational degrees of truth. Let the degree of supertruth of S be, roughly, the measure of the precisifications on which S is true. S is supertrue simpliciter when it is true on all the precisifications, i.e. measure 1 of the precisifications. If we then identify degrees of supertruth with degrees of truth, the contention that truth is supertruth becomes something that many find independently attractive: that in the context of a degree theory, truth simpliciter should be identified with truth to degree 1. (I think that this tendency has something deeply in common with the temptation (following Unger) to think that nothing can be flatter than a flat thing: nothing can be truer than a true thing. I’ve heard people claim that Unger was right to think that a certain class of adjectives in English work this way).

I think when we understand the supertruth=truth claim in that way, the idea that “p&~true that p” should be something in which we should always have degree of belief 0 loses much of its appeal. After all, compatibly with “p” not being absolutely perfectly true (=true), it might be something that’s almost absolutely perfectly true. And it doesn’t sound bad or uncomfortable to me to think that one should conform one’s credences to the known degree of truth: indeed, that seems to be a natural generalization of the sort of thing that originally motivated our worries.
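Here is a toy version of the degrees-of-supertruth idea, applied to the artificial predicate B from earlier. It is a sketch under assumptions that go beyond anything said above: I treat the domain as the whole numbers, model each precisification as a cutoff (x is B iff x is below the cutoff), allow exactly two admissible cutoffs, and weight them equally. The function names are made up for the example.

```python
from fractions import Fraction

# Each precisification fixes a cutoff: x counts as B iff x < cutoff.
# Cutoff 5 excludes the borderline case 5; cutoff 6 includes it.
# The equal weighting is an assumption made purely for illustration.
MEASURE = {5: Fraction(1, 2), 6: Fraction(1, 2)}

def degree_of_supertruth(sentence):
    """Measure of the precisifications on which the sentence is true."""
    return sum(weight for cutoff, weight in MEASURE.items() if sentence(cutoff))

def is_supertrue(sentence):
    """Supertruth simpliciter: truth on measure 1 of the precisifications."""
    return degree_of_supertruth(sentence) == 1

def b(x):
    """The sentence B(x), evaluated relative to a precisification."""
    return lambda cutoff: x < cutoff

def negation(sentence):
    return lambda cutoff: not sentence(cutoff)

def disjunction(left, right):
    return lambda cutoff: left(cutoff) or right(cutoff)

print(degree_of_supertruth(b(4)))   # a clear case of B
print(degree_of_supertruth(b(5)))   # the borderline case
print(degree_of_supertruth(b(6)))   # a clear non-case
print(is_supertrue(disjunction(b(5), negation(b(5)))))
```

On this toy model B(5) and its negation each come out true to degree 1/2, while the disjunction B(5)v~B(5) is true on every precisification and so is supertrue simpliciter.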

In summary. If you’re a supervaluationist who takes the orthodox line on supervaluational logic, then it looks like there’s a strong case for a non-classical take on what degrees of belief look like. That’s a potentially vulnerable point for the theory. If you’re a (standard, global, truth=supertruth) supervaluationist who’s open to the sort of position I sketch in the paper below, prima facie we can run with a classical take on degrees of belief.

Let me finish off by mentioning a connection between all this and some material on probability and conditionals I’ve been working on recently. I think a pretty strong case can be constructed for thinking that for some conditional sentences S, we should be all-but-certain that S&~DS. But that’s exactly of the form that we’ve been talking about throughout: and here we’ve got *independent* motivation to think that this should be high-probability, not probability zero.

Now, one reaction is to take this as evidence that “D” shouldn’t be understood along standard supervaluationist lines. That was my first reaction too (in fact, I couldn’t see how anyone but the epistemicist could deal with such cases). But now I’m thinking that this may be too hasty. What seems right is that (a) the standard supervaluationist with the Shafer-esque treatment of credences can’t deal with this case. But (b) the standard supervaluationist articulated in one of the ways just sketched shouldn’t think there’s an incompatibility here.

My own preference is to go for the degrees-of-truth explication of all this. Perhaps, once we’ve bought into that, the “truth=degree 1 supertruth” element starts to look less important, and we’ll find other useful things to do with supervaluational degrees of truth (a la Kamp, Lewis, Edgington). But I think the “phlogiston” model of supertruth is just about stable too.

[P.S. Thanks to Daniel Elstein, for a paper today at the CMM seminar which started me thinking again about all this.]