How might we measure the proximity or similarity of two belief states? Suppose they are represented in each case as a function from propositions to real numbers between 0 and 1, representing their respective degrees of belief. Is it possible to find a sensible and formally tractable measure of how similar these two states are?
How might we measure the proximity or similarity of two desire states? Suppose they are represented in each case as a function from propositions to real numbers, representing how desirable the agent finds each proposition being true. Is it possible to find a sensible and formally tractable measure of how similar these two states are?
The TL;DR of what follows is: I think we can find a measure of both (or better, a measure of the proximity of pairs of combined belief-desire states). And this idea of proximity between belief-desire psychologies is key to explaining the force of theoretical rationality constraints (probabilism) and means-end practical rationality constraints (causal expected utility theory). Furthermore, it’s the notion we need to articulate the role of “principles of charity” in metasemantics.
The first question above is one that has arisen prominently in accuracy-first formal epistemology. As the name suggests, the starting point of that project is a measure of the accuracy of belief states. Richard Pettigrew glosses the accuracy of a credence function at a world as its “proximity to the ideal credence at that world” (Accuracy and the laws of credence, p.47). If you buy Pettigrew’s main arguments in chapter 4 of that book for the features belief-proximity must have, then it’s a mathematical consequence that belief-proximity is what’s known as an “additive Bregman divergence”. If you in addition think that the distance from belief b to belief b* is always the same as the distance from belief b* to belief b (i.e. that proximity is symmetric), then one can prove, essentially, that the right way to measure the proximity of belief states Alpha and Beta is by “squared Euclidean distance”: for each proposition, take the difference between the real number representing Alpha’s credence in it and the one representing Beta’s credence in it, square that difference, and sum the results over all propositions.
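In symbols (my notation, not Pettigrew’s), where b_Alpha and b_Beta are the two credence functions:

$$D(b_{\alpha}, b_{\beta}) = \sum_{p} \big(b_{\alpha}(p) - b_{\beta}(p)\big)^2$$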
Now, once you have this measure of proximity to play with, accuracy-firsters like Pettigrew can put it to work in their arguments for the rational constraints on belief. Accuracy of a belief in w is proximity to the ideal belief state in w; if the ideal belief state for an agent x in w is one that matches the truth values of each proposition (“veritism”) then one can extract from the measure of proximity a measure of accuracy, and go on to prove, for example, that a non-probabilistic belief state b will be “accuracy dominated”, i.e. there will be some alternative belief state b* which is *necessarily* more accurate than it.
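Here is a minimal toy illustration of accuracy domination, under assumptions I’ve chosen for illustration (squared Euclidean inaccuracy, and just two exhaustive, exclusive propositions); the example and code are mine, not Pettigrew’s:

```python
# Toy sketch: a non-probabilistic credence function over {p, not-p} is
# accuracy-dominated under squared Euclidean inaccuracy.

def inaccuracy(credences, truth_values):
    """Squared Euclidean distance from the ideal (truth-value) credences."""
    return sum((c - t) ** 2 for c, t in zip(credences, truth_values))

b = (0.3, 0.3)        # credences in p and not-p; sums to 0.6, not probabilistic
b_star = (0.5, 0.5)   # a probabilistic alternative

for world, truths in [("p-world", (1, 0)), ("not-p-world", (0, 1))]:
    print(world, inaccuracy(b, truths), inaccuracy(b_star, truths))
# b_star is strictly less inaccurate at *every* world: 0.5 vs 0.58 at both.
```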
So far, so familiar. I like this way of relating theoretical rational constraints like probabilism to what’s ultimately valuable in belief, namely truth. But I’m also interested in the notion of proximity for other reasons. In particular, when working in metasemantics, I want to think about principles of interpretation that take the following shape:
(I) On the basis of the interpreter’s knowledge of some primary data, and given constraints that tie possible belief states to features of that primary data, the interpreter is in a position to know that the target of interpretation has a belief state within a set C.
(II) The interpreter attributes to the target of interpretation that belief state within C which is closest to belief state m.
To fix ideas: the set C in (I) might arise out of a process of finding a probability-utility pair which rationalizes the target’s choice behaviour (i.e. a pair relative to which, whenever the target chooses among options, the option chosen is the one that maximizes expected utility by their lights). The magnetic belief state “m” in (II) might be the ideal belief state to have, by the interpreter’s lights, given what they know about the target’s evidential setting. Or it might be the belief state the interpreter would have in the target’s evidential setting.
There are lots of refinements we might want to add (allowing m to be non-unique, catering for situations in which there are several elements in C that are tied for closeness to m). We might want to clarify whether (I) and (II) are principles of practical interpretation, somehow mapping the processes or proper outputs of a real-life flesh and blood interpreter, or whether this is intended as a bit of theory of ideal interpretation, carried out on the basis of total “knowledge” of primary facts about the target. But I’ll set all that aside.
The thing I want to highlight is that step (II) of the process above makes essential use of a proximity measure. And it’s pretty plausible that we’re here shopping in the same aisle as the accuracy-first theorists. After all, a truth-maximizing conception of principles of interpretation would naturally want to construe (II) as attributing to the subject the most accurate belief state within the set C, and we’ll get that if we set the “ideal” credence (in a given world) to be the credal state that matches the truth values at that world, in line with Pettigrew’s veritism, and understand proximity in the way Pettigrew encourages us to. Pettigrew in fact defends his characterization of proximity independently of any particular identification of what the ideal credences are. If you were convinced by Pettigrew’s discussion, then even if the “ideal credence” m for the purposes of interpretation is different from the “ideal credence” for the purposes of the most fundamental doxastic evaluation, you’ll still think that the measure of proximity (additive Bregman divergence, and in particular squared Euclidean distance) is relevant in both cases.
That’s the end of the (present) discussion as far as belief goes. I want to turn to an extension of this picture that becomes pressing when we think of it in the context of principles of interpretation. For in the implementations that I am most interested in, what we get out of step (I) is not a set of belief states alone, but a set of belief-desire psychologies—a pairing of credence and utility functions, for example. Now, it’s possible that the second step of interpretation, (II), cares only about what goes on with belief—picking the belief-desire psychologies whose belief component is closest to the truth, to the evidence, or to the belief state component of some other relevantly magnetic psychological state. But the more natural version of this picture wouldn’t simply forget about the desires that are also being attributed. And if it is proximity between belief-desire psychologies in C and the magnetic belief-desire psychology m that is at issue, we are appealing to proximity not between belief states alone, but between combined belief-desire states.
If desire states are represented by a function from propositions to real numbers (degrees of desirability), then there’s clearly a straight formal extension of the above method available to us: if we used squared Euclidean distance as a measure of the proximity or similarity of a pair of belief states, use exactly the same formula for measuring the proximity or similarity of a pair of desire states! But Pettigrew’s arguments for the characteristics which select that measure do not all carry over. In particular, Pettigrew’s most extended discussion is in defence of a “decomposition” assumption that makes essential use of notions (e.g. “well-calibrated counterpart of the belief state”) that have no obvious analogue for belief-desire psychologies.
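Spelled out (again in my notation), the straight generalization measures the proximity of two belief-desire psychologies by summing the two squared Euclidean distances:

$$D\big(\langle b, d\rangle, \langle b^*, d^*\rangle\big) = \sum_{p} \big(b(p) - b^*(p)\big)^2 + \sum_{p} \big(d(p) - d^*(p)\big)^2$$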
Is there anything to be said for the squared Euclidean distance measure of proximity between belief-desire psychologies, in the absence of an analogue of what Pettigrew says in the special case of proximity of belief states? Well, one thing we can note is that since it extends Pettigrew’s measure of the proximity of belief states, it’s consistent with it: the straight generalization is the natural first hypothesis for belief-desire proximity to try, relative to the Pettigrew starting point. What I now want to discuss is a way of getting indirect support for it. What I’ll argue is that it can do work for us analogous to the work it does for the probabilist in the accuracy framework.
To get the accuracy framework off the ground, recall, Pettigrew commits to the identification of an ideal belief state at each world. The ideal belief state at w is the belief state whose level of confidence in each proposition matches the truth value of that proposition at w (1 if the proposition is true, 0 if it is false). To add something similar for desire, instead of truth values, let’s start from a fundamental value function V, which assigns a real number to each world. You can think of the fundamental valuation relation as fixed objectively (the objective goodness of the world in question), fixed objectively relative to an agent (the objective goodness of the world for that agent), as a projection of values embraced by the particular agent, or some kind of mix of the above. Pick your favourite and we’ll move on.
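In my notation, the two raw ingredients at a world w are the ideal (truth-value) credences and the value function over worlds:

$$b^{\mathrm{ideal}}_{w}(p) = \begin{cases} 1 & \text{if } p \text{ is true at } w \\ 0 & \text{otherwise} \end{cases} \qquad\qquad V : W \to \mathbb{R}$$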
I say: the ideal degree of desirability for our agent to attach to the proposition that w is the case is V(w). But what is the ideal degree of desirability for other, less committal propositions? Here’s a very natural thought (I’ll come back to alternatives later): look at which world would come about were p the case, and let the ideal desirability of p be V(w) for that w which p counterfactually implies. (This, by the way, is a proposal that relies heavily on the Stalnakerian idea that, for each world and proposition p, there is a unique closest world where p holds.) So we extend V(w), defined over worlds, to V*(p), defined over propositions, via counterfactual connections of this kind.
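In Stalnakerian notation (my gloss; the world-indexing is mine), where f(p, w) is the unique closest world to w at which p is true:

$$V^{*}_{w}(p) = V\big(f(p, w)\big)$$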
If we have this conception of ideal desires, and also the squared-Euclidean measure of proximity between desire states, then a notion of “distance from the ideal desire state” drops out. Call this measure the misalignment of a desire state. If we have the squared-Euclidean measure of proximity between combined belief-desire states, then what drops out is a notion of “distance from the ideal belief-desire state”, which is simply the sum of the inaccuracy of its belief component and the misalignment of its desire component.
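That is (my rendering):

$$\mathrm{misalignment}_{w}(d) = \sum_{p} \big(d(p) - V^{*}_{w}(p)\big)^2, \qquad \mathrm{distance}_{w}\big(\langle b, d\rangle\big) = \mathrm{inaccuracy}_{w}(b) + \mathrm{misalignment}_{w}(d)$$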
The fundamental result that accuracy-firsters point to as a vindication of probabilism is this: unless a belief state b is a convex combination of truth values (i.e. a probability function), there will be a b* which is necessarily more accurate than b. In this setting, the same underlying result (as far as I can see—there are a few nice details about finitude and boundedness to sort out) delivers this: unless the belief-desire state <b,d> is a convex combination over w of vectors of the form <truth-value at w, V-value at w>, there will be some alternative psychology <b*,d*> which is necessarily closer to the ideal psychology (more accurate-and-aligned) than <b,d> is.
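Here is a toy numerical sketch of that domination claim, under assumptions I’ve chosen for illustration (two worlds, V(w1) = 1, V(w2) = 0, with w1 the closest p-world and w2 the closest not-p-world from either world, so the ideal desirabilities are the same at both worlds):

```python
# Represent a psychology as a vector (b(p), b(not-p), d(p), d(not-p)).

def sq_dist(x, y):
    """Squared Euclidean distance between two psychology vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

ideal_at_w1 = (1, 0, 1, 0)   # truth values at w1, then ideal desirabilities
ideal_at_w2 = (0, 1, 1, 0)   # truth values at w2, same ideal desirabilities

psych = (0.3, 0.3, 0.5, 0.5)       # NOT a convex combination of the ideals
psych_star = (0.5, 0.5, 1.0, 0.0)  # its projection onto their convex hull

for ideal in (ideal_at_w1, ideal_at_w2):
    print(sq_dist(psych, ideal), sq_dist(psych_star, ideal))
# psych_star is strictly closer to the ideal at both worlds: 0.5 vs 1.08.
```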
What must undominated belief-desire psychologies be like? We know they must be convex combinations of <truth value at w, V-value at w> pairs for varying w. The b component will then be a convex combination of truth values with weights k(w), i.e. a probability function that invests credence k(w) in w. More generally, both the b and d components are expectations of random variables with weights k(w): b(p) will be the expectation of the indicator random variable for proposition p, and d(p) the expectation of the value-of-p random variable. The expectation of the value-of-p random variable turns out to equal the sum, over all possible values k, of k multiplied by the agent’s degree of belief in the counterfactual conditional “if p had been the case, then the value of the world would have been k”. And that, in effect, is Gibbard-Harper’s version of causal decision theory.
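In symbols (my rendering, writing the counterfactual conditional as a boxarrow):

$$d(p) = \sum_{k} k \cdot b\big(p \,\square\!\rightarrow\, (V = k)\big)$$

which is exactly the Gibbard-Harper expected utility of p.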
If the sketch above is correct, then measuring proximity of whole psychologies by squared Euclidean distance (or more generally, by an additive Bregman divergence) will afford a combined accuracy-domination argument for probabilism and value-domination argument for causal decision theory. That’s nice!
Notice that there’s some obvious modularity in the argument. I already noted that we could treat V(w) as objective, relativized or subjective value. Further, we get the particular Gibbard-Harper form of causal decision theory because we extended the ideal V over worlds to the ideal V* over propositions via counterfactual conditionals concerning which world would obtain if p were the case. If instead we defined the ideal V* of p as the weighted average of the values of worlds, weighted by the conditional chance of each world obtaining given p, then we’d end up with an expected-chance formulation of causal decision theory. If we defined the ideal V* of p via combinations of counterfactuals about chance, we would derive a Lewisian formulation of causal decision theory. And if we reinterpret the conditional in the formulation given above as an indicative conditional, we get a variant of evidential decision theory, one that coincides with Jeffrey’s evidential decision theory only if the probability of the relevant conditional is always equal to the corresponding conditional probability (a thesis that is famously problematic).
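For instance, the expected-chance alternative would look like this (my rendering, with ch the relevant conditional chance function):

$$V^{*}_{w}(p) = \sum_{w'} \mathrm{ch}(w' \mid p) \cdot V(w')$$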
Okay, so let’s sum up. What has happened here is that, for the purposes of formulating a particular kind of metasemantics of belief and desire, we need a notion of proximity of whole belief-desire psychologies to one another. Now, Pettigrew has explicitly argued for a specific way of measuring proximity for the belief side of the psychology. The natural thing to do is to extend his arguments to descriptions of psychological states including desires as well as beliefs. But unfortunately, his arguments for that specific way of handling proximity look too tied to belief. However, we can provide a more indirect, abductive argument for the straight generalization of this way of measuring proximity over belief-desire psychologies by (a) endorsing Pettigrew’s arguments for the special case of belief; and (b) noting that the straight generalization would provide a uniform vindication of both probabilism and standard rational requirements on desire as well as belief.
This, at least, makes me feel that I should be pretty comfortable appealing to a notion of “proximity to the magnetic belief-desire state m” in formulating metasemantics in the style above, and measuring this by squared Euclidean distance—at least insofar as I am bought into the conception of accuracy that Pettigrew sketches.
Let me make one final note. I’ve been talking throughout as if we all understood what real-valued “degrees of desire” are. And the truth is, I believe I do understand this. I think that I have neutral desire for some worlds/propositions, positive desire for others, and negative desire for others still. I think that we can measure and compare the gap between the desirability of two propositions—the difference between the desirability of eating cake and eating mud is much greater than the difference between the desirability of overnight oats and porridge. I think there are facts of the matter about whether you and I desire the same proposition equally, or whether I desire it more than you, or you desire it more than me.
But famously, some are baffled by interpersonal comparisons of utility, or by features of the utility scale of the kind I instinctively like. If you think of attributing utility as all about finding representations that vindicate choice behaviour, interpersonal comparisons will be as weird as the idea of an interpersonal choice. The whole project of measuring proximity between desirability states via functions on their representations as real values might then look like a weird starting point. If you google the literature on similarity measures for utility, you’ll find a lot of work on similarity of preference orderings, e.g. measures that count how many reversals it takes to turn one ordering into another. You might think this is a much less controversial starting point than mine, and that I need to do a whole heap more work to earn the right to my starting point.
I think the boot is on the other foot. The mental metasemantics in which I aim to deploy this notion of proximity denies that all there is to attributing utility is finding a representation that vindicates the agent’s choice behaviour. That’s step (I), but step (II) goes beyond this to play favourites among the set of vindicatory psychological states. By the same token, the mental metasemantics sketched grounds interpersonal comparisons of desirability between agents, by way of facts about the proximity of each agent’s psychology to the magnetic psychological state m.
There’s a kind of dialectical stalemate here. If interpersonal comparisons are a busted flush, the prospects look dim for any proximity measure of the kind I’m after here (i.e. one that extends the proximity implicit in the accuracy-first framework). If, however, the kind of proximity measures I’ve been discussing make sense, then we can use them to ground the real-valued representations of agents’ psychological states that make interpersonal comparisons possible. I don’t think either I or my more traditional operationalizing opponent should be throwing shade at the other at this stage of development; rather, each should be allowed to develop their overall account of rational psychology, and at the end of the process we can come back and compare notes about whose starting assumptions were ultimately more fruitful.