Archive for the ‘An introduction to decision theory’ Category

Causal decision theory

This is part of the sequence, “An introduction to decision theory”. It is not designed to make sense as a standalone post.

The Autoverse was a ‘toy’ universe, a computer model which obeyed its own simplified ‘laws of physics’ – laws far easier to deal with mathematically than the equations of real-world quantum mechanics. Atoms could exist in this stylized universe, but they were subtly different from their real-world counterparts; the Autoverse was no more a faithful simulation of the real world than the game of chess was a faithful simulation of medieval warfare. It was far more insidious than chess, though, in the eyes of many real world chemists. The false chemistry it supported was too rich, too complex, too seductive by far.

Greg Egan, Permutation City

Imagine you’ve created a simulation rich enough that evolution can take place within it. But the environment is not a normal environment – it’s an environment designed to evolve new decision theories. The selective pressures are different decision scenarios which exist in different combinations in different parts of the world. On Earth, selective advantage comes about through the ability to reproduce. What could fill the same role in this simulation? The traditional answer in philosophy is rationality. Decision theory is intended to be a theory not of how an agent should decide so as to thrive but so as to be rational. Whether there should be a distinction between the two and how we should think about rationality will be the topic of a future section of this sequence but in this section I intend to discuss the standard debate of decision theory. The background.

So what determines rationality? Well, traditional philosophers may come up with criteria, but they’re happy to abandon them if the criteria seem to fail to capture rationality in some circumstance. That means that underlying the explicit criteria is an implicit appeal to intuition: we will know rationality when we see it. (It may not come as a surprise, then, that which decision counts as rational is debated for many decision scenarios.)

These are issues for later though. For now, imagine that our intuitions are more coherent and have been fed to the simulation. An agent that decides rationally finds that they are more likely to reproduce.

Evolution can’t survive without variation and mutation. So what is varying and mutating in this simulation? Take the original formula we discussed for expected utility:

Expected \ Utility (Decision) =\sum_{i}Probability(WorldState_{i})\times Utility(WorldState_{i}\ \wedge \ Decision )

The utility section of this formula is fairly uncontroversial. Different decision theories involve different probabilities playing a role in this formula. So that is what will vary and mutate in the simulation: the probability term used here – different probabilities will be used which capture different relationships between the decision and the world state.

The simulation will start with naive decision theory which, having at least some coherence, will survive and spread widely enough for mutations to begin to develop. Many of these will be evolutionary dead ends. Maybe one will refer not to the probability of the world state alone but instead the probability of the decision alone – the formula will now ask what the probability is that the agent will decide a certain way. But the formula is designed to determine the decision the agent makes so each time it seems to reach a decision a new probability will be determined for the decision and the formula will have to be recalculated. Even if it eventually finds an equilibrium point, there’s no reason to believe that this point will identify the rational decision. Such mutations will rapidly die out.

Others will be more useful. Eventually, the evidential decision theory that we discussed in the last post will come into being. Presuming the environment it develops in is one which includes decision scenarios where the world state depends on the decision, it will take over from naive decision theory in these areas.

Elsewhere, however, another decision theory will evolve to deal with this same problem: causal decision theory. While evidential decision theory made use of conditional probabilities, causal decision theory makes use of the probability of a particular type of conditional. In other words, its probability term looks something like this (traditionally a box with an arrow coming out of it represents the relationship but WordPress doesn’t render that symbol so I’m using the standard arrow):

Probability(Decision \rightarrow WorldState_{i})

This represents the probability of the subjunctive conditional relating the decision to the world state. A subjunctive conditional can also be thought of as a counterfactual conditional capturing the statement “If I were to make this decision then the world would be in this state”. Causal decision theory then considers the probability of this statement being true.

Another way of thinking of this is that while evidential decision theory asks how the decision and the world state are correlated, causal decision theory is interested only in causation between the decision and the world state. This will also resolve the difficulties faced by naive decision theory. If an agent is deciding whether to enter an air raid shelter during a bombing raid, they will note that doing so will cause them to be more likely to survive and so they will take shelter. If naive decision theory, on the other hand, divides the world into the states where it survives and where it dies, it will fail to realise that its decision can change the probabilities of the world being each of these ways. So it won’t bother with the hassle of sheltering.
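The air raid example can be sketched in code. This is only a minimal illustration: the text gives no numbers for this scenario, so every utility and probability below is a hypothetical value chosen for the sketch, as are the names `NAIVE_P` and `CAUSAL_P`.

```python
# ASSUMPTION: all numbers are hypothetical; the text gives none for this example.
# UTILITY[decision][world_state]; sheltering costs a little "hassle".
UTILITY = {
    "take shelter": {"survive": 9, "die": -100},
    "stay outside": {"survive": 10, "die": -100},
}

# Naive decision theory: one probability per world state, ignoring the decision.
NAIVE_P = {"survive": 0.95, "die": 0.05}

# Causal decision theory: Probability(Decision -> WorldState), read as
# "if I were to make this decision, the world would be in this state".
CAUSAL_P = {
    "take shelter": {"survive": 0.99, "die": 0.01},
    "stay outside": {"survive": 0.90, "die": 0.10},
}


def naive_expected_utility(decision):
    """Weight each utility by a decision-independent state probability."""
    return sum(NAIVE_P[state] * UTILITY[decision][state] for state in NAIVE_P)


def causal_expected_utility(decision):
    """Weight each utility by the probability of the subjunctive conditional."""
    probabilities = CAUSAL_P[decision]
    return sum(probabilities[state] * UTILITY[decision][state] for state in probabilities)


naive_choice = max(UTILITY, key=naive_expected_utility)
causal_choice = max(UTILITY, key=causal_expected_utility)
print("naive decision theory chooses:", naive_choice)
print("causal decision theory chooses:", causal_choice)
```

With these made-up numbers the naive agent stays outside, because sheltering only ever costs it the hassle, while the causal agent takes shelter. The sketch only shows where the probability term plugs in; how an agent would come by the counterfactual probabilities is not modelled here.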

So causal decision theory also begins to establish a foothold in the world. The next post will ask the question: what happens when causal and evidential decision theory meet in the same area of the simulation?

Evidential decision theory

This is part of the sequence, “An introduction to decision theory”. It is not designed to make sense as a standalone post.

You have rebuilt the artificial creature and now you’ve placed it back in the false environment. For a while, it sits and simply gathers data about the world in the form of probabilistic relationships. It observes ten creatures going to a substantial patch of food; eight of them get picked off by predators. It observes ten creatures going to a less substantial but more sheltered patch of food; none of them get eaten. It begins to form opinions about probabilities relating the predators and the two patches.

Eventually though, it grows hungry and has to choose which patch to go to. The previous incarnation of the creature just divided the world into the possibilities it survived and the possibilities it didn’t, without taking into account the effect of its decisions on its chances of survival. This new version of the creature will not fall into the same trap.

There are two ways it could have been programmed to avoid this. It could have been given a causal decision theory so it would ask what its actions were likely to cause (would going to this patch of food cause me to be more likely to die?). Causation is difficult though. The creature watches and takes in probabilistic information but this isn’t enough – there are many occasions where purely probabilistic information isn’t enough to identify a single causal structure for the world. Add in temporal information and more cases can be distinguished. However, an unobserved third variable might be responsible for the probabilistic relationships and even temporality won’t always help to distinguish this.

Causation is difficult and if instead you could read off useful information directly from just the probabilistic information, surely that would be preferred. Correlation, it turns out, is much easier to figure out. And correlation is what fuels evidential decision theory (EDT).  Naive evidential decision theory (we’ll discuss more sophisticated versions in later posts) simply looks at the correlations between the decision and the state of the world. It seems that its simple nature may give it an initial benefit as a decision theory – if evidential decision theory can do all that causal decision theory can, then, the argument goes, it wins because it has less conceptual baggage.

So how does EDT capture correlation? Let’s look at the original equation for expected utility:

Expected \ Utility (Decision) =\sum_{i}Probability(WorldState_{i})\times Utility(WorldState_{i}\ \wedge \ Decision )

In the previous post, we realised that the probability of the world state can’t be treated as being independent of the decision. In other words, the following term of the equation needs to be changed:

Probability(WorldState_{i})

Evidential decision theory does this by replacing it with the probability of the world state given the decision, where P(A \ \mid \ B) means the probability of A given B:

Probability(WorldState_{i} \ \mid \ Decision)

So evidential decision theory calls for an agent to make the decision which maximises the following formula:

Expected \ Utility (Decision) =\sum_{i}Probability(WorldState_{i} \ \mid \ Decision)\times Utility(WorldState_{i}\ \wedge \ Decision )

How does this work? Well, think of the patches of food. Given that the relationship used in this equation is simple correlation, the agent can easily work out that the probability of being eaten given the decision to go to the more substantial patch is much higher than the probability of being eaten given the choice to go to the less substantial patch. So it will choose to go to the less substantial patch.

To put the numbers in, we need some probabilities. Presuming the agent drew these from its earlier observations, the probability of death given the substantial patch is 0.8 (eight out of ten creatures were eaten). On the other hand, the probability of death given the less substantial patch is 0 (none of the ten creatures were eaten).
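As a rough sketch of how the creature might read these conditional probabilities off its observations, the snippet below simply divides the number eaten by the number observed at each patch. The counts come from the story above; the names `OBSERVATIONS` and `p_death_given` are illustrative only.

```python
# Observed counts from the text: (creatures seen going to the patch, creatures eaten).
OBSERVATIONS = {
    "substantial patch": (10, 8),
    "less substantial patch": (10, 0),
}


def p_death_given(decision):
    """Estimate Probability(Death | Decision) as a simple observed frequency."""
    visits, eaten = OBSERVATIONS[decision]
    return eaten / visits


for decision in OBSERVATIONS:
    print(decision, "-> P(death | decision) =", p_death_given(decision))
```

A raw frequency from ten observations is a crude estimate, but it is all the creature has to go on in this scenario.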

However, while we’ve been focusing on the probabilities in the last few posts (as these are the issue of the debate we’re exploring), the equation above does also mention another factor – the utility received from the combination of a world state and a decision. In a previous post we represented that in this utility table:

Decision               | Death | Survival
Substantial patch      | -10   | 5
Less substantial patch | -10   | 2

Now we have all the information we need to do the necessary calculations.
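Worked through with the numbers above, the calculation runs roughly as follows. This is a reconstruction from the stated probabilities and the utility table, taking each survival probability to be one minus the corresponding probability of death.

```latex
\begin{aligned}
Expected \ Utility(Substantial)
  &= Probability(Death \mid Substantial) \times Utility(Death \wedge Substantial) \\
  &\quad + Probability(Survival \mid Substantial) \times Utility(Survival \wedge Substantial) \\
  &= 0.8 \times (-10) + 0.2 \times 5 = -7 \\[4pt]
Expected \ Utility(Less \ substantial)
  &= Probability(Death \mid Less \ substantial) \times Utility(Death \wedge Less \ substantial) \\
  &\quad + Probability(Survival \mid Less \ substantial) \times Utility(Survival \wedge Less \ substantial) \\
  &= 0 \times (-10) + 1 \times 2 = 2
\end{aligned}
```

Since 2 is greater than -7, the less substantial patch has the higher expected utility.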

So evidential decision theory reaches the correct decision in the case facing our agent and it does so without relying on any complex causal apparatus. The next few posts will explore how causal decision theory reaches the same decision in at least this instance. The question that will then be asked is, does causal decision theory have advantages such that taking on the extra causal baggage is worthwhile?

A problem with naive decision theory

This is part of the sequence, “An introduction to decision theory”. It is not designed to make sense as a standalone post.

You watch the creature hesitate, processing the best data it can gather about the environment around it. You hold your breath. The last ten years of your life have been spent as part of the team programming this artificial creature – so simple on the face of it, yet it took such a level of complexity to equal even this simple achievement of evolution. The creature is now faced with its first decision: one patch of food is more tempting but less sheltered from predators. Another is less tempting but well sheltered. In this setup, predators are so prevalent that the sensible decision is to go for the more sheltered food.

Finally, the agent makes its decision. Seconds later it is caught by a predator. You sigh and download the log. Time to figure out what went wrong. You see it straight away. Before the creature could make its decision, it first needed to build up a utility table. You had expected it to build up the following table.

Decision               | Predators present | Predators absent
Substantial patch      | -10               | 5
Less substantial patch | 2                 | 2

Instead, it had built up the following utility table:

Decision               | Death | Survival
Substantial patch      | -10   | 5
Less substantial patch | -10   | 2

There’s nothing wrong with this second utility table. Dying is always worth -10. Dying in either patch of food is worth the same disutility. Survival is indeed worth more in the substantial patch of food because it’s a preferable location. However, the point is that the creature is more likely to survive if it heads for the less substantial patch. In other words, the probability of the world state depends on the decision.

However, if we look at the formula for expected utility that we discussed last week, we can see that the probability of the state of the world doesn’t take into account the decision at all.

Expected \ Utility (Decision) =\sum_{i}Probability(WorldState_{i})\times Utility(WorldState_{i}\ \wedge \ Decision )

The term to look at is:

Probability(WorldState_{i})

This fails to take into account the fact that the decision can influence the world state (which patch the creature chooses influences whether it is likely to survive). This equation is unable to handle this model of reality. So what equation should you reprogram into your artificial creature? This issue is the focus of one of the principal debates in decision theory.
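To see why the creature ended up in the exposed patch, here is a small sketch that applies the decision-independent formula to the Death/Survival table. The utilities are the ones in the table; the probabilities of death looped over are hypothetical, chosen only to show that the result does not depend on them.

```python
# Utility table the creature actually built: UTILITY[decision][world_state].
UTILITY = {
    "substantial patch": {"death": -10, "survival": 5},
    "less substantial patch": {"death": -10, "survival": 2},
}


def naive_expected_utility(decision, p_death):
    """Expected utility when the same state probabilities are used for every decision."""
    probabilities = {"death": p_death, "survival": 1 - p_death}
    return sum(probabilities[state] * UTILITY[decision][state] for state in probabilities)


# ASSUMPTION: these probabilities of death are hypothetical illustration values.
for p_death in (0.0, 0.25, 0.5, 0.75, 1.0):
    substantial = naive_expected_utility("substantial patch", p_death)
    less = naive_expected_utility("less substantial patch", p_death)
    print(f"P(death)={p_death}: substantial={substantial}, less substantial={less}")
```

Whatever single probability of death is plugged in, the substantial patch never scores lower, because the two rows differ only in the survival column (the gap is 3 times the probability of survival). So an agent using this formula with this table will head for the substantial patch however dangerous it is.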

Historically, the debate was between two main decision theories: evidential decision theory and causal decision theory. Broadly, evidential decision theory says that an agent should ask what evidence a decision provides about the world state. So the creature heading toward the sheltered patch of food would provide evidence that the creature is more likely to survive. Causal decision theory, on the other hand, says that an agent should ask what the causal influence of the decision is on the world state. So heading for the sheltered patch causes the creature to be more likely to survive.

The next few posts will outline both of these theories in more detail.

Sequence index: An introduction to decision theory

What is decision theory?

This is part of the sequence, “An introduction to decision theory”. While I have previously written about decision theory in relation to the views of a specific online community, this will be a broader and deeper introduction to decision theory starting from the basics and moving to more recent issues under discussion.

At some point in the history of the universe, a decision had never been made. The entire history of the universe to that point had just unfurled quietly without a single choice being made. The first warning sign that this era was over was the beginning of life. However, early living things would have floated or stayed still or done whatever the world told them to. They would not have intervened in the world or pondered the rational response to the environment. Not only could they not decide but, if they could have, they would still have been unable to impose their will on the world.

Then a form of life developed that was able to interact with the world. Maybe it gained the power of locomotion. Maybe it gained the power to cling – to decide when not to be moved by the elements. Maybe it gained any number of abilities but, for whatever reason, suddenly it was able to intervene in the world. And a new question came about: how should it act so as to achieve this? This first intervener would have had few cognitive tools to process this question.

The next development is the most important one to our story. Not first life. Not first intervention. But first decision. A form of life that could not only intervene in the world but could decide how to do so. Life that could choose where to move and when to move. And the question became more important: how should one best take advantage of this ability to decide?

This sequence will explore decision theory, one attempt to answer this question, at least in the abstract.

So imagine then a creature – maybe not the first decision maker but one of its descendants. This is a simple creature that gains its energy by eating algae that floats in the water and that survives by both eating enough and avoiding being eaten by its predators. This creature is faced with a decision: to its left there is a substantial patch of algae that would feed it comfortably for some time. To its right, there is a less substantial patch that it could nevertheless survive on, albeit not for as long. The creature must decide which patch to approach.

However, here’s the complication: the more substantial patch of algae is in an area that would be more exposed to predators if they were around. The less substantial patch is in a more sheltered area. In other words, if predators are likely to be around then the creature would be better going to the less substantial patch. If they are unlikely to be around, it is better going to the more substantial patch. This allows us to introduce our first tool from decision theory: the utility table.

Decision               | Predators present | Predators absent
Substantial patch      | -10               | 5
Less substantial patch | 2                 | 2

This utility table contains almost everything that decision theory uses to determine the rational decision. Across the top are the possible states of the world (the world can either be such that the predators are present or absent). Along the side are the possible decisions (the creature can head for either the substantial or the less substantial patch). The table cells then contain a utility value for each possible combination of a decision and world state (so if the creature headed to the substantial patch and predators were present then this would be worth a utility of -10). The utility value is simply a measure of how much the decision maker values the outcome. From this table, the rational decision can be determined given the state of the world. So if the predators are present, then the less substantial patch is clearly the best option.
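As a minimal sketch, the table can be written as a nested mapping and the best decision for a known world state read straight off it. The utilities are those in the table; the names `UTILITY` and `best_decision_given_state` are illustrative only.

```python
# Utility table from the text: UTILITY[decision][world_state].
UTILITY = {
    "substantial patch": {"predators present": -10, "predators absent": 5},
    "less substantial patch": {"predators present": 2, "predators absent": 2},
}


def best_decision_given_state(world_state):
    """Return the decision with the highest utility when the world state is known."""
    return max(UTILITY, key=lambda decision: UTILITY[decision][world_state])


for state in ("predators present", "predators absent"):
    print(state, "->", best_decision_given_state(state))
```

With predators present this picks the less substantial patch (2 beats -10), and with predators absent it picks the substantial patch (5 beats 2).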

Generally, though, decision theory deals with decision making under conditions of uncertainty. This means it deals with circumstances where the state of the world isn’t known. So, our creature might not know whether there are predators around today. In this case, it could make its judgement based on how likely the predators are to be around. This is the final piece of information that decision theory requires: the probability that the world is in a certain state.

It combines all of this information together into a single formula for each decision which calculates the expected utility of each action. The action with the highest expected utility is the rational decision. (ETA: Those who have studied decision theory before might be expecting the probability here to take into account the decision in some way. This issue will be discussed in the next post).

Expected \ Utility (Decision) =\sum_{i}Probability(WorldState_{i})\times Utility(WorldState_{i}\ \wedge \ Decision )

What does this equation mean? Well, basically, we take all of the utilities for each possible decision and add them together. So in the case of the creature deciding to go for the substantial patch this is (-10 + 5 = -5) and for the less substantial patch it is (2 + 2 = 4). However, if the state of the world leading to that utility is unlikely to happen, we want that utility to count for less because the creature is unlikely to get it. So each utility is weighted by the probability of receiving it (in other words, the probability of the world state occurring, as the world state determines what utility the creature will receive).
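The weighted sum can be sketched in a few lines of code. The utilities are taken from the table above, but the text never says how likely predators are, so the probability of 0.5 used here is purely a hypothetical value for illustration.

```python
# Utility table from the text: UTILITY[decision][world_state].
UTILITY = {
    "substantial patch": {"predators present": -10, "predators absent": 5},
    "less substantial patch": {"predators present": 2, "predators absent": 2},
}


def expected_utility(decision, state_probabilities):
    """Sum over world states of Probability(state) * Utility(state and decision)."""
    return sum(
        probability * UTILITY[decision][state]
        for state, probability in state_probabilities.items()
    )


# ASSUMPTION: a 50% chance of predators is a made-up number, not from the text.
state_probabilities = {"predators present": 0.5, "predators absent": 0.5}

for decision in UTILITY:
    print(decision, "->", expected_utility(decision, state_probabilities))

best = max(UTILITY, key=lambda d: expected_utility(d, state_probabilities))
print("highest expected utility:", best)
```

Under that assumed probability the less substantial patch wins (2 against -2.5). With these utilities the substantial patch only comes out ahead when the probability of predators is below 0.2, the point at which the two expected utilities are equal.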

This is decision theory in its basic form.

The next post will look at a problem with this approach.