
Decision theories and strange scenarios

This is part 4 of a sequence titled “Less Wrong and decision theory”

The previous post is “The Smoking Lesion: A problem for Evidential Decision Theory”

Newcomb’s Problem and the Smoking Lesion Problem are both a bit strange. To some people, that seems a good enough reason to ignore them when considering decision theories. But if you’re attempting to create mathematically rigorous, perfect decision theories, then these sorts of problems indicate that something’s gone wrong, and it may be the sort of something that would lead to the wrong decisions being made in more realistic situations.
These scenarios are tests. They’re not intended as real-world examples; rather, they are ways of determining whether a new decision theory is an improvement on an old one. In developing decision theories, those on Less Wrong are hoping to find one that gets the desired answer for the two previous problems and for the following ones.

Parfit’s Hitchhiker

The first of these is Parfit’s Hitchhiker. You are lost in the desert with no money or goods when another person drives up to you. Both of you are perfect rationalists and, when you ask for a lift, he says yes, provided that you go to a cash machine when you get to town and pay him $100. Of course, you say yes.

The question is: Once you get to town, should you pay the driver?

The normal answer is that once you get to town you will already be safe and will have no motivation to pay the driver. However, the driver will realise this and, instead of picking you up, will drive off and leave you to die.

Desired response: Paying the driver once you reach town, because only someone who would really pay gets picked up in the first place, and so survives.
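The reasoning above can be sketched as a comparison of two policies. This is only an illustration under stated assumptions: the driver is taken to predict your policy perfectly, and VALUE_OF_SURVIVING is a made-up number standing in for how much you value your life. Neither detail comes from the original problem statement beyond the $100 fare.

```python
# Hypothetical figure for how much surviving is worth to you, in dollars.
VALUE_OF_SURVIVING = 1_000_000
FARE = 100  # the $100 the driver asks for


def outcome(pays_in_town: bool) -> int:
    """Net result of being the sort of agent who does (or does not) pay.

    Assumption: the driver predicts your policy perfectly and only
    picks you up if you would actually pay once in town.
    """
    picked_up = pays_in_town
    if not picked_up:
        return 0  # left in the desert
    return VALUE_OF_SURVIVING - FARE


for policy in (True, False):
    print(f"pays_in_town={policy}: outcome={outcome(policy)}")
```

Under these assumptions the paying policy comes out far ahead, even though, judged from inside the town, handing over the $100 looks like a pure loss.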

See also: http://lesswrong.com/lw/135/timeless_decision_theory_problems_i_cant_solve/

Counterfactual Mugging

Omega, from Newcomb’s Problem, approaches you and says, “I just flipped a fair coin. It came up tails and so you owe me $100. However, if it had come up heads, I would have given you $1 000 000, but only if you are the sort of person who would give me $100 were it to come up tails.”

Desired response: Paying the $100, for a similar reason to the reason that you pay the driver in Parfit’s Hitchhiker. If you ask what the best decision is at the time, it might not be to pay the $100, but if you ask what the best decision theory to follow is, it’s one that would lead you to pay the $100. (Question: Is the consensus on this question as confidently one-sided on Less Wrong as it is for, say, Newcomb’s Problem?)
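To see why a policy of paying looks good before the coin is flipped, here is a rough expected-value sketch. It assumes Omega predicts your policy perfectly and that you value money linearly; the function name and constants are just labels for this example.

```python
from fractions import Fraction

PRIZE = 1_000_000   # paid on heads, but only to agents who would pay on tails
COST = 100          # handed over on tails by agents who pay
P_HEADS = Fraction(1, 2)  # fair coin


def expected_value(pays_on_tails: bool) -> Fraction:
    """Expected dollars, judged before the flip, for a fixed policy.

    Assumption: Omega's prediction of your policy is always correct.
    """
    heads_payoff = PRIZE if pays_on_tails else 0
    tails_payoff = -COST if pays_on_tails else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff


print(expected_value(True))   # 499950
print(expected_value(False))  # 0
```

After the coin has already come up tails, the same agent is simply $100 worse off, which is exactly the tension between the best decision at the time and the best decision theory to follow.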

See also: http://lesswrong.com/lw/3l/counterfactual_mugging/

Prisoner’s Dilemma

You and a fellow criminal are both pulled into the local police station for a crime you have committed. In questioning (in separate rooms), you are both given two choices: You can stay silent and not admit to the crime (and thereby cooperate with the other criminal) or you can admit to the crime and get a plea bargain for grassing up the other criminal as well (and thereby defect).

If you both cooperate and stay silent, you get sent to jail for two weeks on a lesser charge.

If one of you cooperates and the other defects then the cooperating criminal gets 6 months in jail and the defecting criminal gets let off with 1 week in jail as part of the plea bargain.

If both criminals defect then you both get 2 months in jail.

Standard reasoning says you should defect because if:

1.) The other player cooperates then you would get two weeks jail time if you cooperated but only one week if you didn’t.

2.) The other player defects then you will get 2 months jail time if you defect and 6 months jail time if you cooperate.

So whatever the other player does, you serve less time by defecting.

The problem is that both criminals will realise this and will both defect. You will end up in jail for 2 months. If you both cooperated, on the other hand, you would only go to jail for two weeks.

Desired response: If two parties both use the same decision theory, they should both cooperate as they will both make the same decision and both cooperating is better than both defecting.
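Both lines of reasoning can be checked against the jail terms given above. The following is a minimal sketch rather than a formal treatment: it converts everything to weeks on the assumption that a month is 4 weeks, and the function names are invented for this example.

```python
# Jail time in weeks, indexed by (my_move, other_move).
# Assumption: one month is counted as 4 weeks.
JAIL_WEEKS = {
    ("cooperate", "cooperate"): 2,   # both stay silent: two weeks
    ("cooperate", "defect"): 24,     # I stay silent, other talks: 6 months
    ("defect", "cooperate"): 1,      # I talk, other stays silent: 1 week
    ("defect", "defect"): 8,         # both talk: 2 months
}
MOVES = ("cooperate", "defect")


def best_reply(other_move: str) -> str:
    """Standard reasoning: minimise my jail time with the other move fixed."""
    return min(MOVES, key=lambda my_move: JAIL_WEEKS[(my_move, other_move)])


def best_shared_move() -> str:
    """Same-decision-theory reasoning: both players end up making the
    same move, so only the two matching outcomes are compared."""
    return min(MOVES, key=lambda move: JAIL_WEEKS[(move, move)])


for other in MOVES:
    print(f"if the other player plays {other}, best reply is {best_reply(other)}")
print(f"if both must choose alike, best move is {best_shared_move()}")
```

Holding the other player's move fixed, defecting wins in both rows. Once you restrict attention to the outcomes where both players choose alike, cooperating wins, which is the desired response described above.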

See also: http://lesswrong.com/lw/tn/the_true_prisoners_dilemma/

http://lesswrong.com/lw/to/the_truly_iterated_prisoners_dilemma/

http://lesswrong.com/lw/do/reformalizing_pd/

Appendix 1: Notes

An important note: While there is a reasonably strong consensus on these answers within Less Wrong, opinions outside that group may vary, and there may even be a consensus against the position presented here.

Conclusion

This sequence has argued for what I see as the standard Less Wrong view: Current decision theories are flawed because of the answers they give to a set of decision scenarios. These scenarios are therefore important ways of testing the new decision theories that are to be developed. The next, and final, post will explore one of these new theories: Eliezer’s Timeless Decision Theory.

The final post is “An introduction to Timeless Decision Theory”.
