## An introduction to Bayesian networks in causal modeling

**This is post 4 in a sequence exploring formalisations of causality entitled “Reasoning with causality”**

**The previous post was Reasoning with causality: An example**

The remainder of this sequence is going to depart from the path previously indicated (continuing to read Pearl’s *Causality*) and will instead explore the use of Bayesian networks in causal modeling (in doing so, it will also discuss probabalistic vs determinstic causality) by summarising a technical report titled *Causal Reasoning with Causal Models* (Which should link to, http://www.csse.monash.edu.au/~korb/pubs/techrept.pdf). This first post will introduce the use of Bayesian networks in causal modeling.

**What are Bayesian Networks**

A Bayesian Network is a directed acyclical graph (DAG) and their conditional probabilities. What does this mean? Well take the graph (series of nodes connected by edges) below:

1.) Directed simply means it has arrows pointing out from each node.

2.) Acylical means that if you follow the arrows of any path you will never return to your starting point.

So now we know what the DAG aspect of the definition above means. How about the fact that it captures their conditional probabilities. While that simply means that the Bayesian Network (or Bayes Net) also captures the way the variables are conditionally dependent on one another. Take the central node (“grass wet”). Let’s define the conditional probabilities for this node in the form of a table:

Sprinkler | Rain | Grass wet | Grass dry | |

T | T | 1 | 0 | |

T | F | 0.7 | 0.3 | |

F | T | 0.9 | 0.1 | |

F | F | 0 | 1 |

So this says that if it rained and the sprinkler was on last night, the grass will definitely be wet. If just the sprinkler was on, there’s an 0.3 chance that the grass has dried by now and so on.

So a Bayes Net is a DAG combined with the conditional probabilities that hold between the nodes. Bayes nets are used because they make many problems more computationally cheap (if the list of relevant conditional probabilities is small enough).

**Bayesian Networks and conditional dependencies**

If there are no arrows between nodes in a Bayes Net then this means they must be probabalistically independent. Which is to say:

P(A | B) = P(A)

Conditional independence is an extension of this to situations where a third variable induces independence. So, in the above graph, rain and the paper being wet are conditionally independent because they are “screened off” by the grass being wet.

There are two properties related to this:

1.) A Bayes Net is said to have the Markov Property if all of the conditional independences in the graph are true of the system being modelled (the Markov Property simply means that the future state of a system depends only on the present state and not on the past ones. This is clearly related to conditional independence). It can also be called an independence-map or I-map of the system.

2.) If all of the dependencies in the Bayes Net are true in the system then it’s said to be faithful (or a D-map, Dependence Map) to the system.

If a Bayes Net has both of these properties then it’s said to be a perfect map. Generally in Bayes Nets, we want the I-map to be minimal (such that if we removed any nodes, it would no longer be an I-map).

**Bayesian Networks and Causality**

A Bayes Net becomes a causal model if each of the arrows in the graph can be given a causal interpretation. Each arrow then represents direct causality. Indirect causality is based on paths that always follow arrows. So in the above graph, rain is indirectly related to the paper being wet via the path (rain -> grass wet -> paper wet). Sprinkler being on is not indirectly causally related to rain because the path from sprinkler to rain does not always follow the direction of the arrows.

A path from X to Y is blocked by Z if X and Y are conditionally independent given Z (at least under some basic assumptions). If the graph has the Markov Property than the equivalent procedure to blocking is d-seperation which is best explained as follows. Given two variables, A and B there are four ways that a path can go from A to B through C in an undirected graph:

1.) They can go via a chain A->C->B. In this case A and B are d-seperated given a set of nodes Z if C is in Z (if you know C, knowing A won’t tell you anything more about B).

2.) They can go via a chain B->C->A (reasoning as above)

3.) They can both be caused by C. In this case, A and B are d-seperated given Z is C in in Z (once again, knowing C means that knowing A or B tells you nothing more about the other variable).

4.) They can both cause C. In this case if C isn’t in Z then there is no dependence between A and B by default. They may both cause C but knowing one doesn’t tell you anything about the other. On the other hand, if C is in Z then A and B are causally related. Look at the graph earlier – here if we know whether the grass is wet then there is a conditional relationship between rain and sprinkler – namely that if it didn’t rain the sprinkler must have been on and vice versa. So A and B are d-desperated given Z if C is not in Z.

All up, A and B are d-seperated given Z if all paths from A to B meet one of the above criteria.

**Conclusions**

This post has explored Bayesian Networks and how these can be used as causal models. In the following posts it will explore objections to the use of Bayes Nets to represent causality, the debate over probablistic vs deterministc interpretations of causality and the difference between type and token causality.

**The next post in the sequence is **

**Is a causal interpretation of Bayes Nets fundamental?**