Bayesian Data Analysis

Bayesian Network

set of events some causally related to others with certain probability

certain factors unknowns or partially known factors
model systems with probability
interested in joint probability distributions
also given certain sub variable is true, what is prob of macro var
random vars: x1 x2, x3
know nothing about intercausal relationships
if you are provided the probability distribution P( x1, x2, x3 ), want to try to compute the following:
- P[x1 = 0, x2 = 1, x3 = 0]
- P[x1, x2] = P(x1, x2, x3) + P(x1, x2, x3bar)
- P[x1 | x2, x3 ] = read "the probability of x1, given x2 and x3"

Bayesian network is way to reduce size of representation, a "succinct way" of representing distribution
store probability distribution explicitly in a table
x1 .. x10 are booleans
what is size of table for set of vars P[ x1 ... x10] = 2^n
how can rewrite joint pdf P[x1, x2, ..., x10]= P[x1| x2, ..., x10] * P[x2, ..., x10]
= P[x1| x2, ..., x10] * P[x2 | x3, ..., x10] ... P[Xn-1|Xn]*P[Xn]
P[Xi|Xi+1, ..., Xn] = P[Xi] if Xi is totally independent of the others
sometime can also be conditionally independent, only dependent on a subset of the other variables
the variable on which P[Xi] depends "subsumes" the other variables
belief network - order of variables matters when setting up dependencies in belief network.
Count parents of each node to figure out size of conditional probability tables
If use improper ordering, results in valid representation of joint probabilty funtion, but would require producing conditional probability tables which aren't natural/difficult to obtain experimentally. could also result in inflation of conditional tables / size of table representation is large compared to others

Choose the set of relevant set of variables X that describe the domain
Choose an ordering for the variables (very important step)
While there are variables left:
1. dequeue variable X off the queue and add node
2. Set Parents(X) to some minimal set of existing of existing nodes such that the conditional independence is satisfied
3. Define the conditional probability table