Bayesian Data Analysis

From Colettapedia
==Definitions==

* <math>p(C_k \mid \mathbf{x}) = \frac{p(C_k) \ p(\mathbf{x} \mid C_k)}{p(\mathbf{x})} = \text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}</math>
* In practice only the numerator of that fraction is of interest: the denominator does not depend on <math>C</math>, and the feature values <math>x_i</math> are given, so the denominator is effectively constant.
* The numerator is equivalent to the joint probability model.
* If we assume each feature is conditionally independent of every other feature given the class, the joint model can be expressed as

<math>
\begin{align}
p(C_k \mid x_1, \dots, x_n) & \varpropto p(C_k, x_1, \dots, x_n) \\
                            & = p(C_k) \ p(x_1 \mid C_k) \ p(x_2 \mid C_k) \ p(x_3 \mid C_k) \ \cdots \\
                            & = p(C_k) \prod_{i=1}^n p(x_i \mid C_k)\,.
\end{align}
</math>

* A classifier combines this probability model with a decision rule, e.g. the maximum a posteriori (MAP) rule: predict the class <math>C_k</math> that maximizes the posterior.
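The factored model and the MAP decision rule above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration (all function names and the toy data are invented for the example, not taken from the article); it uses add-one smoothing and log-space scoring, both discussed later in this page.

```python
from collections import Counter, defaultdict
from math import log

def train(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[k][i][v] = times value v appeared for feature i in class k
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for x, k in zip(samples, labels):
        for i, v in enumerate(x):
            feature_counts[k][i][v] += 1
    return class_counts, feature_counts

def predict(x, class_counts, feature_counts, n_values=2):
    """MAP rule: arg-max over k of log p(C_k) + sum_i log p(x_i | C_k)."""
    total = sum(class_counts.values())
    best_class, best_score = None, float("-inf")
    for k, ck in class_counts.items():
        score = log(ck / total)  # log prior
        for i, v in enumerate(x):
            # add-one (Laplace) smoothing avoids zero probabilities
            score += log((feature_counts[k][i][v] + 1) / (ck + n_values))
        if score > best_score:
            best_class, best_score = k, score
    return best_class

# Toy data: the feature pattern (1, 1) appears only with class "spam".
X = [(1, 1), (1, 0), (0, 0), (1, 1), (0, 1)]
y = ["spam", "spam", "ham", "spam", "ham"]
cc, fc = train(X, y)
print(predict((1, 1), cc, fc))  # -> spam
```

Working in log-space turns the product of small probabilities into a sum, which avoids floating-point underflow for many features.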
===Conditional probability===

* What is the probability that a given observation D belongs to a given class C: <math>p(C \mid D)</math>
* "The probability of A under the condition B": <math>p(A \mid B)</math>
* There need not be a causal relationship between A and B.
* Compare with the ''un''conditional probability <math>p(A)</math>.
* If <math>p(A \mid B) = p(A)</math>, the events are independent: knowledge about either event gives no information about the other. Equivalently, <math>P(A \cap B) = P(A)\,P(B).</math>
* Don't falsely equate <math>p(A \mid B)</math> and <math>p(B \mid A)</math>.
* Defined as the quotient of the joint probability of events A and B and the probability of B: <math>P(A \mid B) = \frac{P(A \cap B)}{P(B)},</math> where the numerator is the probability that both A and B occur.
* Joint probability: <math>P(A \cap B) = P(A \mid B)\,P(B)</math>
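The definitions above can be checked by brute-force enumeration. A small sketch with two fair dice (the events chosen here are illustrative, not from the article); it also shows that <math>P(A \mid B)</math> and <math>P(B \mid A)</math> genuinely differ:

```python
from fractions import Fraction
from itertools import product

# 36 equally likely outcomes of rolling two fair dice
outcomes = list(product(range(1, 7), repeat=2))

A = {o for o in outcomes if o[0] + o[1] == 8}  # event A: the sum is 8
B = {o for o in outcomes if o[0] % 2 == 0}     # event B: first die is even

def p(event):
    """Probability of an event as a count of favorable outcomes."""
    return Fraction(len(event), len(outcomes))

p_a_given_b = p(A & B) / p(B)   # definition: P(A|B) = P(A∩B) / P(B)
print(p_a_given_b)              # 1/6
print(p(A))                     # 5/36 != 1/6, so A and B are dependent
print(p(A & B) / p(A))          # P(B|A) = 3/5, not equal to P(A|B)
```

Since <math>p(A \mid B) \neq p(A)</math> here, conditioning on B really does carry information about A.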
===General===

* Compare with the frequentist approach.
* [https://www.youtube.com/watch?v=CPqOCI0ahss Naive Bayes youtube vid]
* Pros:
** Easy and fast to predict the class of a test dataset.
** When the independence assumption holds, a naive Bayes classifier performs well compared to other models.
** Performs well with categorical input variables compared to numerical variables.
* Cons:
** Zero frequency: a feature value never seen with a class in training gets probability zero, wiping out the whole product (solved by smoothing techniques such as Laplace estimation, i.e. adding 1 to every count).
** Bad estimator: the probability estimates it outputs should not be taken too seriously.
** Assumes independent predictors, which is almost never the case in practice.
* Applications:
** Credit scoring
** Medical diagnosis
** Real-time prediction
** Multi-class prediction
** Text classification, spam filtering, sentiment analysis
** Recommendation filtering
* Gaussian naive Bayes: assume each continuous feature follows a Gaussian distribution within each class.
* The multinomial naive Bayes classifier becomes a linear classifier when expressed in log-space.
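The last claim can be made explicit. For a multinomial model with count vector <math>\mathbf{x}</math> and per-class feature probabilities <math>p_{ki}</math> (notation introduced here for the derivation, not in the article), taking logarithms gives

<math>
\log p(C_k \mid \mathbf{x}) \varpropto \log\!\left( p(C_k) \prod_{i=1}^n p_{ki}^{x_i} \right) = \log p(C_k) + \sum_{i=1}^n x_i \log p_{ki} = b_k + \mathbf{w}_k^\top \mathbf{x},
</math>

i.e. an affine (linear) score in <math>\mathbf{x}</math>, with bias <math>b_k = \log p(C_k)</math> and weights <math>w_{ki} = \log p_{ki}</math>.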
 
==Bayesian Network==

* Reasoning under uncertainty
Revision as of 16:25, 25 September 2019


* A set of events, some causally related to others with certain probabilities.
* Some factors are unknown or only partially known.
* Model systems with probability.
* Interested in joint probability distributions.
* Also: given that a certain sub-variable is true, what is the probability of the macro variable?
* Random variables: x1, x2, x3.
* We know nothing about the intercausal relationships.
* Given the probability distribution P(x1, x2, x3), we want to compute quantities such as:
** P[x1 = 0, x2 = 1, x3 = 0]
** P[x1, x2] = P(x1, x2, x3) + P(x1, x2, ¬x3), marginalizing x3 out
** P[x1 | x2, x3], read "the probability of x1, given x2 and x3"
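The three computations above can be carried out directly on an explicit joint table. A short sketch for three boolean variables (the numbers are invented but sum to 1):

```python
from itertools import product

# Explicit joint table over (x1, x2, x3): start uniform, then skew two entries.
joint = {assignment: 1 / 8 for assignment in product([0, 1], repeat=3)}
joint[(1, 1, 1)] = 3 / 16
joint[(0, 0, 0)] = 1 / 16
assert abs(sum(joint.values()) - 1) < 1e-12  # still a valid distribution

# P[x1 = 0, x2 = 1, x3 = 0]: a single table lookup
print(joint[(0, 1, 0)])

# P[x1, x2]: marginalize x3 out by summing over its values
p_x1_x2 = {(a, b): joint[(a, b, 0)] + joint[(a, b, 1)]
           for a, b in product([0, 1], repeat=2)}

# P[x1 | x2, x3]: divide the joint by the marginal of the conditioning variables
def p_x1_given(x2, x3):
    denom = joint[(0, x2, x3)] + joint[(1, x2, x3)]
    return {x1: joint[(x1, x2, x3)] / denom for x1 in (0, 1)}

print(p_x1_given(1, 1))  # distribution over x1 given x2 = 1, x3 = 1
```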

===Succinct representation===

* A Bayesian network is a way to reduce the size of the representation, a "succinct way" of representing the distribution.
* The naive alternative is to store the probability distribution explicitly in a table.
* Suppose x1 ... x10 are booleans.
* The table for P[x1, ..., x10] then has 2^n = 2^10 = 1024 entries.
* The joint pdf can be rewritten with the chain rule: P[x1, x2, ..., x10] = P[x1 | x2, ..., x10] * P[x2, ..., x10]
* = P[x1 | x2, ..., x10] * P[x2 | x3, ..., x10] * ... * P[Xn-1 | Xn] * P[Xn]
* P[Xi | Xi+1, ..., Xn] = P[Xi] if Xi is totally independent of the other variables.
* A variable can also be conditionally independent, i.e. dependent on only a subset of the other variables.
* The variables on which P[Xi] conditionally depends "subsume" the other variables.
* Belief network: the order of the variables matters when setting up the dependencies.
* Count the parents of each node to figure out the size of its conditional probability table.
* An improper ordering still yields a valid representation of the joint probability function, but it can require conditional probability tables that are unnatural and difficult to obtain experimentally, and it can inflate the conditional tables so the representation is large compared to better orderings.
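The savings can be quantified by counting table entries: one row per joint assignment of a node and its parents. A sketch with an invented parent structure for ten boolean variables (the structure is purely illustrative):

```python
# parents[node] = list of that node's parents in the belief network
parents = {
    "x1": [], "x2": ["x1"], "x3": ["x1"], "x4": ["x2", "x3"],
    "x5": ["x4"], "x6": ["x4"], "x7": ["x5"], "x8": ["x5"],
    "x9": ["x6"], "x10": ["x6"],
}

n = len(parents)
full_table = 2 ** n  # explicit joint table: 2^10 = 1024 entries

# Each node's CPT covers the node plus its parents: 2^(#parents + 1) entries.
network = sum(2 ** (len(p) + 1) for p in parents.values())
print(full_table, network)  # 1024 vs 42
```

The more parents an ordering forces onto each node, the closer the network's total size creeps back toward the full 2^n table.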

===Incremental Network Construction===

# Choose the set of relevant variables X that describe the domain.
# Choose an ordering for the variables (a very important step).
# While there are variables left:
## Dequeue the next variable X and add a node for it.
## Set Parents(X) to some minimal set of existing nodes such that the conditional independence property is satisfied.
## Define the conditional probability table for X.
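The loop above can be sketched as follows. The minimal-parents test is stubbed out with a hand-written map (a real implementation would query domain knowledge or data), and all variable names are illustrative:

```python
def build_network(ordering, minimal_parents):
    """Incrementally add nodes in the given order, wiring each to its parents."""
    network = {}  # node -> list of parents (insertion order = chosen ordering)
    for x in ordering:
        # Parents must be a minimal set of *existing* nodes, which is why the
        # ordering matters: a parent listed before its child needs no rewiring.
        network[x] = [p for p in minimal_parents[x] if p in network]
        # A real builder would define the conditional probability table
        # for x over Parents(x) here.
    return network

order = ["burglary", "earthquake", "alarm", "john_calls"]
knowledge = {"burglary": [], "earthquake": [],
             "alarm": ["burglary", "earthquake"], "john_calls": ["alarm"]}
print(build_network(order, knowledge))
```

Reversing the ordering (e.g. adding john_calls first) would force the builder to invent less natural parent sets, which is exactly the inflation problem noted above.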

===Inferences using belief networks===

* Diagnostic inferences: from effects to causes, e.g. given the symptoms, what is the probability of the disease?
* Causal inferences: from causes to effects, e.g. given the disease, what is the probability of each symptom?
* Intercausal inferences: between causes of a common effect ("explaining away").
* Mixed inferences: combinations of the above.
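A diagnostic inference on the smallest possible network, disease → symptom, is just Bayes' rule run against the causal direction. The probabilities below are invented for illustration:

```python
# Two-node network: disease -> symptom
p_disease = 0.01                   # prior on the cause
p_symptom_given_disease = 0.9      # CPT entries (causal direction)
p_symptom_given_healthy = 0.1

# Causal inference (cause -> effect): total probability of the symptom
p_symptom = (p_symptom_given_disease * p_disease
             + p_symptom_given_healthy * (1 - p_disease))

# Diagnostic inference (effect -> cause): Bayes' rule turns the CPT around
p_disease_given_symptom = p_symptom_given_disease * p_disease / p_symptom
print(round(p_disease_given_symptom, 4))
```

Even with a 90% sensitive symptom, the low prior keeps the posterior probability of the disease small, a classic base-rate effect.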