Bayesian Decision Methods
| Type |
|---|
| Generative Classification Model |
See also: 4. Bayesian Decision Theory
Given observation x, the decision is based on posterior probability:
- Decide $y_1$, if $P(y_1 \vert x) > P(y_2 \vert x)$
- Decide $y_2$, if $P(y_2 \vert x) > P(y_1 \vert x)$
By Bayes' rule, $P(y_1 \vert x) = {P(x \vert y_1)\,P(y_1) \over P(x)}$ and $P(y_2 \vert x) = {P(x \vert y_2)\,P(y_2) \over P(x)}$. The evidence $P(x)$ is the same for both classes, so it does not affect the decision: it is enough to compare $P(x \vert y_1)\,P(y_1)$ with $P(x \vert y_2)\,P(y_2)$.
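The two-class rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the univariate Gaussian class-conditional densities, the priors, and the names `gaussian_pdf` and `decide` are all assumptions introduced for the example.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x (assumed likelihood model)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def decide(x, priors, params):
    """Return (class index, posteriors) for a two-class problem.

    priors: [P(y_1), P(y_2)]
    params: [(mean_1, std_1), (mean_2, std_2)], hypothetical Gaussian parameters
    """
    # Unnormalized posteriors: P(x | y_k) * P(y_k)
    joint = [gaussian_pdf(x, m, s) * p for (m, s), p in zip(params, priors)]
    evidence = sum(joint)  # P(x), shared by both classes
    posteriors = [j / evidence for j in joint]
    # Dividing by the evidence does not change which term is larger,
    # so the decision can be made on the unnormalized values.
    decision = 0 if joint[0] > joint[1] else 1
    return decision, posteriors

priors = [0.6, 0.4]                # assumed priors
params = [(0.0, 1.0), (2.0, 1.0)]  # assumed class-conditional Gaussians
decision, posteriors = decide(0.5, priors, params)
print(decision, posteriors)
```

Index `0` stands for $y_1$ and index `1` for $y_2$. The posteriors are computed only for inspection; the decision itself never needs $P(x)$.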
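Probability of error:

$$P(error \vert x) = \begin{cases} P(y_1 \vert x) & \text{if we decide } y_2 \\ P(y_2 \vert x) & \text{if we decide } y_1 \end{cases}$$

Goal is to minimize error (based on single instance). Deciding for the class with the larger posterior gives:

$$P(error \vert x) = \min[P(y_1 \vert x), P(y_2 \vert x)]$$

Minimizing average error:

$$P(error) = \int P(error, x)\,dx = \int P(error \vert x)\,P(x)\,dx$$

Because $P(x) \geq 0$, making $P(error \vert x)$ as small as possible for every $x$ also makes the integral as small as possible.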
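The average error of the posterior-based rule can be approximated numerically. The sketch below assumes two univariate Gaussian classes and uses a simple midpoint rule. The function names, the integration range, and the parameters are illustrative choices, not part of the original notes. It relies on the identity $P(error \vert x)\,P(x) = \min_k P(x \vert y_k)\,P(y_k)$.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x (assumed likelihood model)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def bayes_error(priors, params, lo=-10.0, hi=12.0, steps=20000):
    """Midpoint-rule approximation of the integral of min_k P(x | y_k) P(y_k) over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        joint = [gaussian_pdf(x, m, s) * p for (m, s), p in zip(params, priors)]
        # The Bayes rule picks the larger term, so the error mass is the smaller one.
        total += min(joint) * dx
    return total

# Assumed setup: equal priors, unit-variance Gaussians centred at 0 and 2.
error = bayes_error([0.5, 0.5], [(0.0, 1.0), (2.0, 1.0)])
print(round(error, 4))
```

For this symmetric setup the decision boundary lies at $x = 1$, so the result should be close to $\Phi(-1) \approx 0.1587$, where $\Phi$ is the standard normal CDF.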
Generalizing for more classes:
- Feature vector $x = (x_1, x_2, \ldots, x_d) \in \mathbb{R}^d$: allows the use of more than one feature
- $y_1, y_2, \ldots, y_c$: a finite set of $c$ states of nature, i.e., categories (can be more than two)
- $\alpha_1,\alpha_2,…,\alpha_a$: a finite set of possible actions
- $\lambda(\alpha_i \vert y_j)$: loss function, describes the loss incurred for taking action $\alpha_i$ when the state of nature is $y_j$
- $P(y_j)$: prior probability that the state of nature is $y_j$
- $p(x \vert y_j)$: state-conditional probability density of $x$ given $y_j$
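The expected loss, or conditional risk, of taking action $\alpha_i$ given observation $x$ is:

$$R(\alpha_i \vert x) = \sum_{j=1}^{c} \lambda(\alpha_i \vert y_j)\,P(y_j \vert x)$$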
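Choose the decision rule $\alpha(x)$ that minimizes the overall risk:

$$R = \int R(\alpha(x) \vert x)\,p(x)\,dx$$

The overall risk is minimized by selecting, for every $x$, the action $\alpha_i$ whose conditional risk $R(\alpha_i \vert x)$ is smallest.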
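The minimum-risk rule can be sketched as a small table lookup. In the example below, the loss matrix, the third "reject" action, and the function names `conditional_risks` and `bayes_action` are hypothetical choices made for illustration. The posteriors are assumed to have been computed already.

```python
def conditional_risks(loss, posteriors):
    """Expected loss of each action: sum over j of loss[i][j] * P(y_j | x)."""
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors)) for row in loss]

def bayes_action(loss, posteriors):
    """Return (index of the minimum-risk action, list of conditional risks)."""
    risks = conditional_risks(loss, posteriors)
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks

# Hypothetical loss matrix: rows are actions, columns are states of nature.
# Actions: 0 = decide y_1, 1 = decide y_2, 2 = reject (fixed cost 0.3).
loss = [
    [0.0, 1.0],
    [1.0, 0.0],
    [0.3, 0.3],
]
posteriors = [0.6, 0.4]  # assumed P(y_1 | x), P(y_2 | x)
action, risks = bayes_action(loss, posteriors)
print(action, risks)
```

With these numbers the reject action has the lowest risk, because neither posterior is confident enough to beat the fixed rejection cost. With a zero-one loss and no reject row, the same code reduces to picking the class with the largest posterior.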