Conditional probability
Conditional probabilities are useful when the occurrence of one event affects the probability of another. If we have two events, A and B, and we know that B has occurred, the probability that A also occurs is written as follows:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Here, $P(B) > 0$.
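As a quick sanity check, this definition can be verified by direct counting. The following is a minimal Python sketch; the two-dice scenario is an illustrative example of mine, not one from the text:

```python
from itertools import product
from fractions import Fraction

# Sample space: all 36 ordered outcomes of rolling two fair dice.
omega = set(product(range(1, 7), repeat=2))

# Event A: the dice sum to 8. Event B: the first die shows an even number.
A = {(d1, d2) for d1, d2 in omega if d1 + d2 == 8}
B = {(d1, d2) for d1, d2 in omega if d1 % 2 == 0}

p_B = Fraction(len(B), len(omega))            # P(B) = 18/36
p_A_and_B = Fraction(len(A & B), len(omega))  # P(A ∩ B) = 3/36
p_A_given_B = p_A_and_B / p_B                 # P(A|B) = P(A ∩ B) / P(B)

print(p_A_given_B)  # 1/6: of the 18 outcomes in B, only (2,6), (4,4), (6,2) lie in A
```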
However, if the two events, A and B, are independent, then we have the following:
$$P(A|B) = P(A)$$
Additionally, if $P(A|B) > P(A)$, then it is said that B attracts A. However, if A attracts $B^C$ (the complement of B), then it repels B.
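Continuing the same made-up dice example, we can check independence and attraction numerically. Here, C (the second die shows a 3) turns out to be independent of B, while B attracts A:

```python
from itertools import product
from fractions import Fraction

omega = set(product(range(1, 7), repeat=2))
A = {(d1, d2) for d1, d2 in omega if d1 + d2 == 8}   # sum is 8
B = {(d1, d2) for d1, d2 in omega if d1 % 2 == 0}    # first die even
C = {(d1, d2) for d1, d2 in omega if d2 == 3}        # second die shows 3

def p(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event), len(omega))

def p_given(event, cond):
    """Conditional probability P(event|cond), computed by counting."""
    return Fraction(len(event & cond), len(cond))

print(p_given(C, B) == p(C))  # True: B and C are independent, P(C|B) = P(C) = 1/6
print(p_given(A, B) > p(A))   # True: B attracts A, since 1/6 > 5/36
```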
The following are some of the axioms of conditional probability:
- $P(A|B) \geq 0$.
- $P(B|B) = 1$.
- If $A_1, A_2, \ldots$ are disjoint events, then $P\left(\bigcup_i A_i \,\middle|\, B\right) = \sum_i P(A_i|B)$.
- $P(\cdot|B)$ is a probability function that works only for subsets of B.
- $P(A|\Omega) = P(A)$.
- If $A \subseteq B$, then $P(A|B) = \frac{P(A)}{P(B)}$.
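These properties are straightforward to verify numerically. Here is a minimal sketch over the same illustrative dice space; exact arithmetic via Fraction avoids floating-point noise:

```python
from itertools import product
from fractions import Fraction

omega = set(product(range(1, 7), repeat=2))
B = {(d1, d2) for d1, d2 in omega if d1 % 2 == 0}    # first die even

def p_given(event, cond):
    return Fraction(len(event & cond), len(cond))

A1 = {o for o in omega if sum(o) == 8}               # disjoint from A2
A2 = {o for o in omega if sum(o) == 3}

print(p_given(B, B) == 1)      # P(B|B) = 1
print(p_given(omega, B) == 1)  # P(Ω|B) = 1
print(p_given(A1 | A2, B) == p_given(A1, B) + p_given(A2, B))  # additivity for disjoint events
```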
The following equation is known as Bayes' rule:
$$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$$
This can also be written as follows:
$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$
Here, we have the following:
- $P(A)$ is called the prior.
- $P(A|B)$ is the posterior.
- $P(B|A)$ is the likelihood.
- $P(B)$ acts as a normalizing constant.
Dropping the normalizing constant, we can write this as follows:

$$P(A|B) \propto P(B|A)P(A)$$
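To see Bayes' rule at work, consider a hypothetical diagnostic test; the prevalence, sensitivity, and false-positive numbers below are invented purely for illustration:

```python
# Hypothetical numbers: 1% of the population has the disease (prior),
# the test is 95% sensitive, and healthy people test positive 10% of the time.
p_disease = 0.01              # prior P(A)
p_pos_given_disease = 0.95    # likelihood P(B|A)
p_pos_given_healthy = 0.10    # P(B|A^C)

# Normalizing constant P(B), summed over the two cases (disease / no disease).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))  # 0.0876: under 9%, despite the positive test
```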
Often, we end up having to deal with complex events, and to effectively navigate them, we need to decompose them into simpler events.
This leads us to the concept of partitions. A partition is defined as a collection of events $B_1, B_2, \ldots, B_n$ that together make up the sample space, such that $\bigcup_i B_i = \Omega$ and $B_i \cap B_j = \emptyset$ for all $i \neq j$.
In the coin flipping example, the sample space is partitioned into two possible events—heads and tails.
If A is an event and $B_1, B_2, \ldots, B_n$ is a partition of $\Omega$, then we have the following:
$$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$$
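As a concrete sketch of this formula, suppose (hypothetically) that three machines partition a factory's output and each has its own defect rate; the shares and rates below are made up:

```python
# Hypothetical partition: every item comes from exactly one machine.
p_machine = {"B1": 0.5, "B2": 0.3, "B3": 0.2}          # P(B_i), sums to 1
p_defect_given = {"B1": 0.01, "B2": 0.02, "B3": 0.05}  # P(A|B_i)

# Law of total probability: P(A) = sum over i of P(A|B_i) * P(B_i)
p_defect = sum(p_defect_given[b] * p_machine[b] for b in p_machine)
print(round(p_defect, 3))  # 0.021 = 0.005 + 0.006 + 0.010
```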
We can also rewrite Bayes' formula with partitions so that we have the following:
$$P(B_i|A) = \frac{P(A|B_i)P(B_i)}{P(A)}$$
Here, $P(A) = \sum_{j} P(A|B_j)P(B_j)$, by the law of total probability.
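Putting the last two formulas together with the same made-up factory numbers: given that an item is defective, Bayes' rule over the partition tells us which machine most likely produced it:

```python
p_machine = {"B1": 0.5, "B2": 0.3, "B3": 0.2}          # P(B_i)
p_defect_given = {"B1": 0.01, "B2": 0.02, "B3": 0.05}  # P(A|B_i)

# Denominator P(A), from the law of total probability.
p_defect = sum(p_defect_given[b] * p_machine[b] for b in p_machine)

# Posterior P(B_i|A) for every block of the partition.
posterior = {b: p_defect_given[b] * p_machine[b] / p_defect for b in p_machine}
print({b: round(v, 3) for b, v in posterior.items()})
# {'B1': 0.238, 'B2': 0.286, 'B3': 0.476}: B3 is the most likely culprit
```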