Conditional Probability and Bayes's Theorem

Conditional probability is one of the most intuitive subjects in probability theory, and it is the key to understanding independence in statistical terms. Conditional probability examines the probability of an event given that another event occurs; briefly, it is the probability of the intersection of the two events divided by the probability of the event that is required to occur. Suppose we want the probability of event A given that event B occurs. Then we have:

\begin{equation}p(A|B)=\frac{p(A\cap{B})}{p(B)}\end{equation}

This equation basically indicates that the probability of event A is constrained by event B. Thus we divide the probability of the intersection of the two events by the probability of event B, which restricts the sample space.
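For a concrete illustration, suppose we roll a fair six-sided die, let A be the event that the roll is a 2, and let B be the event that the roll is even. Then:

\begin{equation}p(A|B)=\frac{p(A\cap{B})}{p(B)}=\frac{1/6}{1/2}=\frac{1}{3}\end{equation}

Knowing that the roll is even restricts the possible outcomes to 2, 4, and 6, so the probability of a 2 rises from 1/6 to 1/3.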

Conditional probability is often confused with another important notion in probability theory: mutual exclusivity. When two events are mutually exclusive, the probability of their intersection is zero:
\begin{equation}p(A\cap{B})=0\end{equation}

Conditional probability, however, asks whether the probability of an event depends on another event. It is true that if two events are mutually exclusive, their conditional probabilities are zero. But the concept that actually matters here is statistical independence, and mutual exclusivity does not give us a sufficient conclusion about it: in fact, if two events are both mutually exclusive and independent, at least one of them must have zero probability of occurring.
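To see the first point in symbols: if A and B are mutually exclusive and p(B) > 0, the conditional probability collapses to zero no matter how likely A is on its own:

\begin{equation}p(A|B)=\frac{p(A\cap{B})}{p(B)}=\frac{0}{p(B)}=0\end{equation}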

So, if the probability of event A is equal to the conditional probability of event A given event B, then the two events are independent, because conditioning on event B does not change the probability of event A:
\begin{equation}p(A|B)=\frac{p(A\cap{B})}{p(B)}=p(A)\end{equation}
\begin{equation}p(A\cap{B})=p(A)\times p(B)\end{equation}
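A standard example: draw one card from a well-shuffled 52-card deck, let A be "the card is a heart" and B be "the card is an ace". Then:

\begin{equation}p(A\cap{B})=\frac{1}{52}=\frac{1}{4}\times\frac{1}{13}=p(A)\times p(B)\end{equation}

so learning that the card is an ace tells us nothing about its suit.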

The most familiar example is a coin toss. The outcome of one toss does not depend on the outcomes of the previous tosses, so we multiply the probabilities of the individual tosses to find the probability of a joint outcome:

\begin{equation}p(E_1\cap{E_2}\cap{E_3})=p(E_1)\times p(E_2)\times p(E_3)\end{equation}
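For instance, the probability of getting heads on three consecutive tosses of a fair coin is:

\begin{equation}p(E_1\cap{E_2}\cap{E_3})=0.5\times 0.5\times 0.5=0.125\end{equation}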

To briefly illustrate the difference between mutual exclusivity and statistical independence, take two events that are not logically related to each other. Suppose the first event is drinking water and the second is a car passing in front of the house. There is no logical connection between them, yet one might well be drinking water at the very moment a car passes by, so the two events are not mutually exclusive. That, however, does not mean they are not independent.

As for Bayes's theorem, there is an entire field, Bayesian statistics, in which its features are discussed at length. Here, however, we will only briefly evaluate the theorem itself in order to keep things simple. Basically, Bayes's theorem is a useful tool for updating our beliefs based on prior beliefs and additional information.

Consider a newly emerged disease. It is estimated that only 1% of the population has it. There is also a test to identify whether a person has the disease, but the test's conclusion carries a chance of error. Let's say 99% of the completed tests give the correct answer, for positive and negative cases alike. With this information we can deduce the probability of having the disease given that the test result is positive. First, denote the objects needed to construct the probability scheme:

\begin{equation}H=\textrm{Having Disease}\;\;\;\bar{H}=\textrm{Not Having Disease}\end{equation}
\begin{equation}P=\textrm{Positive Test Result}\;\;\; \bar{P}=\textrm{Negative Test Result}\end{equation}
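In this notation, the information given above translates into the following probabilities:

\begin{equation}p(H)=0.01\;\;\;p(\bar{H})=0.99\;\;\;p(P|H)=0.99\;\;\;p(P|\bar{H})=0.01\end{equation}

The quantity we want is the probability of having the disease given a positive test result, which by definition is: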
\begin{equation}p(H|P)=\frac{p(H\cap{P})}{p(P)}\end{equation}

Because the probability of the intersection of having the disease and getting a positive test result is not given directly, we can rewrite it using the other conditional probability:

\begin{equation}p(P|H)=\frac{p(H\cap{P})}{p(H)}\end{equation}
\begin{equation}p(P|H)\times p(H)=p(H\cap{P})\end{equation}
\begin{equation}p(H|P)=\frac{p(P|H)\times p(H)}{p(P)}\end{equation}

Up to this point most of the work is done, but there is one last step: finding the probability of a positive test result, p(P). A positive result comes either from a person who has the disease (a true positive) or from a person who does not (a false positive), so we obtain p(P) by adding the probabilities of these two intersections:
\begin{equation}p(P)=p(H\cap{P})+p(\bar{H}\cap{P})\end{equation}
\begin{equation}p(H\cap{P})=p(P|H)\times p(H)\end{equation}
\begin{equation}p(\bar{H}\cap{P})=p(P|\bar{H})\times p(\bar{H})\end{equation}
\begin{equation}p(P)=p(P|H)\times p(H)+p(P|\bar{H})\times p(\bar{H})\end{equation}
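Plugging in the numbers given above, the probability of a positive test result is:

\begin{equation}p(P)=0.99\times 0.01+0.01\times 0.99=0.0099+0.0099=0.0198\end{equation}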


As you can see, with simple algebra we reached a point where the main equation is expressed entirely in terms of given information, namely the probability of a positive result given disease, the prevalence, and the false positive rate. So we can simply replace the symbols with numbers:

\begin{equation}p(H|P)=\frac{p(P|H)\times p(H)}{p(P|H)\times p(H)+p(P|\bar{H})\times p(\bar{H})}\end{equation}

\begin{equation}p(H|P)=\frac{0.99\times 0.01}{(0.99\times 0.01)+(0.01\times 0.99)}=\frac{0.0099}{0.0198}=0.5\end{equation}
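The result may seem surprising at first: even with a test that is correct 99% of the time, a positive result only implies a 50% chance of actually having the disease, because the disease is so rare that false positives are exactly as numerous as true positives. To see how strongly the prevalence drives this, suppose, purely as an illustration, that 10% of the population had the disease instead; with the same test we would get:

\begin{equation}p(H|P)=\frac{0.99\times 0.10}{(0.99\times 0.10)+(0.01\times 0.90)}=\frac{0.099}{0.108}\approx 0.917\end{equation}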
