Today

By the end of today you will

be able to define and compute marginal, joint and conditional probabilities
identify when events are independent
apply Bayes’ theorem to interpret a positive COVID test, given the test’s sensitivity and specificity

library(tidyverse)
library(knitr)

Definitions

Let A and B be events.

Marginal probability: The probability that an event occurs, regardless of the outcome of the other event
- P(A)
- Example: What’s the probability a student in STA199 favors dogs?
Joint probability: The probability that two or more events occur simultaneously
- P(A and B)
- Example: What’s the probability a student is a junior and favors dogs?
Conditional probability: The probability that an event occurs given that the other has occurred
- P(A|B) or P(B|A)
- Example: What is the probability a student is a junior given they favor dogs?
Independent events: Knowing that one event has occurred does not change the probability we assign to the other event.
- P(A|B) = P(A) or P(B|A) = P(B)
- Example: P(junior | dogs) = P(junior)
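
To make the four definitions concrete, here is a minimal R sketch. The counts are hypothetical (invented for illustration, not STA199 survey results), and the object names are placeholders.

```r
# Hypothetical counts of students by class year and pet preference.
# These numbers are made up for illustration only.
counts <- matrix(
  c(30, 10,
    45, 15),
  nrow = 2, byrow = TRUE,
  dimnames = list(year = c("junior", "other"), pet = c("dogs", "cats"))
)
total <- sum(counts)

# Marginal: P(dogs), ignoring class year
p_dogs <- sum(counts[, "dogs"]) / total

# Joint: P(junior and dogs)
p_junior_and_dogs <- counts["junior", "dogs"] / total

# Conditional: P(junior | dogs), restricting attention to dog lovers
p_junior_given_dogs <- counts["junior", "dogs"] / sum(counts[, "dogs"])

# Independence check: does P(junior | dogs) equal P(junior)?
p_junior <- sum(counts["junior", ]) / total
independent <- isTRUE(all.equal(p_junior_given_dogs, p_junior))
```

In this made-up table the conditional and marginal probabilities of being a junior coincide, so the two events are independent; with other counts they generally would not be.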

Bayes’ Theorem

The global coronavirus pandemic illustrates the need for accurate COVID-19 testing, as the virus’s extreme infectivity poses a significant public health threat. Due to the time-sensitive nature of the situation, the FDA granted emergency authorization to a number of serological tests for COVID-19 in 2020. Full details of these tests may be found on the FDA’s website.

We will define the following events:

Pos: The event that the Abbott Alinity test returns positive.
Neg: The event that the Abbott Alinity test returns negative.
Covid: The event that a person has COVID.
No Covid: The event that a person does not have COVID.

The Abbott Alinity test has an estimated sensitivity of 100%, i.e. P(Pos | Covid) = 1, and an estimated specificity of 99%, i.e. P(Neg | No Covid) = 0.99.

Suppose the prevalence of COVID-19 in the general population is about 2%, i.e. P(Covid) = 0.02.

Exercise 1: Use the Hypothetical 10,000 to calculate the probability a person has COVID given they get a positive test result, i.e. P(Covid | Pos).

	Covid	No Covid	Total
Pos
Neg
Total			10000
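
One possible way to fill in the table for Exercise 1 is sketched below. It uses only the prevalence, sensitivity, and specificity stated above; the variable names are illustrative assumptions, not part of the original exercise.

```r
# Quantities given in the text
n_total     <- 10000
prevalence  <- 0.02  # P(Covid)
sensitivity <- 1     # P(Pos | Covid)
specificity <- 0.99  # P(Neg | No Covid)

# Column totals: how many of the 10,000 have / do not have COVID
n_covid    <- n_total * prevalence
n_no_covid <- n_total - n_covid

# Split each column into positive and negative test results
pos_covid    <- n_covid * sensitivity
neg_covid    <- n_covid - pos_covid
neg_no_covid <- n_no_covid * specificity
pos_no_covid <- n_no_covid - neg_no_covid

# Assemble the completed table
hypothetical <- matrix(
  c(pos_covid, pos_no_covid, pos_covid + pos_no_covid,
    neg_covid, neg_no_covid, neg_covid + neg_no_covid,
    n_covid,   n_no_covid,   n_total),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Pos", "Neg", "Total"), c("Covid", "No Covid", "Total"))
)
print(hypothetical)

# P(Covid | Pos): among everyone who tests positive, the share with COVID
p_covid_given_pos <- pos_covid / (pos_covid + pos_no_covid)
```

Under these inputs, 200 of the 298 positive tests belong to people with COVID, so P(Covid | Pos) is about 0.67.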

Exercise 2: Use Bayes’ Theorem to calculate P(Covid | Pos).
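
For reference, Bayes’ theorem in this setting reads P(Covid | Pos) = P(Pos | Covid) P(Covid) / [P(Pos | Covid) P(Covid) + P(Pos | No Covid) P(No Covid)], where P(Pos | No Covid) = 1 - specificity. The sketch below wraps that formula in a small helper; `bayes_posterior` is a hypothetical name chosen for this illustration, not a function from any package.

```r
# Hypothetical helper: posterior probability of disease given a positive test
bayes_posterior <- function(prior, sensitivity, specificity) {
  p_pos_given_covid    <- sensitivity      # P(Pos | Covid)
  p_pos_given_no_covid <- 1 - specificity  # P(Pos | No Covid)

  numerator   <- p_pos_given_covid * prior
  # Law of total probability gives P(Pos) in the denominator
  denominator <- numerator + p_pos_given_no_covid * (1 - prior)

  numerator / denominator
}

# Values stated in the text: prevalence 2%, sensitivity 100%, specificity 99%
p_covid_given_pos <- bayes_posterior(prior = 0.02, sensitivity = 1, specificity = 0.99)
```

With the stated values this gives 0.02 / 0.0298, about 0.67. Trying a smaller `prior` shows how quickly the posterior drops when the disease is rare, even with a highly specific test.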

Simpson’s paradox

This example comes from Confounding and Simpson’s paradox¹ by Julious and Mullee.

The data describe 901 individuals with diabetes and include the following variables:

insulin_dep: whether the patient has insulin-dependent or non-insulin-dependent diabetes
less_than_40: whether or not the individual is less than 40 years old
survival: whether or not the individual survived the length of the study

diabetes = read_csv("data/diabetes.csv")

One might be interested in the mortality associated with each type of diabetes.

# code here

Is the aggregate reported above misleading and if so, why?

# code here
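
As a rough guide to the two steps above, the sketch below compares aggregate and age-stratified mortality on toy counts. The counts are invented for illustration and are not the Julious and Mullee data; the value coding ("yes"/"no", "died"/"survived") and the count column `n` are also assumptions, since the contents of `data/diabetes.csv` are not shown here.

```r
library(dplyr)

# Toy counts, invented for illustration only (NOT the study data).
# The value labels and the count column `n` are assumptions.
toy_counts <- tibble(
  insulin_dep  = c("yes", "yes", "yes", "yes", "no", "no", "no", "no"),
  less_than_40 = c("yes", "yes", "no", "no", "yes", "yes", "no", "no"),
  survival     = rep(c("died", "survived"), times = 4),
  n            = c(2, 98, 25, 25, 0, 10, 90, 110)
)

# Aggregate mortality: P(died | insulin_dep), ignoring age
aggregate_rates <- toy_counts %>%
  group_by(insulin_dep) %>%
  summarize(mortality = sum(n[survival == "died"]) / sum(n))

# Stratified mortality: P(died | insulin_dep, less_than_40)
stratified_rates <- toy_counts %>%
  group_by(less_than_40, insulin_dep) %>%
  summarize(mortality = sum(n[survival == "died"]) / sum(n), .groups = "drop")

print(aggregate_rates)
print(stratified_rates)
```

In the toy counts, the insulin-dependent group has the lower mortality in aggregate but the higher mortality within each age group, which is the kind of reversal Simpson’s paradox refers to. If the real data have one row per individual, `count(insulin_dep, less_than_40, survival)` would be one way to produce counts of this shape first.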

¹ Julious, S. A., and M. A. Mullee. “Confounding and Simpson’s paradox.” BMJ (Clinical research ed.) vol. 309, no. 6967 (1994): 1480-1. doi:10.1136/bmj.309.6967.1480

Conditional Probability

02.11.2022
