Observational studies

What If: Chapter 3

Elena Dudukina

2020-11-12

1 / 17

Observational studies as conditionally randomized experiments

If these assumptions hold
- Consistency: well-defined intervention (or all versions of the treatment are captured)
- Exchangeability: the conditional probability of receiving each level of the treatment depends only on the measured covariate(s), L
- Positivity: the probability of receiving each level of treatment conditional on L is greater than zero, i.e., positive
- Non-interference: the potential outcome (PO) of one individual does not depend on the treatment received by other individuals
These conditions are identifiability conditions
- Causal interpretation = data + assumptions
- Identifiability assumptions can be tracked on a DAG
In ideal randomized experiments the identifiability conditions hold by design

2 / 17

Instrumental variables

Demand different assumptions and a different set of identifiability criteria
3 / 17

Exchangeability

$Y^{a} ⊥⊥ A$
Had the treated been untreated, their risk of the PO $Y^{a}$ would have been the same as that of the untreated
Confounding is a lack of exchangeability
Confounders are variables that, when adjusted for, restore exchangeability, i.e., remove confounding
Untestable
4 / 17

library(dplyr)

# association: crude risk of Y_obs = 1 by treatment A
greek_gods_condrand %>% 
  group_by(A) %>% 
  count(Y_obs) %>% 
  mutate(
    denominator = sum(n),
    risk = round(n/sum(n), digits = 2)
  ) %>% 
  filter(Y_obs == 1)

## # A tibble: 2 x 5
## # Groups:   A [2]
##       A Y_obs     n denominator  risk
##   <dbl> <dbl> <int>       <int> <dbl>
## 1     0     1     3           7  0.43
## 2     1     1     7          13  0.54

5 / 17

# when controlling confounding by L using stratification
greek_gods_condrand %>% 
  group_by(L, A) %>% 
  count(Y_obs) %>% 
  mutate(
    denominator = sum(n),
    risk = round(n/sum(n), digits = 2)
  ) %>% 
  filter(Y_obs == 1)

## # A tibble: 4 x 6
## # Groups:   L, A [4]
##       L     A Y_obs     n denominator  risk
##   <dbl> <dbl> <dbl> <int>       <int> <dbl>
## 1     0     0     1     1           4  0.25
## 2     0     1     1     1           4  0.25
## 3     1     0     1     2           3  0.67
## 4     1     1     1     6           9  0.67
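
The contrast between the crude and the stratified risks can be summarised as risk ratios. The following is a minimal sketch: the `cells` table is rebuilt from the counts printed in the stratified output and is only a stand-in for `greek_gods_condrand`; the column names `n` and `events` are assumptions made for this illustration.

```r
library(dplyr)

# Counts copied from the stratified output (one row per L-by-A cell);
# a stand-in for greek_gods_condrand, not the original data object.
cells <- tibble(
  L      = c(0, 0, 1, 1),
  A      = c(0, 1, 0, 1),
  n      = c(4, 4, 3, 9),   # individuals per cell
  events = c(1, 1, 2, 6)    # individuals with Y_obs == 1
)

# Crude (unadjusted) risk ratio: collapse over L first
crude <- cells %>%
  group_by(A) %>%
  summarise(risk = sum(events) / sum(n), .groups = "drop")
crude_rr <- crude$risk[crude$A == 1] / crude$risk[crude$A == 0]

# Stratum-specific risk ratios: compare A = 1 with A = 0 within each level of L
stratum_rr <- cells %>%
  mutate(risk = events / n) %>%
  group_by(L) %>%
  summarise(rr = risk[A == 1] / risk[A == 0], .groups = "drop")

round(crude_rr, 2)   # 1.26
stratum_rr           # rr equals 1 in both strata
```

The crude association disappears within levels of L, which is what one would expect if L is the only confounder and treatment has no effect within strata.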

6 / 17

Conditionally randomized experiment

If L is the only source of confounding and conditional exchangeability holds, $Y^{a} ⊥⊥ A | L$
- this is "an observational study in which the probability of treatment A = 1 is 0.75 among those with L = 1 and 0.50 among those with L = 0"
- this is "a (non blinded) conditionally randomized experiment in which investigators randomly assigned treatment A = 1 with probability 0.75 to those with L = 1 and 0.50 to those with L = 0"

greek_gods_condrand %>% 
  group_by(L) %>% 
  count(A) %>% 
  mutate(
    denominator = sum(n),
    pr_A = round(n/sum(n), digits = 2)
  ) %>% 
  filter(A == 1)

## # A tibble: 2 x 5
## # Groups:   L [2]
##       L     A     n denominator  pr_A
##   <dbl> <dbl> <int>       <int> <dbl>
## 1     0     1     4           8  0.5 
## 2     1     1     9          12  0.75
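
To make the "conditionally randomized experiment" reading concrete, the assignment mechanism can be simulated. This is a sketch under assumptions: the sample size `n_sim`, the seed, and the share of individuals with L = 1 (0.6, matching 12 of 20 in the table) are choices made for illustration only.

```r
set.seed(2020)  # arbitrary seed, used only for reproducibility

n_sim <- 10000                              # hypothetical sample size
L <- rbinom(n_sim, size = 1, prob = 0.6)    # covariate L (12 of 20 have L = 1 in the table)

# Treatment is assigned at random, with a probability that depends only on L
pr_treat <- ifelse(L == 1, 0.75, 0.50)
A <- rbinom(n_sim, size = 1, prob = pr_treat)

# Proportion treated within each level of L
pr_A_by_L <- tapply(A, L, mean)
pr_A_by_L
```

With a large simulated sample, the proportions treated should be close to 0.50 for L = 0 and 0.75 for L = 1, the same pattern as in the observed data.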

7 / 17

Expert knowledge

Since exchangeability is untestable, domain knowledge is necessary to judge whether the exchangeability assumption is likely to hold
8 / 17

Positivity

Positive probability of observing each level of treatment in each stratum of L
$Pr[A = a | L = l] > 0$ for all $a$ and $l$
Only relevant for variables L required for exchangeability
Can be empirically verified (see chapter 12)
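
A simple empirical check of positivity, sketched on cell counts rebuilt from the stratified output (the `cells` table is a stand-in for `greek_gods_condrand`, and the object names are assumptions for this illustration):

```r
library(dplyr)

# Number of individuals in each L-by-A cell, copied from the printed output
cells <- tibble(
  L = c(0, 0, 1, 1),
  A = c(0, 1, 0, 1),
  n = c(4, 4, 3, 9)
)

# Pr(A = a | L = l) for every observed combination
positivity <- cells %>%
  group_by(L) %>%
  mutate(pr_A = n / sum(n)) %>%
  ungroup()

# An empty cell would simply be missing as a row, so check both that every
# L-by-A combination is present and that each probability is positive
all_cells_present <- nrow(positivity) ==
  n_distinct(positivity$L) * n_distinct(positivity$A)
positivity_holds <- all_cells_present && all(positivity$pr_A > 0)

positivity
positivity_holds
```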
9 / 17

Consistency

We observe a PO: the one under the treatment actually received
$Pr[Y^{a = 1} = 1 | A = 1] = Pr[Y = 1 | A = 1]$

Unpacking consistency
- definition of $Y^{a = 1}$ via a detailed $a$ (given value of treatment)
- linking the observed and the counterfactual outcome
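
The link between counterfactual and observed outcomes can be sketched on a toy table. All values below are hypothetical and chosen only for illustration; in real data only one of `Y0` and `Y1` is ever observed per individual.

```r
library(dplyr)

# Hypothetical potential outcomes for four individuals (illustrative values)
po <- tibble(
  id = 1:4,
  A  = c(0, 1, 1, 0),   # treatment actually received
  Y0 = c(0, 1, 0, 1),   # outcome had the individual been untreated
  Y1 = c(1, 1, 0, 0)    # outcome had the individual been treated
)

# Consistency: the observed outcome is the PO under the treatment received
po <- po %>%
  mutate(Y_obs = if_else(A == 1, Y1, Y0))

# Among the treated, the risk of Y^{a = 1} equals the observed risk
treated <- po %>%
  filter(A == 1) %>%
  summarise(risk_Y1 = mean(Y1), risk_Y_obs = mean(Y_obs))
treated
```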

10 / 17

Well-defined intervention paradigm

$Y^{a}$

Treatment as several versions of the intervention
- Are all observed and measured?
- Do all versions of the treatment have the same causal effect?
- Ill-defined values of $a$ lead to ill-defined PO $Y^{a}$ under the levels of treatment, so the causal contrast $Pr[Y^{a = 1} = 1] - Pr[Y^{a = 0} = 1]$ is not well-defined either
- Obesity/weight-loss example: duration, frequency, intensity, and type of the intervention of being "less obese"
- Challenging causal questions involving biological and social constructs/SES (p. 34)
- Sufficiently well-defined means specified in enough detail for causal inference
- Domain knowledge
- Communication of the results

11 / 17

Counterfactuals and observed data

The "hypothetical intervention" must be linked to the actually observed version of treatment; otherwise the mathematical notation of consistency, $Pr[Y^{a = 1} = 1 | A = 1] = Pr[Y = 1 | A = 1]$, cannot be translated into the "real world" and no causal inference is possible
Data granularity
When dealing with treatments with multiple versions --> assuming treatment variation irrelevance
Transparency
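
Why treatment variation irrelevance matters can be shown with simple arithmetic. The sketch below uses made-up risks and version mixes (none of these numbers come from the slides or the book): when versions of the "same" treatment carry different risks, the risk under a = 1 depends on which versions a population happens to receive.

```r
# Hypothetical risks of the outcome under two versions of treatment a = 1
# (illustrative numbers only)
risk_by_version <- c(version_1 = 0.20, version_2 = 0.40)

# Share of each version received in two hypothetical populations
mix_pop_a <- c(version_1 = 0.9, version_2 = 0.1)
mix_pop_b <- c(version_1 = 0.1, version_2 = 0.9)

# Risk under "a = 1" is a mix-weighted average of the version-specific risks
risk_pop_a <- sum(risk_by_version * mix_pop_a)
risk_pop_b <- sum(risk_by_version * mix_pop_b)
c(pop_a = risk_pop_a, pop_b = risk_pop_b)   # 0.22 and 0.38

# Under treatment variation irrelevance the versions share one risk,
# so the mix of versions no longer matters
risk_irrelevant <- c(version_1 = 0.30, version_2 = 0.30)
risk_irr_a <- sum(risk_irrelevant * mix_pop_a)
risk_irr_b <- sum(risk_irrelevant * mix_pop_b)
c(pop_a = risk_irr_a, pop_b = risk_irr_b)
```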

12 / 17

Target trial

Causal effect
- a contrast between average counterfactual outcomes under different treatment values
Imagine a (hypothetical) randomized experiment to quantify it
Components of the "protocol"
- Eligibility criteria
- Interventions (or treatment strategies)
- Outcome(s)
- Follow-up
- Causal contrast
- Statistical analysis
Oversimplified analysis example
- Contrasting the risk of death in obese vs non-obese individuals means emulating a target trial in which obese individuals instantaneously become non-obese

13 / 17

Causal effect, Pearl 2018

Does Obesity Shorten Life? Or is it the Soda? On Non-manipulable Causes
Factors not fitting into the experimentalist concept of causation

44% of U.S. adults could be obese by 2030, compared to 35.7% in 2012

$66 billion a year in obesity-related medical costs
New York City adopted a regulation banning the sale of sugary drinks in containers larger than 16 ounces at restaurants
Is obesity well-defined, and do "obesity-related medical costs" exist?

14 / 17

Arguments against obesity as an exposure

BMI used to measure degree of obesity is a proxy for it
Obesity is not well-defined "intervention" --> consistency does not hold
Ill-defined intervention may undermine the exchangeability logic
- If we cannot define the exposure, we cannot define what may confound its effect on the outcome
Ill-defined intervention may threaten positivity
- Restricting the data to confounder levels within which positivity holds may result in a population different from the original one

15 / 17

Arguments for obesity as an exposure (in short)

Practical interventions may have side effects --> yet, are deemed well-defined
Root of obesity being ill-defined "intervention"
- consequences of obesity depend on how we "manipulate" it
At the same time, the quantity $Pr(mortality | do(obesity))$ (PO notation using the do-operator; it means the same as $Pr[mortality^{obesity = 1} = 1]$) describes an intervention set by nature (via complex processes)
Causal effects of anatomical/physiological conditions may be described in terms of their presence/absence, not necessarily via the means by which they can be manipulated

16 / 17

Take home messages

Define the causal question
Define the exposure. Does it have one version or several? Is inference still possible?
Can conditional exchangeability be reached given current domain knowledge?
Is prediction a better target when the exposure cannot be sufficiently well-defined?
17 / 17
