9/28/23
Slides: afranks.com/talks
Sensitivity to Unobserved Confounding in Studies with Factor-structured Outcomes, (JASA, 2023) https://arxiv.org/abs/2208.06552
Joint work with Jiajing Zheng (formerly UCSB), Jiaxi Wu (UCSB) and Alex D’Amour (Google)
Consider a treatment \(T\) and outcome \(Y\)
Interested in the population average treatment effect (PATE) of \(T\) on \(Y\): \[E[Y | do(T=t)] - E[Y | do(T=t')]\]
Observational data from the National Health and Nutrition Examination Survey (NHANES) on alcohol consumption.
Light alcohol consumption is positively correlated with blood levels of HDL (“good cholesterol”)
Define “light alcohol consumption” as 1-2 alcoholic beverages per day
Non-drinkers: self-reported drinking of one drink a week or less
Control for age, gender, and an indicator of educational attainment
Call:
lm(formula = Y[, "HDL"] ~ drinking + X)
Residuals:
Min 1Q Median 3Q Max
-5.0855 -0.6127 -0.0512 0.6389 4.2383
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.225550 0.091105 2.476 0.013412 *
drinking 0.597399 0.091917 6.499 1.11e-10 ***
Xage 0.006409 0.001452 4.415 1.09e-05 ***
Xgender 0.689557 0.049426 13.951 < 2e-16 ***
Xeduc 0.194338 0.051161 3.799 0.000152 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.9216 on 1434 degrees of freedom
Multiple R-squared: 0.1531, Adjusted R-squared: 0.1507
F-statistic: 64.81 on 4 and 1434 DF, p-value: < 2.2e-16
Call:
lm(formula = Y[, "Methylmercury"] ~ drinking + X)
Residuals:
Min 1Q Median 3Q Max
-2.3570 -0.7363 -0.0728 0.6242 4.1127
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.442044 0.096385 4.586 4.91e-06 ***
drinking 0.364096 0.097244 3.744 0.000188 ***
Xage 0.008186 0.001536 5.330 1.14e-07 ***
Xgender -0.062664 0.052290 -1.198 0.230966
Xeduc 0.269815 0.054126 4.985 6.95e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.975 on 1434 degrees of freedom
Multiple R-squared: 0.05209, Adjusted R-squared: 0.04945
F-statistic: 19.7 on 4 and 1434 DF, p-value: 8.41e-16
Pearson's product-moment correlation
data: hdl_fit$residuals and mercury_fit$residuals
t = 3.7569, df = 1437, p-value = 0.0001789
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.04718758 0.14953581
sample estimates:
cor
0.0986225
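The output above can be reproduced with commands along these lines (a sketch; `Y`, `drinking`, and `X` are assumed to be the preassembled outcome matrix, light-drinking indicator, and covariate matrix):

```r
# Regress each outcome on the drinking indicator plus observed covariates
hdl_fit     <- lm(Y[, "HDL"] ~ drinking + X)
mercury_fit <- lm(Y[, "Methylmercury"] ~ drinking + X)
summary(hdl_fit)
summary(mercury_fit)

# Test the correlation between the residuals of the two outcome models
cor.test(hdl_fit$residuals, mercury_fit$residuals)
```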
Residual correlation across outcomes may indicate unobserved confounding
The assumption of no unmeasured confounding (NUC) is unlikely to hold exactly. What then?
Calibrate assumptions about confounding to explore the range of plausible causal effects
Robustness: quantify how “strong” confounding has to be to nullify causal effect estimates
How do results change when we assume a priori that certain outcomes cannot be affected by treatments?
\(R^2_{T\sim U|X}\): partial fraction of treatment variance explained by confounders (given observed covariates)
\(R^2_{Y\sim U|T,X}\): partial fraction of outcome variance explained by confounders (given observed covariates and treatment)
What values of \(R^2_{Y\sim U|T, X}\) and \(R^2_{T \sim U | X}\) might be reasonable?
Can use observed covariates to generate benchmark values:
Compute \(R^2_{T \sim X_{j} | X_{-j}}\) for each covariate \(X_j\)
Compute \(R^2_{Y \sim X_{j} | X_{-j}, T}\) for each covariate \(X_j\) (see the sketch below)
Use domain knowledge to reason about most important confounders
From the sensemakr documentation (Cinelli, Ferwerda, and Hazlett 2020)
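A rough sketch of the benchmark computation described above, using the standard identity that the partial \(R^2\) of a single regressor equals \(t^2/(t^2 + \mathrm{df})\) in the full regression (the sensemakr package provides helpers for this kind of benchmarking; `hdl_fit`, `drinking`, and `X` are reused from the earlier sketch):

```r
# Partial R^2 of each regressor, given the others, from regression t-statistics
partial_r2_benchmarks <- function(fit) {
  tstat <- summary(fit)$coefficients[-1, "t value"]  # drop the intercept row
  tstat^2 / (tstat^2 + fit$df.residual)
}

partial_r2_benchmarks(hdl_fit)           # covariate rows benchmark R^2_{Y ~ X_j | X_-j, T}
partial_r2_benchmarks(lm(drinking ~ X))  # benchmarks for R^2_{T ~ X_j | X_-j}
```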
Assume the observed data mean and covariance can be expressed as follows: \[\begin{align}
E[Y \mid T = t, X=x] &= \check g(t, x)\\
Cov(Y \mid T = t, X = x) &= \Gamma\Gamma' + \Lambda,
\end{align}\]
\(U\) (an \(m\)-vector) and \(X\) are possible causes of \(T\) (a scalar) and \(Y\) (a \(q\)-vector)
\(X\) is observed but \(U\) is not.
\[\begin{align} &U = \epsilon_U\\ &T = f_\epsilon(X,U)\\ &Y = g(T, X) + \Gamma\Sigma_{u|t,x}^{-1/2}U + \epsilon_{y} \end{align}\]
Confounding bias is \(\Gamma\Sigma_{u|t,x}^{-1/2}\mu_{u \mid t,x}\)
\(\mu_{u \mid t,x}\) and \(\Sigma_{u|t,x}\) are the conditional mean and covariance of the unmeasured confounders
\[\begin{align} \mu_{u\mid t,x} &= \frac{\rho}{\sigma_{t \mid x}^{2}}\left(t-\mu_{t\mid x}\right), \\ \Sigma_{u \mid t,x} &= I_m-\frac{\rho \rho^{\prime}}{\sigma_{t\mid x}^{2}}, \end{align}\]
\(\rho\) is the partial correlation vector between \(T\) and \(U\)
Define \(0 \leq R^2_{T \sim U |X}:= \frac{||\rho||^2_2}{\sigma^2_{t\mid x}} < 1\) to be the squared norm of the partial correlation between \(T\) and \(U\) given \(X\)
To identify the factor loadings \(\Gamma\), we need \((q-m)^2-q-m\geq0\), and each confounder must influence at least three outcomes
The bound on the bias for outcome \(j\) is proportional to the norm of the factor loadings for that outcome
A single sensitivity parameter, \(R^2_{T \sim U \mid X}\), shared across all outcomes
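One way to see both claims from the expressions above, as a sketch via the Cauchy-Schwarz inequality with \(\Gamma_j\) denoting the \(j\)-th row of \(\Gamma\) (the paper's formal bound may differ in details):
\[|\text{Bias}_j(t)| = \left|\Gamma_j'\,\Sigma_{u\mid t,x}^{-1/2}\,\mu_{u \mid t,x}\right| \leq \|\Gamma_j\|_2\,\sqrt{\frac{R^2_{T\sim U\mid X}}{1-R^2_{T\sim U\mid X}}}\;\frac{|t-\mu_{t\mid x}|}{\sigma_{t\mid x}}\]
The outcome-specific factor is \(\|\Gamma_j\|_2\); the remaining factor depends on \(U\) only through \(R^2_{T\sim U\mid X}\) and is shared across all outcomes.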
\(R^2_{T \sim U | X}\) is unnatural for binary treatments
\(\Lambda\)-parameterization \(\leftrightarrow\) \(R^2_{T \sim U | X}\)-parameterization
Fix a \(\Lambda_\alpha\) such that \[Pr\left(\Lambda_\alpha^{-1} \leq \frac{e_0(X, U)/(1-e_0(X, U))}{e(X)/(1-e(X))}\leq \Lambda_\alpha\right)=1-\alpha\]
\(RV^\Gamma_j\) can be smaller or larger than \(RV^1\)
\(RV_j^{\Gamma} \geq XRV\) by definition
Gaussian data generating process \[\begin{align} T &= \beta' U + \epsilon_T \\ Y_j &= \tau_jT + \Gamma'\Sigma^{-1/2}_{u|t}U + \epsilon_y \end{align}\]
\(R^2_{T \sim U \mid X}=0.5\) from \(m=2\) unmeasured confounders
\(\tau_j = 0\) for \(Y_1\), \(Y_2\) and \(Y_{10}\)
\(\tau_j=1\) for all other outcomes
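A minimal simulation of this data-generating process (the sample size, seed, and loading values are illustrative assumptions, and the \(\Sigma^{-1/2}_{u|t}\) rescaling is folded into the loadings for simplicity):

```r
set.seed(1)
n <- 2000; m <- 2; q <- 10
U    <- matrix(rnorm(n * m), n, m)            # m = 2 unmeasured confounders
beta <- c(1, 1)                               # treatment loadings on U
# Choose Var(eps_T) = Var(beta'U) so that R^2_{T ~ U} = 0.5
Tt   <- drop(U %*% beta) + rnorm(n, sd = sqrt(sum(beta^2)))
tau  <- rep(1, q); tau[c(1, 2, 10)] <- 0      # null treatment effects for Y1, Y2, Y10
Gam  <- matrix(rnorm(q * m), q, m)            # rank-two outcome loadings (illustrative)
Y    <- outer(Tt, tau) + U %*% t(Gam) + matrix(rnorm(n * q), n, q)
```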
Fit a Bayesian linear regression on the 10 outcomes given the treatment
Assume a residual covariance with a rank-two factor structure
Plot ignorance regions assuming \(R^2_{T \sim U} \leq 0.5\)
Plot ignorance regions assuming \(R^2_{T \sim U} \leq 0.5\) and \(Y_1\) is null
Model: \[\begin{align} Y &\sim N(\tau T + \alpha 'X, \Gamma\Gamma' + \Lambda) \end{align}\]
\(E[Y| T, X, U] = \tau T + \alpha'X + \Gamma\Sigma^{-1/2}_{u|t}U\)
Residuals are approximately Gaussian
Fit a multivariate Bayesian linear regression with factor structured residuals on all outcomes
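A rough, non-Bayesian analogue of that fit, shown only to illustrate the structure (the paper uses a Bayesian model; this sketch regresses all outcomes on treatment and covariates with a multivariate lm and then fits a two-factor model to the residuals, reusing `Y`, `drinking`, and `X`):

```r
# Multivariate regression: one column of Y per outcome
mlm_fit   <- lm(Y ~ drinking + X)
resid_mat <- residuals(mlm_fit)

# Rank-two factor structure for the residual covariance: Gamma Gamma' + Lambda
fa <- factanal(resid_mat, factors = 2)
fa$loadings   # estimated loadings (correlation scale, identified only up to rotation)
```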
Use age, gender and an indicator of educational attainment to benchmark
\(\frac{1}{3.5} \leq \text{Odds}(X)/\text{Odds}(X_{-age}) \leq 3.5\) for 95% of observed values
For the gender and education indicators, the change in odds was between \(\frac{1}{1.5}\) and \(1.5\)
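A sketch of how these benchmark ranges might be computed, assuming `X` has columns named "age", "gender", and "educ" as in the regression output above:

```r
odds <- function(p) p / (1 - p)

# Estimated propensity of light drinking with all covariates vs. with age dropped
e_full  <- fitted(glm(drinking ~ X, family = binomial))
e_noage <- fitted(glm(drinking ~ X[, colnames(X) != "age"], family = binomial))

# Change in the treatment odds for 95% of observations
quantile(odds(e_full) / odds(e_noage), probs = c(0.025, 0.975))
```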
Assume light drinking has no effect on methylmercury levels
Prior knowledge unique to the multi-outcome setting can help inform assumptions about confounding
Sharper sensitivity analysis, when assumptions hold
Negative control assumptions can potentially provide strong evidence for or against robustness
Jiaxi Wu (UCSB)
Jiajing Zheng (formerly UCSB)
Alex D’Amour (Google Research)
Sensitivity to Unobserved Confounding in Studies with Factor-structured Outcomes https://arxiv.org/abs/2208.06552