Michal Abrahamowicz

Unmeasured Confounding, the Achilles Heel of Observational Epidemiological Research: Challenges and New Methods



Modern epidemiology increasingly relies on large population-based health databases. An example of paramount importance, for both public health and society at large, involves observational pharmacoepidemiological database studies of drug safety. Yet, while large databases are essential for detecting rare but serious adverse drug effects, they typically do not record several patient characteristics that may affect both the treatment choice and the health outcome of interest [1]. The resulting bias, due to unmeasured confounding by indication, is considered the 'Achilles heel' of modern pharmacoepidemiology, and developing new, validated analytical methods to reduce its impact is the top research priority [1].


To propose two new methods for reducing the impact of unobserved confounding in pharmacoepidemiology, validate both methods in simulations, and apply them in real-life drug safety studies.


The complexity of the unmeasured confounding problem requires different methods for different data structures. We propose two new methods, each applicable in a different situation. The first, the 'missing cause' method, does not require any data on, or even identification of, potential unmeasured confounders, and represents an alternative to the prescribing preference-based instrumental variable (IV) approach [2]. This method relies on discrepancies between (a) the treatment actually received by individual patients and (b) the treatment they would be expected to receive, given the patients' measured characteristics and their physicians' prescribing preferences [3]. We use the treatment-by-discrepancy interaction to (i) test for the presence of unmeasured confounding and (ii) correct the treatment effect estimate for the resulting bias [3]. The second, the 'martingale residual' (MR) method, applies to time-to-event analyses where additional confounders, unavailable in the main database, are measured in a smaller 'validation subsample' (VS) [4]. This method, proposed by Burne and Abrahamowicz, imputes the unmeasured confounders for all subjects in the main database, based on their relationships with exposure, measured confounders, and outcome (approximated through the MR), as estimated in the VS [5].
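As a rough illustration of the quantity that the second method uses to summarize the outcome, the following minimal sketch computes martingale residuals in Python. It is not the authors' implementation and makes simplifying assumptions: the residuals come from a covariate-free (null) model, so each residual is the event indicator minus the Nelson-Aalen cumulative hazard at the subject's follow-up time, whereas a real analysis would typically obtain them from a fitted Cox model. The function name and the toy data are hypothetical.

```python
import numpy as np

def null_martingale_residuals(time, event):
    """Martingale residuals from a covariate-free model (an assumption of this
    sketch): M_i = delta_i - H(t_i), with H the Nelson-Aalen cumulative hazard."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    # Distinct observed event times, in ascending order
    event_times = np.unique(time[event == 1])
    # Number of events and number still at risk at each distinct event time
    d = np.array([np.sum((time == t) & (event == 1)) for t in event_times])
    n = np.array([np.sum(time >= t) for t in event_times])
    cum_hazard_steps = np.cumsum(d / n)
    # Evaluate the step function H at each subject's own follow-up time
    idx = np.searchsorted(event_times, time, side="right")
    cum_hazard = np.concatenate(([0.0], cum_hazard_steps))[idx]
    return event - cum_hazard

# Made-up follow-up times and event indicators (1 = event, 0 = censored)
toy_time = [2, 3, 3, 5, 8, 9]
toy_event = [1, 0, 1, 1, 0, 1]
residuals = null_martingale_residuals(toy_time, toy_event)
print(residuals)
```

Each residual is the difference between the observed and the expected number of events for a subject, so the residuals sum to zero over the sample and are never larger than 1. In an imputation setting, such a residual could enter the imputation model as a single covariate that stands in for the time-to-event outcome.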


Simulations validated both methods and demonstrated their potential benefits [3,5]. Our missing cause estimates had much smaller bias than conventional estimates, much smaller variance than IV estimates [2], and the best overall accuracy [3]. Similarly, the MR-based imputation yielded practically unbiased estimates, which were generally more accurate than those from Propensity Score Calibration [4], conventional multivariable analyses, or standard imputation [5]. In a real-life application, the missing cause estimates showed a slight reduction of gastrointestinal risks for COX-2 inhibitors compared to traditional NSAIDs, in contrast to the risk increase suggested by the IV estimates, which had much wider confidence intervals [3]. MR-based imputation of additional confounders, from an external VS, suggested that the increased risk of diabetes associated with oral glucocorticoid use may be statistically non-significant and much lower than that estimated by a conventional Cox model [5].


Our new methods may enhance the accuracy and validity of some large database studies of drug safety.

[1] Avorn J. NEJM 2007; 357:2219.

[2] Brookhart MA, et al. Epidemiology 2006; 17:268-75.

[3] Abrahamowicz M, et al. Statistics in Medicine 2016; 35:1001-16.

[4] Stürmer T, et al. American J Epidemiology 2005; 162:279-89.

[5] Burne R & Abrahamowicz M. Martingale residual-based method .... Statistics in Medicine 2016 (In Press).