Why matched case control




















Table 4. The findings are consistent when the mean age difference is Table 5. Table 6. In conclusion, unconditional and conditional logistic regression models perform similarly in testing and estimation except when the age distributions of exposed and unexposed subjects are 20 years apart.

When the two age distributions are 20 years apart, the unconditional model consistently gives a type I error below the acceptable range and is slightly less powerful than the conditional model under the alternative hypothesis. When the alternative hypothesis is true, the unconditional model significantly underestimates the effect of exposure while the conditional model consistently produces an unbiased estimate.

When the mean age of exposed subjects is 20 years older than that of unexposed subjects, cases are more likely to be matched to controls with the same exposure status and the association is diminished accordingly. The unconditional method ignores matching but adjusts for confounding in the framework of regression.
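The matching distortion described above can be illustrated with a minimal sketch for 1:1 matched pairs and a single binary exposure. In that special case the conditional estimate of the odds ratio reduces to the ratio of the two kinds of discordant pairs. The pair counts below are hypothetical, and the crude pooled estimate ignores both the pairing and the matching variable, so it is only a stand-in for an unmatched analysis, not the age-adjusted unconditional model evaluated in this article.

```python
def crude_odds_ratio(pairs):
    """Pooled (unmatched) odds ratio that ignores the pairing.

    Each pair is (case_exposed, control_exposed) with 0/1 entries.
    """
    a = sum(case for case, _ in pairs)   # exposed cases
    b = sum(ctrl for _, ctrl in pairs)   # exposed controls
    c = len(pairs) - a                   # unexposed cases
    d = len(pairs) - b                   # unexposed controls
    return (a * d) / (b * c)


def matched_pair_odds_ratio(pairs):
    """Conditional (matched-pair) odds ratio: uses discordant pairs only."""
    case_only = sum(1 for case, ctrl in pairs if case == 1 and ctrl == 0)
    control_only = sum(1 for case, ctrl in pairs if case == 0 and ctrl == 1)
    return case_only / control_only


# Hypothetical data: most pairs share the same exposure status (concordant).
pairs = (
    [(1, 1)] * 40    # case and control both exposed
    + [(0, 0)] * 40  # case and control both unexposed
    + [(1, 0)] * 15  # only the case exposed
    + [(0, 1)] * 5   # only the control exposed
)

crude = crude_odds_ratio(pairs)
matched = matched_pair_odds_ratio(pairs)
print(f"crude OR = {crude:.2f}, matched-pair OR = {matched:.2f}")
```

With these made-up counts the matched-pair estimate is 3.0 while the crude estimate is about 1.49: the many concordant pairs add equally to cases and controls and pull the pooled estimate toward the null.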

In general, the Mantel-Haenszel estimator and the logit-based estimator are similar when the data within strata, here age groups, are not too sparse. Without loss of generality, assume that age is grouped into a few age groups. Within each age group, denote by a, b, c, and d the four cell counts representing the numbers of exposed cases, exposed controls, unexposed cases, and unexposed controls, respectively.

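The Mantel-Haenszel odds ratio is given by

$$\mathrm{OR}_{MH} = \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i},$$

where i indexes the age groups and n_i = a_i + b_i + c_i + d_i is the total count in age group i. In the top and bottom age groups in particular, the ratio of the number of cases to the number of controls within each exposure status is close to the case-control matching ratio.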

The contributions of such an age group to the numerator and to the denominator tend to be similar, which drives the association toward the null value. In the simulations, we assumed well-powered studies in which every case can be matched to a control. This is reasonable because the question we attempt to address is whether matched case-control data need to be analyzed by a conditional logistic regression model.
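As a minimal sketch of this argument, the standard Mantel-Haenszel estimator can be computed from per-age-group cell counts (a, b, c, d) as defined above. The three strata below are hypothetical numbers chosen so that the youngest and oldest groups contribute equally to the numerator and the denominator; they are not data from the simulations.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio over a list of (a, b, c, d) tables.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical age strata (youngest, middle, oldest).
strata = [
    (2, 2, 18, 18),   # exposure rare; a*d equals b*c
    (12, 8, 18, 22),  # the only stratum carrying an association
    (18, 18, 2, 2),   # exposure common; a*d equals b*c
]

or_all = mantel_haenszel_or(strata)
or_middle_only = mantel_haenszel_or([strata[1]])
print(f"all strata: {or_all:.2f}, middle stratum only: {or_middle_only:.2f}")
```

In this toy example the middle stratum alone gives an odds ratio of about 1.83, while adding the two extreme strata lowers the summary estimate to about 1.48, because those strata add the same amount to the numerator and the denominator.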

Given a sufficiently large sample size, our conclusions generalize to other values of disease prevalence and exposure frequency. Again, the objective of this article is to compare the two methods on the same matched case-control data, not to compare unmatched and matched data from different study designs, where matched data tend to have a smaller sample size because some cases cannot be matched.

Our findings suggest that when cases and controls are matched on age only, the data are essentially loose-matching data, and unconditional logistic regression is a proper method as long as the age distributions of exposed and unexposed subjects are not far apart. Previous literature has discussed in depth the advantages of the unconditional regression model over its conditional alternative, such as convenience, ease of access, straightforward interpretation, and the potential to preserve unmatched controls. We argue that matched case-control studies have been underappreciated because of the misconception that matched case-control data can be analyzed only by matched methods.

A paper reviewed statistical methods of 37 matched case-control studies published in Among these studies, the majority matched on demographic variables only, namely age and sex. The conclusion was made as the authors claimed following the book of Breslow et al. Based on our findings, matched methods are not necessary for loose-matching data, e. While we believe that it is realistically rare to observe age distributions of exposed and unexposed subjects that are 20 years apart, this setting gives an example of how the matching distortion (matched cases and controls tend to share the same exposure status) causes the unconditional logistic regression model to fail.

In contrast, the matching distortion was corrected by including the matching variables in the conditional logistic regression model 12 , Although we only considered a single matching variable, i.

With an increasing number of matching variables, loose matching is less likely to hold in the data, e. However, the strength of loose matching is not always reflected by the number of matching variables.

Matching on neighborhood or matching based on relationships implicitly matches on numerous unmeasured variables, including unmeasurable ones. Such studies apparently generate genuinely matched data that need to be analyzed by matched methods. It should be cautioned that our findings are for matched case-control data and cannot be generalized to propensity score (PS) matched data. The PS method was developed to facilitate causal inference in the spirit of clinical trials. Matching in the PS method is performed on the probability of treatment assignment, which is determined by a selection of variables including confounders.

After controlling for these variables, it is assumed that the outcome is independent of treatment status. The study is typically a cohort study, and the purpose of PS matching is to ensure that the treatment groups are balanced with respect to these variables (conditional independence).
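To make the contrast with case-control matching concrete, here is a minimal sketch of matching on a propensity score. It assumes the scores have already been estimated, and it uses greedy 1:1 nearest-neighbor matching without replacement within a caliper, which is only one of several common PS matching schemes. The function name, the subject identifiers, the scores, and the caliper value are all hypothetical.

```python
def greedy_ps_match(treated, untreated, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, untreated: dicts mapping subject id -> propensity score.
    Returns a list of (treated_id, untreated_id) pairs; a treated subject
    stays unmatched when no untreated score lies within the caliper.
    """
    available = dict(untreated)  # copy, so the caller's dict is not modified
    matches = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining untreated subject on the propensity score.
        u_id = min(available, key=lambda k: abs(available[k] - t_score))
        if abs(available[u_id] - t_score) <= caliper:
            matches.append((t_id, u_id))
            del available[u_id]
    return matches


# Hypothetical, pre-estimated propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.90}
untreated = {"u1": 0.28, "u2": 0.33, "u3": 0.60, "u4": 0.10}

matches = greedy_ps_match(treated, untreated)
print(matches)
```

Here subjects are paired by treatment status on an estimated probability, whereas in a matched case-control study subjects are paired by disease status on observed variables such as age; the treated subject with score 0.90 is left unmatched because no untreated score falls within the caliper.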

In contrast, case—control studies are retrospective studies, and the exposure status is observed. While there is a debate about whether treated and untreated samples should be regarded as independent, which will inform the choice of statistical methods 17 , it is different from the question that we have tried to address in terms of study design and matching scheme.

The scope of this study is limited to case—control studies that perform matching on a few demographic variables and consider methods of unconditional and conditional logistic regression models. In addition, the simulation settings assume absolute matching success, no model misspecification, and no interaction between exposure and matching variables.

However, these assumptions can be relaxed, and doing so will require further investigation. The results from a linear regression model (unmatched method) and a linear mixed effects model assuming random effects for matching sets (matched method) were quite similar in terms of the regression coefficient and P value associated with case-control status, which supports our finding that case-control data matched on a few demographic variables can be properly analyzed by unmatched methods.

To conclude, it has been known that matched methods, e. Matched methods additionally are robust to the matching distortion. Unmatched methods, e. When the study design involves other complex features such as censoring and repeated measures, matching on a few demographic variables can be ignored if the confounding effect is not very large.

Standard methods such as Cox regression and generalized estimating equations can then be readily applied. Unmatched methods are also appealing for saving computational time when the same analysis needs to be repeated extensively, e.

In addition to matching, other factors also need to be considered, such as study design and practical feasibility when choosing a statistical method. All of the authors contributed significantly to study design, result interpretation, and manuscript preparation. The data simulations were conducted by C-LK.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The statistics below the table are automatically calculated and provide a basis for determining the relationship between the exposure element and a patient becoming ill.

The odds ratio is an indicator of the effect of exposure on the likelihood of becoming ill. In this example the odds ratio is 2. In this case, an odds ratio of 2. Exact tests should be used when cell counts are small.
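As a rough illustration of how such an odds ratio is obtained from a 2x2 table, the sketch below computes the odds ratio together with a large-sample (Woolf, log-based) 95% confidence interval. The cell counts are hypothetical and are not the example referred to above; the interval relies on a normal approximation, so it is not appropriate when cell counts are small, where exact methods should be used instead.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: exposed and ill,   b: exposed and not ill,
    c: unexposed and ill, d: unexposed and not ill.
    All four counts must be positive for this approximation.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
estimate, lower, upper = odds_ratio_with_ci(a=30, b=20, c=15, d=35)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 suggests an association between exposure and illness at the chosen confidence level, subject to the usual caveats about confounding and study design.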
