Confounding
A confounding variable is an extraneous variable in a statistical model that correlates with both the dependent variable and the independent variable.
Learning Objective

Break down why confounding variables may lead to bias and spurious relationships and what can be done to avoid these phenomena.
Key Points
 A perceived relationship between an independent variable and a dependent variable that has been misestimated due to the failure to account for a confounding factor is termed a spurious relationship.
 Confounding by indication, described as the most important limitation of observational studies, occurs when prognostic factors cause bias, such as biased estimates of treatment effects in medical trials.
 Confounding variables may also be categorised according to their source, such as operational confounds, procedural confounds, or person confounds.
 A reduction in the potential for the occurrence and effect of confounding factors can be obtained by increasing the types and numbers of comparisons performed in an analysis.
 Moreover, depending on the type of study design in place, there are various ways to modify that design to actively exclude or control confounding variables.
Terms

peer review
the scholarly process whereby manuscripts intended to be published in an academic journal are reviewed by independent researchers (referees) to evaluate the contribution, i.e. the importance, novelty and accuracy of the manuscript's contents

placebo effect
the tendency of any medication or treatment, even an inert or ineffective one, to exhibit results simply because the recipient believes that it will work

prognostic
a sign by which a future event may be known or foretold

confounding variable
an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable
Example
 In risk assessments, factors such as age, gender, and educational level often affect health status and so should be controlled. Beyond these factors, researchers may not consider other causal factors, or may not have access to data on them. An example is the study of the effects of smoking tobacco on human health. Smoking, drinking alcohol, and diet are related lifestyle activities. A risk assessment that looks at the effects of smoking but does not control for alcohol consumption or diet may overestimate the risk of smoking. Smoking and confounding are reviewed in occupational risk assessments, such as those on the safety of coal mining. When there is not a large sample population of nonsmokers or nondrinkers in a particular occupation, the risk assessment may be biased towards finding a negative effect on health.
Full Text
Confounding Variables
A confounding variable is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable. A perceived relationship between an independent variable and a dependent variable that has been misestimated due to the failure to account for a confounding factor is termed a spurious relationship, and the presence of misestimation for this reason is termed omitted-variable bias.
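The size of the omitted-variable bias can be made concrete for the simplest linear case. The following is a sketch under stated assumptions, not a result taken from the source: the symbols Y (dependent variable), X (independent variable), Z (confounder), and the coefficients are introduced here for illustration, and the error term is assumed to be uncorrelated with both X and Z.

```latex
% Assumed true model (symbols are illustrative, not from the source)
Y = \beta_0 + \beta_1 X + \beta_2 Z + \varepsilon,
\qquad \operatorname{Cov}(X,\varepsilon) = \operatorname{Cov}(Z,\varepsilon) = 0

% Slope obtained in large samples when Z is omitted and Y is regressed on X alone
\tilde{\beta}_1 \;\longrightarrow\;
\beta_1 + \beta_2 \,\frac{\operatorname{Cov}(X, Z)}{\operatorname{Var}(X)}
```

Under these assumptions the bias term vanishes only when Z has no effect on Y (\beta_2 = 0) or Z is uncorrelated with X, which matches the definition above: a confounder must be related to both the dependent and the independent variable.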
As an example, suppose that there is a statistical relationship between ice cream consumption and number of drowning deaths for a given period. These two variables have a positive correlation with each other. An individual might attempt to explain this correlation by inferring a causal relationship between the two variables (either that ice cream causes drowning, or that drowning causes ice cream consumption). However, a more likely explanation is that the relationship between ice cream consumption and drowning is spurious and that a third, confounding, variable (the season) influences both variables: during the summer, warmer temperatures lead to increased ice cream consumption as well as more people swimming and, thus, more drowning deaths.
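The ice cream example can be illustrated with a small simulation. This is a minimal sketch with made-up numbers: the temperature range, the coefficients, and the noise levels are assumptions chosen for illustration, and neither simulated variable has any direct effect on the other. The sketch compares the raw correlation with the partial correlation that holds temperature fixed.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: temperature drives both variables; they do not
# influence each other. All constants below are illustrative assumptions.
rng = random.Random(0)
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.1 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson(ice_cream, drownings)
adjusted_r = partial_corr(ice_cream, drownings, temperature)

print(f"raw correlation:                   {raw_r:.2f}")
print(f"correlation given temperature:     {adjusted_r:.2f}")
```

In this simulated setting the raw correlation is clearly positive, while the correlation that remains after controlling for temperature is close to zero, which is what a spurious relationship driven by a confounder looks like.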
Types of Confounding
Confounding by indication has been described as the most important limitation of observational studies. Confounding by indication occurs when prognostic factors cause bias, such as biased estimates of treatment effects in medical trials. Controlling for known prognostic factors may reduce this problem, but it is always possible that a forgotten or unknown factor was not included or that factors interact complexly. Randomized trials tend to reduce the effects of confounding by indication due to random assignment.
Confounding variables may also be categorised according to their source:
 The choice of measurement instrument (operational confound): This type of confound occurs when a measure designed to assess a particular construct inadvertently measures something else as well.
 Situational characteristics (procedural confound): This type of confound occurs when the researcher mistakenly allows another variable to change along with the manipulated independent variable.
 Interindividual differences (person confound): This type of confound occurs when two or more groups of units are analyzed together (e.g., workers from different occupations) despite varying according to one or more other (observed or unobserved) characteristics (e.g., gender).
Decreasing the Potential for Confounding
A reduction in the potential for the occurrence and effect of confounding factors can be obtained by increasing the types and numbers of comparisons performed in an analysis. If a relationship holds among different subgroups of analyzed units, confounding may be less likely. That said, if measures or manipulations of core constructs are confounded (i.e., operational or procedural confounds exist), subgroup analysis may not reveal problems in the analysis.
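Comparing subgroups can be sketched with a stratified risk-ratio calculation. The counts below are entirely hypothetical and were chosen so that smoking appears harmful in the pooled data only because smokers are more often drinkers; the names `strata` and `risk_ratio` are illustrative.

```python
def risk_ratio(exposed, unexposed):
    """Risk ratio from two (cases, total) pairs."""
    exposed_cases, exposed_total = exposed
    unexposed_cases, unexposed_total = unexposed
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical counts: (cases, total) for smokers and non-smokers,
# split by a possible confounder (alcohol consumption).
strata = {
    "drinkers": {"smokers": (40, 80), "non_smokers": (10, 20)},
    "non_drinkers": {"smokers": (2, 20), "non_smokers": (8, 80)},
}

# Pooled (crude) comparison that ignores alcohol consumption.
pooled = {
    group: (
        sum(s[group][0] for s in strata.values()),
        sum(s[group][1] for s in strata.values()),
    )
    for group in ("smokers", "non_smokers")
}
crude_rr = risk_ratio(pooled["smokers"], pooled["non_smokers"])

# The same comparison repeated inside each subgroup.
stratum_rr = {
    name: risk_ratio(s["smokers"], s["non_smokers"])
    for name, s in strata.items()
}

print(f"crude risk ratio: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"risk ratio among {name}: {rr:.2f}")
```

With these made-up counts the pooled risk ratio is well above 1, yet it equals 1 within each subgroup, so the pooled association is explained by the stratifying variable. A relationship that persisted within every subgroup would be harder to attribute to that particular confounder.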
Peer review is a process that can assist in reducing instances of confounding, either before study implementation or after analysis has occurred. Similarly, study replication can test for the robustness of findings from one study under alternative testing conditions or alternative analyses (e.g., controlling for potential confounds not identified in the initial study). Also, confounding effects are less likely to occur, and to act in the same way, across multiple times and locations.
Moreover, depending on the type of study design in place, there are various ways to modify that design to actively exclude or control confounding variables:
 Case-control studies assign confounders to both groups, cases and controls, equally. In case-control studies, the most commonly matched variables are age and sex.
 In cohort studies, a degree of matching is also possible, and it is often done by only admitting certain age groups or a certain sex into the study population. This creates a cohort of people who share similar characteristics; thus, all cohorts are comparable in regard to the possible confounding variable.
 Double blinding conceals the experiment group membership of the participants from the trial population and the observers. By preventing the participants from knowing if they are receiving treatment or not, the placebo effect should be the same for the control and treatment groups. By preventing the observers from knowing the participants' group membership, there should be no bias from researchers treating the groups differently or from interpreting the outcomes differently.
 A randomized controlled trial is a method where the study population is divided randomly in order to mitigate the chances of self-selection by participants or bias by the study designers. Before the experiment begins, the testers will assign the members of the participant pool to their groups (control, intervention, parallel) using a randomization process such as the use of a random number generator.
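Random assignment with a random number generator can be sketched as follows. This is a minimal illustration, not a trial-grade allocation procedure: the function name `assign_groups`, the participant labels, the two group names, and the seed are all hypothetical. It shuffles the participant pool and then deals participants to groups in turn, which is one simple way to obtain equal-sized groups.

```python
import random


def assign_groups(participants, groups=("control", "intervention"), seed=None):
    """Randomly split participants into equal-sized groups (sizes differ by at most one)."""
    rng = random.Random(seed)  # seeded generator so the allocation can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random order removes any influence of sign-up order

    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        # Deal the shuffled participants to the groups in turn.
        assignment[groups[index % len(groups)]].append(person)
    return assignment


# Hypothetical participant pool of 20 people.
participants = [f"P{i:02d}" for i in range(1, 21)]
allocation = assign_groups(participants, seed=42)

for group, members in allocation.items():
    print(group, len(members), members)
```

Because group membership depends only on the random shuffle, neither the participants nor the study designers choose who receives the intervention, so known and unknown confounders tend to be balanced across groups on average.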
Sources
Boundless vets and curates high-quality, openly licensed content from around the Internet. This particular resource used the following sources: