1 General Purpose
If in a parallel-group trial the
patient characteristics are equally distributed between the two
treatment groups, then any difference in outcome can be attributed
to the different effects of the treatments. However, if not, we
have a problem. The difference between the treatment groups may be
due, not only to the treatments given, but also to differences in
characteristics between the two treatment groups. The latter
differences are called confounders or confounding variables.
Assessment for confounding is explained.
2 Schematic Overview of Type of Data File

3 Primary Scientific Question
Is one treatment better than the other,
in spite of confounding in the study?
4 Data Example
A 40-patient parallel-group study
assesses the efficacy of a sleeping pill versus placebo. We suspect
that confounding may be in the data: the females may have received
the placebo more often than the males.
Outcome | Treat | Gender
---|---|---
3,49 | 0,00 | 0,00
3,51 | 0,00 | 0,00
3,50 | 0,00 | 0,00
3,51 | 0,00 | 0,00
3,49 | 0,00 | 0,00
3,50 | 0,00 | 0,00
3,51 | 0,00 | 0,00
3,49 | 0,00 | 0,00
3,50 | 0,00 | 0,00
3,49 | 0,00 | 0,00
The first 10 patients of the 40 patient
study are given above. The entire data file is in
extras.springer.com, and is entitled “chapter22confounding”. Start
by opening the data file in SPSS.
5 Some Graphs of the Data
We will then draw the mean results of
the treatment modalities with their error bars.
Command:
- Graphs….Legacy Dialogs….Error Bars….mark Summaries for groups of cases….Define….Variable: hoursofsleep….Category Axis: treat….Confidence Interval for Means: 95 %….click OK.

The above graph shows that treatment 1
tended to perform a bit better than treatment 0, but, given the
95 % confidence intervals (CIs), the difference is not
statistically significant. Females tend to sleep better than
males, and we suspect that confounding may be present in the data:
the females may have received the placebo more often than the males.
We therefore draw a graph with the mean results in the two
genders.
Command:
- Graphs….Legacy Dialogs….Error Bars….mark Summaries for groups of cases….Define….Variable: hoursofsleep….Category Axis: gender….Confidence Interval for Means: 95 %….click OK.

The graph shows that the females tend
to perform better than the males. However, again the confidence
intervals are too wide to be compatible with a statistically
significant difference. We will subsequently perform simple
linear regressions with treatment modality and gender, respectively,
as predictors.
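The error bar graphs above summarize each group by its mean and a 95 % confidence interval. A minimal sketch of that calculation follows. It is not SPSS output: the helper names `mean_ci` and `summarize` are hypothetical, the values are made up for illustration and are not the chapter's data file, and the interval uses a normal approximation, whereas SPSS uses the t-distribution and therefore gives somewhat wider intervals in small groups.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    m = mean(values)
    se = stdev(values) / sqrt(len(values))      # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95 %
    return m, m - z * se, m + z * se


def summarize(outcome, group):
    """Mean and approximate CI of the outcome for each level of group."""
    result = {}
    for level in sorted(set(group)):
        values = [y for y, g in zip(outcome, group) if g == level]
        result[level] = mean_ci(values)
    return result


# Hypothetical illustrative values, NOT the chapter's data file.
hours = [3.49, 3.51, 3.50, 3.51, 3.49, 3.50, 3.52, 3.50, 3.51, 3.50]
treat = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

for level, (m, lower, upper) in summarize(hours, treat).items():
    print(f"treat={level}: mean={m:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Overlapping intervals, as in the two graphs of this section, suggest but do not prove that the difference between the groups is not statistically significant.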
6 Linear Regression Analyses
For the analysis, the statistical model
Linear in the module Regression is required.
Command:
- Analyze….Regression….Linear….Dependent: hoursofsleep….Independent: treatment modality….click OK.
Coefficients
Model | | Unstandardized coefficients: B | Unstandardized coefficients: Std. error | Standardized coefficients: Beta | t | Sig.
---|---|---|---|---|---|---
1 | (Constant) | 3,495 | ,004 | | 918,743 | ,000
 | Treatment | ,010 | ,005 | ,302 | 1,952 | ,058
The above table shows that treatment
modality is not a significant predictor of the outcome at the
0,050 level (p = 0,058).
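With a single predictor coded 0/1, the constant of a simple linear regression equals the mean outcome of group 0, and B equals the difference between the two group means. The sketch below illustrates this and also computes the standard error and t-value shown in the table columns. The function name `simple_ols` is hypothetical, and the values are made up for illustration; they are not the chapter's data file.

```python
from math import sqrt
from statistics import mean


def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, se_b, t_b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = sqrt(rss / (len(x) - 2) / sxx)
    return a, b, se_b, b / se_b


# Hypothetical illustrative values, NOT the chapter's data file.
treat = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
hours = [3.49, 3.51, 3.50, 3.51, 3.49, 3.50, 3.52, 3.50, 3.51, 3.50]

a, b, se_b, t_b = simple_ols(treat, hours)
group0 = [y for y, t in zip(hours, treat) if t == 0]
group1 = [y for y, t in zip(hours, treat) if t == 1]

print(f"constant = {a:.4f}, mean of group 0 = {mean(group0):.4f}")
print(f"B = {b:.4f}, difference of means = {mean(group1) - mean(group0):.4f}")
print(f"Std. error = {se_b:.4f}, t = {t_b:.3f}")
```

The p-value in the Sig. column follows from comparing t with a t-distribution with n − 2 degrees of freedom, which is not computed here.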
We will also use linear regression
with gender as predictor and the same outcome variable.
Command:
- Analyze….Regression….Linear….Dependent: hoursofsleep….Independent: gender….click OK.
Coefficients
Model | | Unstandardized coefficients: B | Unstandardized coefficients: Std. error | Standardized coefficients: Beta | t | Sig.
---|---|---|---|---|---|---
1 | (Constant) | 3,505 | ,004 | | 921,504 | ,000
 | Gender | −,010 | ,005 | −,302 | −1,952 | ,058
Gender is also not a significant
predictor of the outcome, hours of sleep, at the 0,050 level
(p = 0,058). Confounding between treatment modality and gender is
suspected. We will perform a multiple linear regression with both
treatment modality and gender as independent variables.
Command:
- Analyze….Regression….Linear….Dependent: hoursofsleep….Independent: treatment modality, gender….click OK.
Coefficients
Model | | Unstandardized coefficients: B | Unstandardized coefficients: Std. error | Standardized coefficients: Beta | t | Sig.
---|---|---|---|---|---|---
1 | (Constant) | 3,500 | ,003 | | 1005,280 | ,000
 | Gender | −,021 | ,005 | −,604 | −3,990 | ,000
 | Treatment | ,021 | ,005 | ,604 | 3,990 | ,000
The above table shows that, indeed,
both gender and treatment are highly significant predictors of the
outcome after adjustment for one another (p < 0,001).

The above figure tries to explain what
is going on. If one gender seldom receives treatment 0 and the other
gender seldom receives treatment 1, then an overall regression line
will be close to horizontal, giving rise to the erroneous
conclusion that the treatment modalities do not differ in
efficacy.
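The flattening of the overall regression line can be reproduced with a small synthetic example. The sketch below is an illustration under stated assumptions, not a reanalysis of the chapter's data file: the cell counts (15/5/5/15), the noise-free outcome formula, and the helper names `centered_sum`, `ols_one` and `ols_two` are all hypothetical. The group with the higher baseline sleep (gender 0) is given mostly treatment 0, and the other group mostly treatment 1.

```python
from statistics import mean


def centered_sum(u, v):
    """Sum of cross-products of u and v around their means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def ols_one(x, y):
    """Simple regression y = b0 + b1*x; returns (b0, b1)."""
    b1 = centered_sum(x, y) / centered_sum(x, x)
    return mean(y) - b1 * mean(x), b1


def ols_two(x1, x2, y):
    """Multiple regression y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2)."""
    s11, s22 = centered_sum(x1, x1), centered_sum(x2, x2)
    s12 = centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return mean(y) - b1 * mean(x1) - b2 * mean(x2), b1, b2


# Synthetic, noise-free data (assumption): (treat, gender, number of patients)
cells = [(0, 0, 15), (1, 0, 5), (0, 1, 5), (1, 1, 15)]
treat, gender, hours = [], [], []
for t, g, n in cells:
    for _ in range(n):
        treat.append(t)
        gender.append(g)
        # True effects: +0.02 for treatment 1, -0.02 for gender 1
        hours.append(3.50 + 0.02 * t - 0.02 * g)

_, unadjusted = ols_one(treat, hours)
_, adj_treat, adj_gender = ols_two(treat, gender, hours)

print(f"unadjusted treatment effect: {unadjusted:.3f}")
print(f"adjusted treatment effect:   {adj_treat:.3f}")
print(f"adjusted gender effect:      {adj_gender:.3f}")
```

In this synthetic example the unadjusted treatment coefficient is half the true effect built into the data (0.010 against 0.020), because the treated group contains more patients from the gender that sleeps less. Adjusting for gender recovers the full effect.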
This phenomenon is called confounding,
and can be dealt with in several ways: (1) subclassification
(Statistics on a Pocket Calculator, Part 1, Chapter 17, Springer
New York, 2011, by the same authors), (2) propensity scores and
propensity score matching (Statistics on a Pocket Calculator, Part
2, Chapter 5, Springer New York, 2012, by the same authors), and
(3) multiple linear regression as performed in this chapter. If
there are multiple confounders, like the traditional risk factors
for cardiovascular disease, then multiple linear regression becomes
impractical, because with many confounders this method loses power.
Instead, a propensity score can be constructed from the confounders,
one propensity score per patient, and the individual propensity
scores can be used as a covariate in a multiple regression model
(Statistics on a Pocket Calculator, Part 2, Chapter 5, Springer New
York, 2012, by the same authors).
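The propensity score approach can be written down compactly. The notation below is generic and is not taken from the chapter: T_i is the treatment indicator of patient i, x_{i1}, ..., x_{ik} are the k confounders, and e_i is the propensity score. Estimating e_i with a logistic model, as in the second line, is a common choice and is an assumption here.

```latex
% e_i: propensity score of patient i (probability of receiving treatment 1)
% T_i: treatment indicator (0 or 1); x_{i1}, ..., x_{ik}: the k confounders
\begin{align}
e_i &= P(T_i = 1 \mid x_{i1}, \ldots, x_{ik}), \\
\operatorname{logit}(e_i) &= \alpha_0 + \alpha_1 x_{i1} + \cdots + \alpha_k x_{ik}, \\
y_i &= b_0 + b_1 T_i + b_2 \hat{e}_i + \varepsilon_i .
\end{align}
```

The outcome model in the last line has only two predictors, however many confounders went into the estimated score \hat{e}_i, so it does not lose power in the way a regression with k separate covariates would.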
7 Conclusion
If in a parallel-group trial the
patient characteristics are equally distributed between the two
treatment groups, then any difference in outcome can be attributed
to the different effects of the treatments. However, if not, we
have a problem. The difference between the treatment groups may be
due, not only to the treatments given, but also to differences in
characteristics between the two treatment groups. The latter
differences are called confounders or confounding variables.
Assessment for confounding is explained.
8 Note
More background, theoretical, and
mathematical information is available in Statistics applied to
clinical studies 5th edition, Chap. 28, Springer Heidelberg
Germany, 2012, by the same authors.