1 General Purpose
Multivariate analysis is a method that assesses more than a single
outcome variable simultaneously. It differs from repeated measures
analysis of variance and mixed models, which assess both the
difference between the outcomes and the overall effects of the
predictors on the outcomes. Multivariate analysis assesses the
separate effects of the predictors on one outcome, adjusted for the
other. For example, it can answer clinically important questions
like: does drug compliance not only predict drug efficacy but also,
independently of the first effect, predict quality of life? Path
statistics can be used as an alternative approach to multivariate
analysis of variance (MANOVA) (Chap. 17). However, MANOVA is the
real thing, because it produces an overall level of significance
for a predictive model with multiple outcome and predictor
variables.
2 Schematic Overview of Type of Data File

3 Primary Scientific Question
Does the inclusion of additional outcome variables make it possible
to make better use of the predictor variables?
4 First Data Example
The effects of non-compliance and counseling on the treatment
efficacy of a new laxative were assessed in Chap. 16. For
multivariate analysis, quality of life scores were added as an
additional outcome variable. The first 10 patients of the data file,
also used in Chap. 17, are given below.
Stools | Qol | Counsel | Compliance
---|---|---|---
24,00 | 69,00 | 8,00 | 25,00
30,00 | 110,00 | 13,00 | 30,00
25,00 | 78,00 | 15,00 | 25,00
35,00 | 103,00 | 10,00 | 31,00
39,00 | 103,00 | 9,00 | 36,00
30,00 | 102,00 | 10,00 | 33,00
27,00 | 76,00 | 8,00 | 22,00
14,00 | 75,00 | 5,00 | 18,00
39,00 | 99,00 | 13,00 | 14,00
42,00 | 107,00 | 15,00 | 30,00
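If the rows above are processed outside SPSS, the decimal commas have to be converted first. The following is a minimal sketch in Python that uses only the ten rows printed above; the names `RAW_ROWS` and `parse_decimal_comma` are illustrative assumptions and are not part of the data file.

```python
# First 10 patients as printed above, in the order:
# stools, qol, counsel, compliance (decimal commas as in the SPSS output).
RAW_ROWS = [
    "24,00 69,00 8,00 25,00",
    "30,00 110,00 13,00 30,00",
    "25,00 78,00 15,00 25,00",
    "35,00 103,00 10,00 31,00",
    "39,00 103,00 9,00 36,00",
    "30,00 102,00 10,00 33,00",
    "27,00 76,00 8,00 22,00",
    "14,00 75,00 5,00 18,00",
    "39,00 99,00 13,00 14,00",
    "42,00 107,00 15,00 30,00",
]


def parse_decimal_comma(cell: str) -> float:
    """Convert a decimal-comma string such as '24,00' to a float."""
    return float(cell.replace(",", "."))


# One tuple per patient: (stools, qol, counsel, compliance).
rows = [tuple(parse_decimal_comma(c) for c in line.split()) for line in RAW_ROWS]

# The two outcome variables used in the multivariate analysis.
stools = [r[0] for r in rows]
qol = [r[1] for r in rows]

print(len(rows), "patients; first row:", rows[0])
print("mean stools:", sum(stools) / len(stools))
print("mean qol:", sum(qol) / len(qol))
```

These ten rows are only an excerpt, so any summary computed from them will differ from the results for the full data file.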
The entire data file is entitled
“chapter17multivariatewithpath”, and is in extras.springer.com.
Start by opening the data file in SPSS. The module General Linear
Model consists of four statistical models:
- Univariate,
- Multivariate,
- Repeated Measures,
- Variance Components.
We will use here the statistical model Multivariate.
We will first assess whether counseling frequency is a significant
predictor of both (1) improvement of stool frequency and (2)
improved quality of life.
Command:
- Analyze....General Linear Model....Multivariate....In dialog box Multivariate: transfer “therapeutic efficacy” and “qol” to Dependent Variables and “counseling” to Fixed Factors....OK.
Multivariate tests
Effect | Test | Value | F | Hypothesis df | Error df | Sig.
---|---|---|---|---|---|---
Intercept | Pillai’s Trace | ,992 | 1185,131 | 2,000 | 19,000 | ,000
Intercept | Wilks’ Lambda | ,008 | 1185,131 | 2,000 | 19,000 | ,000
Intercept | Hotelling’s Trace | 124,751 | 1185,131 | 2,000 | 19,000 | ,000
Intercept | Roy’s Largest Root | 124,751 | 1185,131 | 2,000 | 19,000 | ,000
Counseling | Pillai’s Trace | 1,426 | 3,547 | 28,000 | 40,000 | ,000
Counseling | Wilks’ Lambda | ,067 | 3,894 | 28,000 | 38,000 | ,000
Counseling | Hotelling’s Trace | 6,598 | 4,242 | 28,000 | 36,000 | ,000
Counseling | Roy’s Largest Root | 5,172 | 7,389 | 14,000 | 20,000 | ,000
The above table shows that MANOVA can be considered as another
regression model with intercepts and regression coefficients. Just
like analysis of variance (ANOVA), it assumes normal distributions
and homogeneity of variances. SPSS has checked the assumptions, and
the results as given indicate that the model is adequate for the
data. Generally, Pillai’s method is the most robust and Roy’s gives
the smallest p-values. We can conclude that counseling is a strong
predictor of both improvement of stools and improved quality of
life. In order to find out which of the two outcomes is more
important, two ANOVAs, one for each outcome separately, must be
performed.
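The four test statistics in the table are all functions of the same eigenvalues (those of the product of the inverse error matrix and the hypothesis matrix). The sketch below illustrates the standard relations. SPSS does not print the eigenvalues in this table, so the values used here are an assumption: they are back-calculated from the table by taking Roy’s largest root (5,172) as the largest eigenvalue and Hotelling’s trace (6,598) as the sum of both. The function name `manova_statistics` is illustrative.

```python
from math import prod


def manova_statistics(eigenvalues):
    """Return the four MANOVA test statistics for a list of eigenvalues."""
    return {
        # Pillai's trace: sum of lambda / (1 + lambda)
        "pillai": sum(ev / (1 + ev) for ev in eigenvalues),
        # Wilks' lambda: product of 1 / (1 + lambda)
        "wilks": prod(1 / (1 + ev) for ev in eigenvalues),
        # Hotelling's trace: sum of the eigenvalues
        "hotelling": sum(eigenvalues),
        # Roy's largest root: the largest eigenvalue
        "roy": max(eigenvalues),
    }


# Assumed eigenvalues for the counseling effect, back-calculated from the
# table: largest = Roy's root, second = Hotelling's trace minus Roy's root.
counseling = manova_statistics([5.172, 6.598 - 5.172])

# The intercept has a single eigenvalue, equal to its Hotelling's trace.
intercept = manova_statistics([124.751])

for label, stats in (("counseling", counseling), ("intercept", intercept)):
    for name, value in stats.items():
        print(f"{label:10s} {name:10s} {value:.3f}")
```

With these assumed eigenvalues the sketch reproduces, after rounding to three decimals, the Pillai’s Trace and Wilks’ Lambda values printed in the table. This shows why the four tests usually, but not always, lead to the same conclusion.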
Command:
- Analyze....General Linear Model....Univariate....In dialog box Univariate: transfer “therapeutic efficacy” to Dependent Variable and “counseling” to Fixed Factors....OK.
- Do the same for the outcome variable “qol”.
Tests of between-subjects effects (dependent variable: improv freq stool)
Source | Type III sum of squares | df | Mean square | F | Sig.
---|---|---|---|---|---
Corrected model | 2733,005 | 14 | 195,215 | 6,033 | ,000
Intercept | 26985,054 | 1 | 26985,054 | 833,944 | ,000
Counseling | 2733,005 | 14 | 195,215 | 6,033 | ,000
Error | 647,167 | 20 | 32,358 | |
Total | 36521,000 | 35 | | |
Corrected total | 3380,171 | 34 | | |
Tests of between-subjects effects (dependent variable: improve qol)
Source | Type III sum of squares | df | Mean square | F | Sig.
---|---|---|---|---|---
Corrected model | 6833,671 | 14 | 488,119 | 4,875 | ,001
Intercept | 223864,364 | 1 | 223864,364 | 2235,849 | ,000
Counseling | 6833,671 | 14 | 488,119 | 4,875 | ,001
Error | 2002,500 | 20 | 100,125 | |
Total | 300129,000 | 35 | | |
Corrected total | 8836,171 | 34 | | |
The above tables show that, also in the ANOVAs, counseling
frequency is a strong predictor not only of improvement of stool
frequency but also of improved quality of life (improv freq stool =
improvement of frequency of stools, improve qol = improved quality
of life scores).
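The F values in the two ANOVA tables follow directly from the sums of squares and degrees of freedom: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the counseling mean square to the error mean square. A minimal sketch of this arithmetic, with numbers copied from the tables above (the function name `anova_f` is illustrative):

```python
def anova_f(ss_effect, df_effect, ss_error, df_error):
    """Return (mean square effect, mean square error, F) for one ANOVA effect."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect, ms_error, ms_effect / ms_error


# Counseling as predictor of improvement of stool frequency.
stools = anova_f(ss_effect=2733.005, df_effect=14, ss_error=647.167, df_error=20)

# Counseling as predictor of improved quality of life.
qol = anova_f(ss_effect=6833.671, df_effect=14, ss_error=2002.500, df_error=20)

for label, (ms_effect, ms_error, f_value) in (("stools", stools), ("qol", qol)):
    print(f"{label}: MS effect = {ms_effect:.3f}, "
          f"MS error = {ms_error:.3f}, F = {f_value:.3f}")
```

The sketch stops at the F value; the p-value in the column Sig. additionally requires the F distribution with 14 and 20 degrees of freedom, which is not part of the Python standard library.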
In order to find out whether compliance with drug treatment is a
contributory predicting factor, MANOVA with two predictors and two
outcomes is performed. Instead of “counseling” alone, both
“counseling” and “compliance” are transferred to Fixed Factors. The
table below shows the results.
Multivariate tests
Effect | Test | Value | F | Hypothesis df | Error df | Sig.
---|---|---|---|---|---|---
Intercept | Pillai’s Trace | ,997 | 384,080 | 1,000 | 1,000 | ,032
Intercept | Wilks’ Lambda | ,003 | 384,080 | 1,000 | 1,000 | ,032
Intercept | Hotelling’s Trace | 384,080 | 384,080 | 1,000 | 1,000 | ,032
Intercept | Roy’s Largest Root | 384,080 | 384,080 | 1,000 | 1,000 | ,032
Counseling | Pillai’s Trace | ,933 | 1,392 | 10,000 | 1,000 | ,583
Counseling | Wilks’ Lambda | ,067 | 1,392 | 10,000 | 1,000 | ,583
Counseling | Hotelling’s Trace | 13,923 | 1,392 | 10,000 | 1,000 | ,583
Counseling | Roy’s Largest Root | 13,923 | 1,392 | 10,000 | 1,000 | ,583
Compliance | Pillai’s Trace | ,855 | ,423 | 14,000 | 1,000 | ,854
Compliance | Wilks’ Lambda | ,145 | ,423 | 14,000 | 1,000 | ,854
Compliance | Hotelling’s Trace | 5,917 | ,423 | 14,000 | 1,000 | ,854
Compliance | Roy’s Largest Root | 5,917 | ,423 | 14,000 | 1,000 | ,854
Counseling * compliance | Pillai’s Trace | ,668 | ,402 | 5,000 | 1,000 | ,824
Counseling * compliance | Wilks’ Lambda | ,332 | ,402 | 5,000 | 1,000 | ,824
Counseling * compliance | Hotelling’s Trace | 2,011 | ,402 | 5,000 | 1,000 | ,824
Counseling * compliance | Roy’s Largest Root | 2,011 | ,402 | 5,000 | 1,000 | ,824
After including the second predictor variable, the MANOVA is no
longer significant. Probably, the second predictor is a confounder
of the first one. The analysis of this model stops here.
5 Second Data Example
As a second example we use the data from Field (Discovering SPSS,
Sage London, 2005, p 571), assessing the effect of three treatment
modalities on compulsive behavior disorder, estimated by two
scores: a thought score and an action score. The first 10 patients
of the data file are given below.
Action | Thought | Treatment
---|---|---
5,00 | 14,00 | 1,00
5,00 | 11,00 | 1,00
4,00 | 16,00 | 1,00
4,00 | 13,00 | 1,00
5,00 | 12,00 | 1,00
3,00 | 14,00 | 1,00
7,00 | 12,00 | 1,00
6,00 | 15,00 | 1,00
6,00 | 16,00 | 1,00
4,00 | 11,00 | 1,00
The entire data file is in extras.springer.com, and is entitled
“chapter18multivariateanova”. Start by opening the data file. The
module General Linear Model consists of four statistical models:
- Univariate,
- Multivariate,
- Repeated Measures,
- Variance Components.
We will here again use the statistical model Multivariate.
Command:
- Analyze....General Linear Model....Multivariate....In dialog box Multivariate: transfer “action” and “thought” to Dependent Variables and “treatment” to Fixed Factors....OK.
Multivariate tests
Effect | Test | Value | F | Hypothesis df | Error df | Sig.
---|---|---|---|---|---|---
Intercept | Pillai’s Trace | ,983 | 745,230 | 2,000 | 26,000 | ,000
Intercept | Wilks’ Lambda | ,017 | 745,230 | 2,000 | 26,000 | ,000
Intercept | Hotelling’s Trace | 57,325 | 745,230 | 2,000 | 26,000 | ,000
Intercept | Roy’s Largest Root | 57,325 | 745,230 | 2,000 | 26,000 | ,000
Treatment | Pillai’s Trace | ,318 | 2,557 | 4,000 | 54,000 | ,049
Treatment | Wilks’ Lambda | ,699 | 2,555 | 4,000 | 52,000 | ,050
Treatment | Hotelling’s Trace | ,407 | 2,546 | 4,000 | 50,000 | ,051
Treatment | Roy’s Largest Root | ,335 | 4,520 | 2,000 | 27,000 | ,020
The Pillai test shows that the predictor (treatment modality) has a
significant effect on both thoughts and actions at p = 0,049. Roy’s
test, being less robust, gives an even smaller p-value of 0,020.
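The difference between the two tests can be traced to how each statistic is converted into an F value. The sketch below uses the standard F approximations for Pillai’s trace and Roy’s largest root. The degrees of freedom are copied from the table rather than derived, and s = 2 (the smaller of the number of outcomes and the hypothesis degrees of freedom of the treatment factor) is an assumption of this sketch. The function names are illustrative.

```python
def pillai_f(trace, s, df_hyp, df_err):
    """Approximate F for Pillai's trace V: V / (s - V) * df_err / df_hyp."""
    return trace / (s - trace) * df_err / df_hyp


def roy_f(largest_root, df_hyp, df_err):
    """Upper-bound F for Roy's largest root: root * df_err / df_hyp."""
    return largest_root * df_err / df_hyp


# Treatment effect, values as printed in the table (rounded to 3 decimals).
f_pillai = pillai_f(trace=0.318, s=2, df_hyp=4, df_err=54)
f_roy = roy_f(largest_root=0.335, df_hyp=2, df_err=27)

print(f"Pillai: F = {f_pillai:.2f} (table: 2,557)")
print(f"Roy:    F = {f_roy:.2f} (table: 4,520)")
```

Because the inputs are the rounded table values, the results agree with the printed F values only to about two decimals. Roy’s F is based on the largest root alone and is an upper bound, which explains why its p-value is smaller and should be read with more caution.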
We will again use ANOVAs to find out which of the two outcomes is
more important.
Command:
- Analyze....General Linear Model....Univariate....In dialog box Univariate: transfer “action” to Dependent Variable and “treatment” to Fixed Factors....OK.
- Do the same for the variable “thought”.
Tests of between-subjects effects (dependent variable: action)
Source | Type III sum of squares | df | Mean square | F | Sig.
---|---|---|---|---|---
Corrected model | 10,467 | 2 | 5,233 | 2,771 | ,080
Intercept | 616,533 | 1 | 616,533 | 326,400 | ,000
Treatment | 10,467 | 2 | 5,233 | 2,771 | ,080
Error | 51,000 | 27 | 1,889 | |
Total | 678,000 | 30 | | |
Corrected total | 61,467 | 29 | | |
Tests of between-subjects effects (dependent variable: thought)
Source | Type III sum of squares | df | Mean square | F | Sig.
---|---|---|---|---|---
Corrected model | 19,467 | 2 | 9,733 | 2,154 | ,136
Intercept | 6336,533 | 1 | 6336,533 | 1402,348 | ,000
Treatment | 19,467 | 2 | 9,733 | 2,154 | ,136
Error | 122,000 | 27 | 4,519 | |
Total | 6478,000 | 30 | | |
Corrected total | 141,467 | 29 | | |
The above two tables show that in the ANOVAs neither thoughts nor
actions are significantly predicted by treatment modality at
p < 0,05. This would mean that treatment modality is a rather weak
predictor of either of the outcomes: it is not able to
significantly predict a single outcome, but it does significantly
predict two outcomes pointing in a similar direction.
What advantages does MANOVA offer compared to multiple ANOVAs?
1. It prevents the type I error from being inflated.
2. It looks at interactions between dependent variables.
3. It can detect subgroup properties and includes them in the analysis.
4. It can demonstrate otherwise underpowered effects.
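The first advantage can be made concrete with a small calculation. If each outcome is tested in a separate ANOVA at level alpha, the chance of at least one false positive result grows with the number of tests. The sketch below assumes independent tests, which is a simplification because outcomes in the same patients are usually correlated. The function name `familywise_error` is illustrative.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one type I error in n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


# One overall test versus separate ANOVAs for 2 or 5 outcomes.
for n_tests in (1, 2, 5):
    rate = familywise_error(0.05, n_tests)
    print(f"{n_tests} test(s) at alpha = 0.05: familywise error = {rate:.4f}")
```

With two outcomes, as in both data examples, the familywise error under this assumption is 0,0975 instead of 0,05. A single overall MANOVA test keeps it at the nominal level.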
Multivariate analysis should not be used for explorative purposes
and data dredging, but should be based on sound clinical arguments.
A problem with multivariate analysis with binary outcome variables
is that the iterations often do not converge. Instead, multivariate
probit analysis, available in the STATA statistical software, can be
performed (see Chap. 25 in Statistics Applied to Clinical Studies,
5th edition, Springer Heidelberg Germany, 2012, from the same
authors).
6 Conclusion
Multivariate analysis simultaneously assesses the separate effects
of the predictors on one outcome variable, adjusted for another
outcome variable. For example, it can answer clinically important
questions like: does drug compliance not only predict drug efficacy
but also, independently of the first effect, predict quality of
life? Path statistics can be used as an alternative approach to
multivariate analysis of variance (MANOVA) (Chap. 17). However,
MANOVA is the real thing, because it produces an overall level of
significance for a predictive model with multiple outcome and
predictor variables. Post hoc ANOVAs are required to find out which
of the outcomes is more important.
7 Note
More background, theoretical, and mathematical information on
multivariate analysis with path statistics is given in Statistics
Applied to Clinical Studies, 5th edition, Chap. 25, Springer
Heidelberg Germany, 2012, from the same authors.