I have organised the grumpiness score data in several formats. In photo_long.csv and photo_wide.csv there is the average grumpiness score for each photo, where I've averaged over respondents. In the long format there is a row for each photo, and the column person refers to the person being photographed. In the wide format there is a row for each person, and the scores under grumpy and happy conditions are in the columns y.grumpy and y.happy respectively.
Analyse the effect of photo type (grumpy versus happy) on the score using both lm and lmer.
Hint
You will probably get a meaningless error when you first do this. It is caused by there being missing values in the response variable (i.e. not all respondents have scored all 44 photos) and/or predictors, so you will need to fix this.
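A minimal sketch of the two usual fixes; the data frame and column names here are made up for illustration:

```r
# toy data with a missing response
dat <- data.frame(y = c(1.2, NA, 0.7, 1.5),
                  g = factor(c("a", "a", "b", "b")))

# option 1: drop the incomplete rows yourself before fitting
dat_cc <- subset(dat, !is.na(y))
nrow(dat_cc)  # 3 complete rows remain

# option 2: let the fitting function drop them, e.g.
# lmer(y ~ 1 + (1 | g), data = dat, na.action = na.omit)
```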
Solution
The simple lm
model we’ve fitted previously:
m1<-lm(y~type, data=photo_long)
summary(m1)
##
## Call:
## lm(formula = y ~ type, data = photo_long)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1955 -0.9350 -0.1060 0.8071 2.6815
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.3759 0.2619 20.529 < 2e-16 ***
## typehappy -1.2283 0.3703 -3.317 0.00189 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.228 on 42 degrees of freedom
## Multiple R-squared: 0.2076, Adjusted R-squared: 0.1887
## F-statistic: 11 on 1 and 42 DF, p-value: 0.001886
For the random-effect model the obvious thing to do would be to fit person effects as random, since each person has been photographed twice:
m2<-lmer(y~type+(1|person), data=photo_long)
summary(m2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person)
## Data: photo_long
##
## REML criterion at convergence: 135.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.0521 -0.5446 -0.1781 0.5511 1.8192
##
## Random effects:
## Groups Name Variance Std.Dev.
## person (Intercept) 0.8005 0.8947
## Residual 0.7082 0.8416
## Number of obs: 44, groups: person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.3759 0.2619 20.529
## typehappy -1.2283 0.2537 -4.841
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.484
Our best estimate is that roughly half the variance in grumpy score can be attributed to the person photographed.
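Using the variance estimates read off the summary above, this proportion is easy to compute by hand:

```r
# variance components from summary(m2)
v_person   <- 0.8005
v_residual <- 0.7082
v_person / (v_person + v_residual)  # ≈ 0.53: about half the variance
```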
Solution
Yes! The standard error of the difference goes down in the mixed model because each person is scored under both photo types, so the comparison between happy and grumpy photos is made within, rather than between, the random-effect levels.
Using a t-test (t.test) and the wide data, try to get the same answer as in the linear mixed model (you will need to read the t.test help file).
Solution
We can pass the two sets of scores to the t.test
function. If we rely on the default arguments we get
t.test(photo_wide$y.happy, photo_wide$y.grumpy)
##
## Welch Two Sample t-test
##
## data: photo_wide$y.happy and photo_wide$y.grumpy
## t = -3.3168, df = 33.555, p-value = 0.002197
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.9813329 -0.4753408
## sample estimates:
## mean of x mean of y
## 4.147528 5.375865
and the t-value and p-value are different from what we got from the mixed model: the t-value is smaller, and consequently the p-value is larger. However, the t.test function defaults to an unpaired t-test, which does not take into account that the scores can be grouped into pairs based on who was photographed. By specifying paired=TRUE we can fit a paired t-test:
t.test(photo_wide$y.happy, photo_wide$y.grumpy, paired=TRUE)
##
## Paired t-test
##
## data: photo_wide$y.happy and photo_wide$y.grumpy
## t = -4.8409, df = 21, p-value = 8.735e-05
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -1.7560238 -0.7006499
## sample estimates:
## mean difference
## -1.228337
and we can see that the t-value and p-value are identical to what we got using the mixed model. This is not a fluke. However, the mixed model is more flexible: we can add covariates, and we can deal with the fact that we may have more than two photos for each person, or indeed that people may have been photographed a variable number of times (including once).
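The paired t-test is just a one-sample t-test on the within-person differences, which is why it matches the mixed model here. A small sketch with simulated scores (all numbers made up) illustrates this:

```r
set.seed(1)
n <- 22
grumpy <- rnorm(n, mean = 5)                 # toy grumpy-photo scores
happy  <- grumpy - 1.2 + rnorm(n, sd = 0.5)  # toy happy-photo scores

d <- happy - grumpy
t_manual <- mean(d) / (sd(d) / sqrt(n))      # t-statistic on the differences
t_paired <- unname(t.test(happy, grumpy, paired = TRUE)$statistic)
all.equal(t_paired, t_manual)                # TRUE: the two are the same test
```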
The file photo_long_full.csv is 5368 lines long. y is the score that each respondent gave to each photo. In addition to some information about the person photographed (age and fpub) there is also some information about the respondent (whether they are in IEB and whether they are a student). Using lmer, fit a model with random effects for both the person photographed and the respondent.
Solution
m3<-lmer(y~type+(1|person)+(1|respondent), data=photo_long_full, na.action=na.omit)
summary(m3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person) + (1 | respondent)
## Data: photo_long_full
##
## REML criterion at convergence: 21779.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8599 -0.6112 -0.0428 0.5783 4.4956
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent (Intercept) 0.5287 0.7271
## person (Intercept) 1.1416 1.0684
## Residual 3.2102 1.7917
## Number of obs: 5350, groups: respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.37354 0.23963 22.42
## typehappy -1.22946 0.04899 -25.09
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.102
The total variation (obtained by summing our 3 variance estimates) is approximately 5. Consequently, our best estimate is that around 10% of the variance in scores can be attributed to respondent, and around 20% to the person being photographed. We can interpret a non-zero respondent variance as meaning that people vary in how grumpy they perceive people: some respondents see the good in people, others the bad. We can interpret a non-zero person variance as meaning that people vary in how grumpy they look: some people just look grumpier than others irrespective of the conditions.
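These proportions can be computed directly from the variance estimates in the summary above:

```r
# variance components from summary(m3)
v_respondent <- 0.5287
v_person     <- 1.1416
v_residual   <- 3.2102

v_total <- v_respondent + v_person + v_residual
v_total                                                  # ≈ 4.88, "approximately 5"
c(respondent = v_respondent, person = v_person) / v_total
# respondent ≈ 0.11, person ≈ 0.23
```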
If we look at the confidence intervals for these variances (the standard deviations actually):
confint(m3)
## Computing profile confidence intervals ...
## 2.5 % 97.5 %
## .sig01 0.6316426 0.8422168
## .sig02 0.7954020 1.4551122
## .sigma 1.7576751 1.8265096
## (Intercept) 4.8954601 5.8516259
## typehappy -1.3254962 -1.1334310
we can see that they're quite well estimated, and there's good evidence that both random-effect variances are non-zero.
Although confint is useful for putting confidence intervals on parameters, it can't be used to put confidence intervals on functions of parameters. The proportion of the variation explained by one particular term (e.g. person) is a function of the three variance parameters. Try putting confidence intervals on the proportions of variance explained by person and respondent using the parametric bootstrap.
Hint
Rather than rooting around in the lmer
summary for the
estimates of the variance components, on slide 87 of the Mixed Models I
pdf (with notes) I have written a function for extracting them from a
lmer
model fit:
coefv<-function(x, var=TRUE){
if(!class(x)%in%c("lmerMod", "glmerMod")){
stop("x should be a lmer or glmer model")
}
vrandom<-unlist(lapply(VarCorr(x), function(x){attr(x, "stddev")}))
# get estimates of random effect standard deviations.
verror<-attr(VarCorr(x), "sc")
names(verror)<-"Residual"
# get estimate of residual standard deviations, and name it as such.
v<-c(vrandom, verror)
if(var){
v<-v^2
# square the standard deviations to obtain the variance estimates.
}
return(v)
}
Solution
We can start by simulating replicate data from the model
sim_y<-simulate(m3,1000)
However, you will notice that values have been simulated for those
observations which were missing. This is not ideal because the
parametric bootstrap will then be anti-conservative: it thinks we
have more information than we do. Rather than using
na.action=na.omit
, let's refit the model to the subset of the
data where the scores are not missing.
m3b<-lmer(y~type+(1|person)+(1|respondent), data=subset(photo_long_full, !is.na(y)))
As you can see, it gives identical answers:
summary(m3b)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person) + (1 | respondent)
## Data: subset(photo_long_full, !is.na(y))
##
## REML criterion at convergence: 21779.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8599 -0.6112 -0.0428 0.5783 4.4956
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent (Intercept) 0.5287 0.7271
## person (Intercept) 1.1416 1.0684
## Residual 3.2102 1.7917
## Number of obs: 5350, groups: respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.37354 0.23963 22.42
## typehappy -1.22946 0.04899 -25.09
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.102
and we can resimulate our data
sim_y<-simulate(m3b,1000)
Here I’ve written a function to extract the estimates of the standard
deviations (or variances if var=TRUE
) from a lmer
model:
coefv<-function(x, var=TRUE){
if(!class(x)%in%c("lmerMod", "glmerMod")){
stop("x should be a lmer or glmer model")
}
vrandom<-unlist(lapply(VarCorr(x), function(x){attr(x, "stddev")}))
# get estimates of random effect standard deviations.
verror<-attr(VarCorr(x), "sc")
names(verror)<-"Residual"
# get estimate of residual standard deviations, and name it as such.
v<-c(vrandom, verror)
if(var){
v<-v^2
# square the standard deviations to obtain the variance estimates.
}
return(v)
}
We can then fit our model to each replicate data set and extract the variance estimates.
vest<-apply(sim_y,2, function(x){coefv(refit(m3b,x))})
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00475863 (tol = 0.002, component 1)
We can calculate the proportion of variance explained by each term for each replicate data set:
prop<-apply(vest, 2, function(x){x/sum(x)})
The confidence intervals on the proportions of variance explained are then
apply(prop, 1, quantile, prob=c(0.025, 0.975))
## respondent.(Intercept) person.(Intercept) Residual
## 2.5% 0.0796774 0.1278855 0.5652675
## 97.5% 0.1447223 0.3467477 0.7530505
Solution
You probably have a gut feeling that you want to fit an interaction
between person
and respondent
. However, you
may be unsure how to do this in the context of a mixed model, and unsure
what additional parameters would be estimated and how you would
interpret them. The most straightforward approach is to fit the
model:
m4<-lmer(y~type+(1|person)+(1|respondent)+(1|respondent:person), data=photo_long_full, na.action=na.omit)
where we fit the interaction using the standard :
notation. Another way to think of this model is that it is like
pasting the respondent
and person
columns
together to get a new factor and simply fitting its effects as
random. For example, we could create the variable
respondent_and_person
:
photo_long_full$respondent_and_person<-paste(photo_long_full$respondent, photo_long_full$person)
which would then have levels of the form 1 peter_k
,
2 peter_k
and 1 alex_t
indicating the score is
from Respondent 1 scoring Peter Keightley, Respondent 2 scoring Peter
Keightley and Respondent 1 scoring Alex Twyford. If we then fitted
(1|respondent_and_person)
in the model instead of
(1|respondent:person)
we would get exactly the same answer.
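A quick toy check (the factor values here are made up) that the pasted factor and the : interaction define the same grouping of observations:

```r
respondent <- factor(c(1, 1, 2, 2))
person     <- factor(c("peter_k", "alex_t", "peter_k", "alex_t"))

pasted <- paste(respondent, person)
table(pasted)  # each respondent-person combination is its own group

# the pasted factor has the same number of groups as the : interaction
length(unique(pasted)) == nlevels(interaction(respondent, person, drop = TRUE))  # TRUE
```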
If we look at the results:
summary(m4)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person) + (1 | respondent) + (1 | respondent:person)
## Data: photo_long_full
##
## REML criterion at convergence: 21679.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6110 -0.5815 -0.0304 0.5525 3.8463
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent:person (Intercept) 0.6241 0.7900
## respondent (Intercept) 0.5152 0.7178
## person (Intercept) 1.1383 1.0669
## Residual 2.6029 1.6134
## Number of obs: 5350, groups:
## respondent:person, 2682; respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.37331 0.23911 22.47
## typehappy -1.22958 0.04413 -27.86
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.092
we can see that about 13% of the variation is due to the interaction, and the effect is highly significant:
anova(m4,m3)
## refitting model(s) with ML (instead of REML)
## Data: photo_long_full
## Models:
## m3: y ~ type + (1 | person) + (1 | respondent)
## m4: y ~ type + (1 | person) + (1 | respondent) + (1 | respondent:person)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m3 5 21784 21817 -10887 21774
## m4 6 21686 21726 -10837 21674 99.568 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
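As a quick check, the interaction's share of the variance can be computed from the variance estimates in summary(m4):

```r
# variance components from summary(m4)
v_int        <- 0.6241  # respondent:person interaction
v_respondent <- 0.5152
v_person     <- 1.1383
v_residual   <- 2.6029

v_int / (v_int + v_respondent + v_person + v_residual)  # ≈ 0.13
```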
We can interpret this interaction as saying that if we control for how grumpy someone looks (as measured through the average grumpy score for that person over respondents), and we control for how unsympathetic a respondent is (as measured through the average score they give over all photos), then we still see that pairs of scores for the same person by the same respondent deviate in the same direction.
For example, Respondent 1 might be a positive person and so on average give a score 1 unit lower than the average. Similarly, Mark Woolhouse might look grumpy and so on average get a grumpy score 2 units greater than the average. Consequently, we expect Respondent 1 to give Mark Woolhouse a score 1 unit greater than the overall average. However, if we looked at the pair of scores that Respondent 1 gave Mark Woolhouse we might see that both deviated by more than 1 unit. This might indicate that for some reason Respondent 1 thinks Mark Woolhouse is grumpier than we would expect based on their personality and Mark Woolhouse's face. Obviously, with only 2 scores the evidence for this is pretty weak. However, we have \(122\times22=2684\) pairs of observations, and this gives us plenty of information to test whether this type of interaction exists, even if we can't get precise information for any one interaction.
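The arithmetic in the Respondent 1 / Mark Woolhouse example can be sketched as follows (all deviations are hypothetical, and the mean is roughly the intercept from summary(m4)):

```r
overall_mean   <- 5.37  # roughly the intercept in summary(m4)
respondent_dev <- -1    # Respondent 1 is generous: 1 unit below average
person_dev     <- 2     # Mark Woolhouse looks grumpy: 2 units above average

# expected score without any respondent-by-person interaction
overall_mean + respondent_dev + person_dev  # 6.37, i.e. 1 unit above the mean
```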
Hint
You will not be able to answer this question without understanding the material in the final lecture. However, it is still useful to think about how there can be information in the data to address this question, and to try to formulate what type of effects you would like in the model even if you can't yet fit it.
Solution
We can fit the IEB effect as fixed:
m5<-lmer(y~type+IEB+(1|person)+(1|respondent)+(1|respondent:person), data=photo_long_full, na.action=na.omit)
summary(m5)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## y ~ type + IEB + (1 | person) + (1 | respondent) + (1 | respondent:person)
## Data: photo_long_full
##
## REML criterion at convergence: 21674.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.5983 -0.5796 -0.0334 0.5531 3.8571
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent:person (Intercept) 0.6241 0.7900
## respondent (Intercept) 0.4842 0.6958
## person (Intercept) 1.1383 1.0669
## Residual 2.6029 1.6134
## Number of obs: 5350, groups:
## respondent:person, 2682; respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.62409 0.25538 22.022
## typehappy -1.22958 0.04413 -27.865
## IEBYES -0.39230 0.14258 -2.751
##
## Correlation of Fixed Effects:
## (Intr) typhpp
## typehappy -0.086
## IEBYES -0.357 0.000
It seems IEB respondents score people as less grumpy than non-IEB respondents, and the size of the estimated difference is probably not due to chance (the magnitude of the t-value exceeds 2). However, is it because IEB respondents are nicer people and so would have given lower scores irrespective of who was photographed, or is it because they know the people being photographed, and people tend to give people they know less grumpy scores irrespective of their photo? It is impossible to know from this model.
However, if people modulated their score based on the personality of the person, I might expect the person effects to be more variable among IEB respondents than among non-IEB respondents, because the IEB respondents are influenced by their knowledge of the person's personality. For example, if one of the people photographed is really nasty, I might expect them to have a higher score among IEB respondents than among non-IEB respondents. Likewise, if one of the people photographed is a really wonderful person, I might expect them to have a lower score among IEB respondents than among non-IEB respondents. Consequently, the person scores would take on more extreme values (would have higher variance) among IEB respondents.
The model below allows the person effects to have different variances among IEB and non-IEB respondents, and also estimates the correlation between the two sets of person effects: to what degree are the people with high grumpy scores as assessed by IEB respondents also the people with high grumpy scores as assessed by non-IEB respondents?
m6<-lmer(y~type+IEB+(IEB-1|person)+(1|respondent)+(1|respondent:person), data=photo_long_full, na.action=na.omit)
summary(m6)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + IEB + (IEB - 1 | person) + (1 | respondent) + (1 |
## respondent:person)
## Data: photo_long_full
##
## REML criterion at convergence: 21665.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6651 -0.5788 -0.0384 0.5520 3.8520
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## respondent:person (Intercept) 0.6047 0.7777
## respondent (Intercept) 0.4849 0.6964
## person IEBNO 1.2405 1.1138
## IEBYES 1.1111 1.0541 0.97
## Residual 2.6028 1.6133
## Number of obs: 5350, groups:
## respondent:person, 2682; respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.62406 0.26432 21.278
## typehappy -1.22953 0.04413 -27.864
## IEBYES -0.39222 0.15536 -2.525
##
## Correlation of Fixed Effects:
## (Intr) typhpp
## typehappy -0.083
## IEBYES -0.434 0.000
The different groups have similar variances and the correlation is close to one. Perhaps IEB respondents are just nicer! However, a hypothesis test suggests that the two sets of person effects do have different distributions:
anova(m6,m5)
## refitting model(s) with ML (instead of REML)
## Data: photo_long_full
## Models:
## m5: y ~ type + IEB + (1 | person) + (1 | respondent) + (1 | respondent:person)
## m6: y ~ type + IEB + (IEB - 1 | person) + (1 | respondent) + (1 | respondent:person)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m5 7 21681 21727 -10834 21667
## m6 9 21676 21735 -10829 21658 9.0975 2 0.01058 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
so perhaps there is some evidence that people score photos differently depending on whether they know the person. Worth exploring, but the magnitude of the differences is pretty small.