I have organised the grumpiness score data in several formats. In photo_long.csv and photo_wide.csv there is the average grumpiness score for each photo, where I've averaged over respondents. In the long format there is a row for each photo, and the column person refers to the person being photographed. In the wide format there is a row for each person, and the scores under grumpy and happy conditions are in the columns y.grumpy and y.happy respectively.
Analyse the effect of photo type (grumpy versus happy) on the score using both lm and lmer.
Hint
You will probably get a meaningless error when you first do this. It is caused by there being missing values in the response variable (i.e. not all respondents have scored all 44 photos) and/or predictors, so you will need to fix this.
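A minimal sketch of the two usual fixes; the data frame and column names here are made up for illustration:

```r
# toy data with a missing response
dat <- data.frame(y = c(1.2, NA, 0.7, 1.5),
                  g = factor(c("a", "a", "b", "b")))

# option 1: drop the incomplete rows yourself before fitting
dat_cc <- subset(dat, !is.na(y))
nrow(dat_cc)  # 3 complete rows remain

# option 2: let the fitting function drop them, e.g.
# lmer(y ~ 1 + (1 | g), data = dat, na.action = na.omit)
```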
Solution
The simple lm
model we’ve fitted previously:
m1<-lm(y~type, data=photo_long)
summary(m1)
##
## Call:
## lm(formula = y ~ type, data = photo_long)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1955 -0.9350 -0.1060 0.8071 2.6815
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.3759 0.2619 20.529 < 2e-16 ***
## typehappy -1.2283 0.3703 -3.317 0.00189 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.228 on 42 degrees of freedom
## Multiple R-squared: 0.2076, Adjusted R-squared: 0.1887
## F-statistic: 11 on 1 and 42 DF, p-value: 0.001886
For the random-effect model the obvious thing to do would be to fit person effects as random, since each person has been photographed twice:
m2<-lmer(y~type+(1|person), data=photo_long)
summary(m2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person)
## Data: photo_long
##
## REML criterion at convergence: 135.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.0521 -0.5446 -0.1781 0.5511 1.8192
##
## Random effects:
## Groups Name Variance Std.Dev.
## person (Intercept) 0.8005 0.8947
## Residual 0.7082 0.8416
## Number of obs: 44, groups: person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.3759 0.2619 20.529
## typehappy -1.2283 0.2537 -4.841
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.484
Our best estimate is that roughly half the variance in grumpy score can be attributed to the person photographed.
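Using the variance estimates read off the summary above, this proportion is easy to compute by hand:

```r
# variance components from summary(m2)
v_person   <- 0.8005
v_residual <- 0.7082
v_person / (v_person + v_residual)  # ≈ 0.53: about half the variance
```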
Solution
Yes! The standard error of the difference goes down in the mixed model because each person is scored under both photo types, so the comparison between happy and grumpy photos is made within, rather than between, the random-effect levels.
Using a t-test (t.test) and the wide data, try to get the same answer as in the linear mixed model (you will need to read the t.test help file).
Solution
We can pass the two sets of scores to the t.test
function. If we rely on the default arguments we get
t.test(photo_wide$y.happy, photo_wide$y.grumpy)
##
## Welch Two Sample t-test
##
## data: photo_wide$y.happy and photo_wide$y.grumpy
## t = -3.3168, df = 33.555, p-value = 0.002197
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.9813329 -0.4753408
## sample estimates:
## mean of x mean of y
## 4.147528 5.375865
and the t-value and p-value are different from what we got from the mixed model: the t-value is smaller, and consequently the p-value is larger. However, the t.test function defaults to an unpaired t-test, which does not take into account that the scores can be grouped into pairs based on who was photographed. By specifying paired=TRUE we can fit a paired t-test:
t.test(photo_wide$y.happy, photo_wide$y.grumpy, paired=TRUE)
##
## Paired t-test
##
## data: photo_wide$y.happy and photo_wide$y.grumpy
## t = -4.8409, df = 21, p-value = 8.735e-05
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -1.7560238 -0.7006499
## sample estimates:
## mean difference
## -1.228337
and we can see that the t-value and p-value are identical to what we got using the mixed model. This is not a fluke. However, the mixed model is more flexible: we can add covariates, and we can deal with the fact that we may have more than two photos for each person, or indeed that people may have been photographed a variable number of times (including once).
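The paired t-test is just a one-sample t-test on the within-person differences, which is why it matches the mixed model here. A small sketch with simulated scores (all numbers made up) illustrates this:

```r
set.seed(1)
n <- 22
grumpy <- rnorm(n, mean = 5)                 # toy grumpy-photo scores
happy  <- grumpy - 1.2 + rnorm(n, sd = 0.5)  # toy happy-photo scores

d <- happy - grumpy
t_manual <- mean(d) / (sd(d) / sqrt(n))      # t-statistic on the differences
t_paired <- unname(t.test(happy, grumpy, paired = TRUE)$statistic)
all.equal(t_paired, t_manual)                # TRUE: the two are the same test
```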
The file photo_long_full.csv is 5368 lines long. y is the score that each respondent gave to each photo. In addition to some information about the person photographed (age and fpub) there is also some information about the respondent (whether they are in IEB and whether they are a student). Using lmer, fit a model with random effects for both the person photographed and the respondent.
Solution
m3<-lmer(y~type+(1|person)+(1|respondent), data=photo_long_full, na.action=na.omit)
summary(m3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person) + (1 | respondent)
## Data: photo_long_full
##
## REML criterion at convergence: 21779.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8599 -0.6112 -0.0428 0.5783 4.4956
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent (Intercept) 0.5287 0.7271
## person (Intercept) 1.1416 1.0684
## Residual 3.2102 1.7917
## Number of obs: 5350, groups: respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.37354 0.23963 22.42
## typehappy -1.22946 0.04899 -25.09
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.102
The total variation (obtained by summing our 3 variance estimates) is approximately 5. Consequently, our best estimate is that around 10% of the variance in scores can be attributed to respondent, and around 20% to the person being photographed. We can interpret a non-zero respondent variance as meaning that people vary in how grumpy they perceive people: some respondents see the good in people, others the bad. We can interpret a non-zero person variance as meaning that people vary in how grumpy they look: some people just look grumpier than others irrespective of the conditions.
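These proportions can be computed directly from the variance estimates in the summary above:

```r
# variance components from summary(m3)
v_respondent <- 0.5287
v_person     <- 1.1416
v_residual   <- 3.2102

v_total <- v_respondent + v_person + v_residual
v_total                                                  # ≈ 4.88, "approximately 5"
c(respondent = v_respondent, person = v_person) / v_total
# respondent ≈ 0.11, person ≈ 0.23
```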
If we look at the confidence intervals for these variances (the standard deviations actually):
confint(m3)
## Computing profile confidence intervals ...
## 2.5 % 97.5 %
## .sig01 0.6316426 0.8422168
## .sig02 0.7954020 1.4551122
## .sigma 1.7576751 1.8265096
## (Intercept) 4.8954601 5.8516259
## typehappy -1.3254962 -1.1334310
we can see that they're quite well estimated, and there's good evidence that both random-effect variances are non-zero.
Although confint is useful for putting confidence intervals on parameters, it can't be used to put confidence intervals on functions of parameters. The proportion of the variation explained by one particular term (e.g. person) is a function of the three variance parameters. Try putting confidence intervals on the proportions of variance explained by person and respondent using the parametric bootstrap.
Hint
Rather than rooting around in the lmer
summary for the
estimates of the variance components, on slide 87 of the Mixed Models I
pdf (with notes) I have written a function for extracting them from a
lmer
model fit:
coefv<-function(x, var=TRUE){
if(!class(x)%in%c("lmerMod", "glmerMod")){
stop("x should be a lmer or glmer model")
}
vrandom<-unlist(lapply(VarCorr(x), function(x){attr(x, "stddev")}))
# get estimates of random effect standard deviations.
verror<-attr(VarCorr(x), "sc")
names(verror)<-"Residual"
# get estimate of residual standard deviations, and name it as such.
v<-c(vrandom, verror)
if(var){
v<-v^2
# square the standard deviations to obtain the variance estimates.
}
return(v)
}
Solution
We can start by simulating replicate data from the model
sim_y<-simulate(m3,1000)
However, you will notice that values have been simulated for those
observations which were missing. This is not ideal because the
parametric bootstrap will then be anti-conservative: it thinks we
have more information than we do. Rather than using
na.action=na.omit
, let's refit the model to the subset of the
data where the scores are not missing.
m3b<-lmer(y~type+(1|person)+(1|respondent), data=subset(photo_long_full, !is.na(y)))
As you can see, it gives identical answers:
summary(m3b)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person) + (1 | respondent)
## Data: subset(photo_long_full, !is.na(y))
##
## REML criterion at convergence: 21779.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.8599 -0.6112 -0.0428 0.5783 4.4956
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent (Intercept) 0.5287 0.7271
## person (Intercept) 1.1416 1.0684
## Residual 3.2102 1.7917
## Number of obs: 5350, groups: respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.37354 0.23963 22.42
## typehappy -1.22946 0.04899 -25.09
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.102
and we can resimulate our data
sim_y<-simulate(m3b,1000)
Here I’ve written a function to extract the estimates of the standard
deviations (or variances if var=TRUE
) from a lmer
model:
coefv<-function(x, var=TRUE){
if(!class(x)%in%c("lmerMod", "glmerMod")){
stop("x should be a lmer or glmer model")
}
vrandom<-unlist(lapply(VarCorr(x), function(x){attr(x, "stddev")}))
# get estimates of random effect standard deviations.
verror<-attr(VarCorr(x), "sc")
names(verror)<-"Residual"
# get estimate of residual standard deviations, and name it as such.
v<-c(vrandom, verror)
if(var){
v<-v^2
# square the standard deviations to obtain the variance estimates.
}
return(v)
}
We can then fit our model to each replicate data set and extract the variance estimates.
vest<-apply(sim_y,2, function(x){coefv(refit(m3b,x))})
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00475863 (tol = 0.002, component 1)
We can calculate the proportion of variance explained by each term for each replicate data set:
prop<-apply(vest, 2, function(x){x/sum(x)})
The confidence intervals on the proportions of variance explained are then
apply(prop, 1, quantile, prob=c(0.025, 0.975))
## respondent.(Intercept) person.(Intercept) Residual
## 2.5% 0.0796774 0.1278855 0.5652675
## 97.5% 0.1447223 0.3467477 0.7530505
Solution
You probably have a gut feeling that you want to fit an interaction
between person
and respondent
. However, you
may be unsure how to do this in the context of a mixed model, and unsure
what additional parameters would be estimated and how you would
interpret them. The most straightforward approach is to fit the
model:
m4<-lmer(y~type+(1|person)+(1|respondent)+(1|respondent:person), data=photo_long_full, na.action=na.omit)
where we fit the interaction using the standard :
notation. Another way to think of this model is that it is like
pasting the respondent
and person
columns
together to get a new factor and simply fitting its effects as
random. For example, we could create the variable
respondent_and_person
:
photo_long_full$respondent_and_person<-paste(photo_long_full$respondent, photo_long_full$person)
which would then have levels of the form 1 peter_k
,
2 peter_k
and 1 alex_t
indicating the score is
from Respondent 1 scoring Peter Keightley, Respondent 2 scoring Peter
Keightley and Respondent 1 scoring Alex Twyford. If we then fitted
(1|respondent_and_person)
in the model instead of
(1|respondent:person)
we would get exactly the same answer.
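A quick toy check (the factor values here are made up) that the pasted factor and the : interaction define the same grouping of observations:

```r
respondent <- factor(c(1, 1, 2, 2))
person     <- factor(c("peter_k", "alex_t", "peter_k", "alex_t"))

pasted <- paste(respondent, person)
table(pasted)  # each respondent-person combination is its own group

# the pasted factor has the same number of groups as the : interaction
length(unique(pasted)) == nlevels(interaction(respondent, person, drop = TRUE))  # TRUE
```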
If we look at the results:
summary(m4)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + (1 | person) + (1 | respondent) + (1 | respondent:person)
## Data: photo_long_full
##
## REML criterion at convergence: 21679.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6110 -0.5815 -0.0304 0.5525 3.8463
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent:person (Intercept) 0.6241 0.7900
## respondent (Intercept) 0.5152 0.7178
## person (Intercept) 1.1383 1.0669
## Residual 2.6029 1.6134
## Number of obs: 5350, groups:
## respondent:person, 2682; respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.37331 0.23911 22.47
## typehappy -1.22958 0.04413 -27.86
##
## Correlation of Fixed Effects:
## (Intr)
## typehappy -0.092
we can see that about 13% of the variation is due to the interaction, and the effect is highly significant:
anova(m4,m3)
## refitting model(s) with ML (instead of REML)
## Data: photo_long_full
## Models:
## m3: y ~ type + (1 | person) + (1 | respondent)
## m4: y ~ type + (1 | person) + (1 | respondent) + (1 | respondent:person)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m3 5 21784 21817 -10887 21774
## m4 6 21686 21726 -10837 21674 99.568 1 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
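As a quick check, the interaction's share of the variance can be computed from the variance estimates in summary(m4):

```r
# variance components from summary(m4)
v_int        <- 0.6241  # respondent:person interaction
v_respondent <- 0.5152
v_person     <- 1.1383
v_residual   <- 2.6029

v_int / (v_int + v_respondent + v_person + v_residual)  # ≈ 0.13
```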
We can interpret this interaction as saying that if we control for how grumpy someone looks (as measured through the average grumpy score for that person over respondents), and we control for how unsympathetic a respondent is (as measured through the average score they give over all photos), then we still see that pairs of scores for the same person by the same respondent deviate in the same direction.
For example, Respondent 1 might be a positive person and so on average give a score 1 unit lower than the average. Similarly, Mark Woolhouse might look grumpy and so on average get a grumpy score 2 units greater than the average. Consequently, we expect Respondent 1 to give Mark Woolhouse a score 1 unit greater than the overall average. However, if we looked at the pair of scores that Respondent 1 gave Mark Woolhouse we might see that both deviated by more than 1 unit. This might indicate that for some reason Respondent 1 thinks Mark Woolhouse is grumpier than we would expect based on their personality and Mark Woolhouse's face. Obviously, with only 2 scores the evidence for this is pretty weak. However, we have \(122\times22=2684\) pairs of observations, and this gives us plenty of information to test whether this type of interaction exists, even if we can't get precise information for any one interaction.
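The arithmetic in the Respondent 1 / Mark Woolhouse example can be sketched as follows (all deviations are hypothetical, and the mean is roughly the intercept from summary(m4)):

```r
overall_mean   <- 5.37  # roughly the intercept in summary(m4)
respondent_dev <- -1    # Respondent 1 is generous: 1 unit below average
person_dev     <- 2     # Mark Woolhouse looks grumpy: 2 units above average

# expected score without any respondent-by-person interaction
overall_mean + respondent_dev + person_dev  # 6.37, i.e. 1 unit above the mean
```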
Hint
You will not be able to answer this question without understanding the material in the final lecture. However, it is still useful to think about how there can be information in the data to address this question, and to try to formulate what type of effects you would like in the model even if you can't yet fit it.
Solution
We can fit the IEB effect as fixed:
m5<-lmer(y~type+IEB+(1|person)+(1|respondent)+(1|respondent:person), data=photo_long_full, na.action=na.omit)
summary(m5)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## y ~ type + IEB + (1 | person) + (1 | respondent) + (1 | respondent:person)
## Data: photo_long_full
##
## REML criterion at convergence: 21674.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.5983 -0.5796 -0.0334 0.5531 3.8571
##
## Random effects:
## Groups Name Variance Std.Dev.
## respondent:person (Intercept) 0.6241 0.7900
## respondent (Intercept) 0.4842 0.6958
## person (Intercept) 1.1383 1.0669
## Residual 2.6029 1.6134
## Number of obs: 5350, groups:
## respondent:person, 2682; respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.62409 0.25538 22.022
## typehappy -1.22958 0.04413 -27.865
## IEBYES -0.39230 0.14258 -2.751
##
## Correlation of Fixed Effects:
## (Intr) typhpp
## typehappy -0.086
## IEBYES -0.357 0.000
It seems IEB respondents score people as less grumpy than non-IEB respondents, and the size of the estimated difference is probably not due to chance (the magnitude of the t-value exceeds 2). However, is it because IEB respondents are nicer people and so would have given lower scores irrespective of who was photographed, or is it because they know the people being photographed, and people tend to give people they know less grumpy scores irrespective of their photo? It is impossible to know from this model.
However, if people modulated their score based on the personality of the person, I might expect the person effects to be more variable among IEB respondents than among non-IEB respondents, because the IEB respondents are influenced by their knowledge of the person's personality. For example, if one of the people photographed is really nasty, I might expect them to have a higher score among IEB respondents than among non-IEB respondents. Likewise, if one of the people photographed is a really wonderful person, I might expect them to have a lower score among IEB respondents than among non-IEB respondents. Consequently, the person scores would take on more extreme values (would have higher variance) among IEB respondents.
The model below allows the person effects to have different variances among IEB and non-IEB respondents, and also estimates the correlation between the two sets of person effects: to what degree are the people with high grumpy scores as assessed by IEB respondents also the people with high grumpy scores as assessed by non-IEB respondents?
m6<-lmer(y~type+IEB+(IEB-1|person)+(1|respondent)+(1|respondent:person), data=photo_long_full, na.action=na.omit)
summary(m6)
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ type + IEB + (IEB - 1 | person) + (1 | respondent) + (1 |
## respondent:person)
## Data: photo_long_full
##
## REML criterion at convergence: 21665.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6651 -0.5788 -0.0384 0.5520 3.8520
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## respondent:person (Intercept) 0.6047 0.7777
## respondent (Intercept) 0.4849 0.6964
## person IEBNO 1.2405 1.1138
## IEBYES 1.1111 1.0541 0.97
## Residual 2.6028 1.6133
## Number of obs: 5350, groups:
## respondent:person, 2682; respondent, 122; person, 22
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 5.62406 0.26432 21.278
## typehappy -1.22953 0.04413 -27.864
## IEBYES -0.39222 0.15536 -2.525
##
## Correlation of Fixed Effects:
## (Intr) typhpp
## typehappy -0.083
## IEBYES -0.434 0.000
The different groups have similar variances and the correlation is close to one. Perhaps IEB respondents are just nicer! However, a hypothesis test suggests that the two sets of person effects do have different distributions:
anova(m6,m5)
## refitting model(s) with ML (instead of REML)
## Data: photo_long_full
## Models:
## m5: y ~ type + IEB + (1 | person) + (1 | respondent) + (1 | respondent:person)
## m6: y ~ type + IEB + (IEB - 1 | person) + (1 | respondent) + (1 | respondent:person)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m5 7 21681 21727 -10834 21667
## m6 9 21676 21735 -10829 21658 9.0975 2 0.01058 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
so perhaps there is some evidence that people score photos differently depending on whether they know the person. Worth exploring, but the magnitude of the differences is pretty small.