voice pitch data

d = read.csv("https://caleb-huo.github.io/teaching/data/politeness/politeness_data.csv")
str(d)

## 'data.frame':    84 obs. of  5 variables:
##  $ subject  : Factor w/ 6 levels "F1","F2","F3",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ gender   : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
##  $ scenario : int  1 1 2 2 3 3 4 4 5 5 ...
##  $ attitude : Factor w/ 2 levels "inf","pol": 2 1 2 1 2 1 2 1 2 1 ...
##  $ frequency: num  213 204 285 260 204 ...

d$attitude <- as.factor(d$attitude)
d$scenario0 <- as.factor(d$scenario)

subject: (6 subjects)
gender: (F/M)
scenario: different scenario (7 categories)
attitude:
- inf: informal situations (e.g., explaining to a friend why you’re late)
- pol: politeness (e.g., giving an excuse for being late to a professor)
frequency: outcome variable, frequency of voice pitch

voice pitch data

head(d)

##   subject gender scenario attitude frequency scenario0
## 1      F1      F        1      pol     213.3         1
## 2      F1      F        1      inf     204.5         1
## 3      F1      F        2      pol     285.1         2
## 4      F1      F        2      inf     259.7         2
## 5      F1      F        3      pol     203.9         3
## 6      F1      F        3      inf     286.9         3

with(d, table(subject, gender))

##        gender
## subject  F  M
##      F1 14  0
##      F2 14  0
##      F3 14  0
##      M3  0 14
##      M4  0 14
##      M7  0 14

which(apply(d,1,function(x) any(is.na(x))))

## [1] 39

visualize the data

library(ggplot2)
ggplot(aes(attitude, frequency, color = subject), data=d) + facet_grid(~subject) + geom_boxplot() + geom_jitter() + theme_bw()

## Warning: Removed 1 rows containing non-finite values (stat_boxplot).

## Warning: Removed 1 rows containing missing values (geom_point).

Fit linear model

Outcome variable: frequency
primary predictor: attitude
important covariate varaible: gender
\(\varepsilon\): deviations from our predictions due to random factors that we cannot control experimentally

\[pitch \sim politeness + gender + \varepsilon\]

Linear model in R

lmFit <- lm(frequency ~ attitude + gender, data = d)
summary(lmFit)

## 
## Call:
## lm(formula = frequency ~ attitude + gender, data = d)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -82.409 -26.561  -4.262  24.690 100.140 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  256.762      6.756  38.006   <2e-16 ***
## attitudepol  -19.553      7.833  -2.496   0.0146 *  
## genderM     -108.349      7.833 -13.832   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 35.68 on 80 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.7109, Adjusted R-squared:  0.7037 
## F-statistic: 98.38 on 2 and 80 DF,  p-value: < 2.2e-16

Note: 1 observation deleted due to missingness

Problem with linear model

independence: since each subject gave multiple responses, multiple responses from the same subject cannot be regarded as independent from each other.
under the same scenario, people should also react similarly, not totally independent.

data independence

pol_subj = subset(d, attitude == "pol")
inf_subj = subset(d, attitude == "inf")
pol_subj$attitude <- NULL
inf_subj$attitude <- NULL
names(pol_subj)[names(pol_subj) == "frequency"] <- "frequency_pol"
names(inf_subj)[names(inf_subj) == "frequency"] <- "frequency_inf"

subj_merge <- merge(pol_subj, inf_subj)
head(subj_merge)

##   subject gender scenario scenario0 frequency_pol frequency_inf
## 1      F1      F        1         1         213.3         204.5
## 2      F1      F        2         2         285.1         259.7
## 3      F1      F        3         3         203.9         286.9
## 4      F1      F        4         4         250.8         276.8
## 5      F1      F        5         5         231.9         252.4
## 6      F1      F        6         6         181.2         230.7

Subject dependence

ggplot(data=subj_merge, aes(frequency_pol, frequency_inf)) + 
  geom_point(aes(col=subject)) + 
  geom_smooth(method="lm")

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

Subject dependence

ggplot(data=subj_merge, aes(frequency_pol, frequency_inf)) + 
  geom_point(aes(col=subject)) + 
  stat_ellipse(aes(col=subject)) +
  geom_smooth(method="lm")

## Warning: Removed 1 rows containing non-finite values (stat_ellipse).

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

scenario independence

ggplot(data=subj_merge, aes(frequency_pol, frequency_inf)) + 
  geom_point(aes(col=scenario0)) + 
  geom_smooth(method="lm")

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

scenario independence

ggplot(data=subj_merge, aes(frequency_pol, frequency_inf)) + 
  geom_point(aes(col=scenario0)) + 
  stat_ellipse(aes(col=scenario0)) +
  geom_smooth(method="lm")

## Warning: Removed 1 rows containing non-finite values (stat_ellipse).

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

Mixed model

Linear model is “fixed-effects only” model
- one or more fixed effects
- a general error term \(\varepsilon\)
- error term is considered no structure.
Mixed model is a mixture of fixed effect and random effect
- Fixed effects: the independent variables that are normally included in your analyses. For instance: gender, age, your study conditions
- Random effects: the variables that are specific to your data sample. (E.g Subject)

Random effect

add a random effect for subject
allows us to resolve this non-independence data structure for each subject
allows missing data

Random Intercepts

linear model \[pitch \sim politeness + gender + \varepsilon\]
linear model with random effect (random intercept) \[pitch \sim politeness + gender + (1|subject) + \varepsilon\] \((1|subject)\) is the R syntax for a random intercept.
intercepts are different for each subject
“1” stands for the intercept in R
this formula indicates multiple responses per subject
resolves the non-independence multiple responses by the same subject
the formula still contains a general error term \(\varepsilon\)

Fit linear mixed model in R

In R, lmer() function in lme4 package will do the job

lm(frequency ~ attitude, data = d)

## 
## Call:
## lm(formula = frequency ~ attitude, data = d)
## 
## Coefficients:
## (Intercept)  attitudepol  
##      202.59       -18.23

library(lme4)

## Loading required package: Matrix

library(lmerTest) ## this package helps provide a p-value

## 
## Attaching package: 'lmerTest'

## The following object is masked from 'package:lme4':
## 
##     lmer

## The following object is masked from 'package:stats':
## 
##     step

tryCatch({lmer(frequency ~ attitude, data = d)}, error=function(e) print(e))

## <simpleError: No random effects terms specified in formula>

You have to specify random effect

Fit linear mixed model in R

rs_subj_reml = lmer(frequency ~ attitude + (1 | subject), data = d)
summary(rs_subj_reml)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: frequency ~ attitude + (1 | subject)
##    Data: d
## 
## REML criterion at convergence: 804.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2953 -0.6018 -0.2005  0.4774  3.1772 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept) 3982     63.10   
##  Residual              851     29.17   
## Number of obs: 83, groups:  subject, 6
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  202.588     26.151   5.150   7.747 0.000501 ***
## attitudepol  -19.376      6.407  76.003  -3.024 0.003399 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## attitudepol -0.121

Further explore rs_subj_reml

coef(rs_subj_reml)

## $subject
##    (Intercept) attitudepol
## F1    241.1352   -19.37584
## F2    266.8920   -19.37584
## F3    259.5540   -19.37584
## M3    179.0262   -19.37584
## M4    155.6906   -19.37584
## M7    113.2306   -19.37584
## 
## attr(,"class")
## [1] "coef.mer"

AIC(rs_subj_reml)

## [1] 812.7023

How to write the linear mixed model mathmatically

\[Y_{ijk} = \mu + \beta_j + \alpha_i + \varepsilon_{ijk}\]

\(i\): subject index
\(j\): attitude index
\(k\): scenario index
\(\mu\): intercept
\(\beta_j\): attitude coefficient of \(j^{th}\) group
\(\alpha_i \sim N(0, \sigma^2_\alpha)\): random intercept for the \(i^{th}\) subject
\(\varepsilon_{ijk} \sim N(0, \sigma^2)\):

Fit linear mixed model in R (REML = FALSE)

rs_subj_ml = lmer(frequency ~ attitude + (1 | subject), REML = FALSE, data = d)
summary(rs_subj_ml)

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: frequency ~ attitude + (1 | subject)
##    Data: d
## 
##      AIC      BIC   logLik deviance df.resid 
##    826.5    836.2   -409.3    818.5       79 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3067 -0.6051 -0.2042  0.4793  3.1931 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept) 3308.7   57.52   
##  Residual              839.9   28.98   
## Number of obs: 83, groups:  subject, 6
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  202.588     23.905   6.214   8.475 0.000122 ***
## attitudepol  -19.372      6.365  77.003  -3.043 0.003199 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## attitudepol -0.131

REML represents restricted maximum likelihood

if REML=FALSE, maximum likelihood estimation will be used
its estimation does not base estimates on a maximum likelihood fit of all the information
but instead uses a likelihood function calculated from a transformed set of data
nuisance parameters have no effect.

Compare to linear model

summary(lm(frequency ~ attitude, data = d))

## 
## Call:
## lm(formula = frequency ~ attitude, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -103.488  -62.122    9.044   51.178  105.044 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   202.59      10.08  20.107   <2e-16 ***
## attitudepol   -18.23      14.34  -1.272    0.207    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 65.3 on 81 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.01958,    Adjusted R-squared:  0.007475 
## F-statistic: 1.618 on 1 and 81 DF,  p-value: 0.2071

Without adjusting for random effect, attitude won’t be a significant predictor.

Getting p-values

rs_subj_reml = lmer(frequency ~ attitude + (1 | subject), data = d) 
anova(rs_subj_reml, ddf = "Kenward-Roger")

## Type III Analysis of Variance Table with Kenward-Roger's method
##          Sum Sq Mean Sq NumDF  DenDF F value   Pr(>F)   
## attitude 7782.8  7782.8     1 76.002  9.1456 0.003399 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## likelihood ratio test: comparing full model and reduced model
## get the p-value for reduced variable
## have to use REML = FALSE
rs_subj_ml = lmer(frequency ~ attitude + (1 | subject), REML = FALSE, data = d)
rs_null_ml = lmer(frequency ~ 1 + (1 | subject), REML = FALSE, data = d)

anova(rs_null_ml, rs_subj_ml)

## Data: d
## Models:
## rs_null_ml: frequency ~ 1 + (1 | subject)
## rs_subj_ml: frequency ~ attitude + (1 | subject)
##            Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
## rs_null_ml  3 833.25 840.51 -413.62   827.25                            
## rs_subj_ml  4 826.51 836.18 -409.25   818.51 8.7444      1   0.003106 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

add another covariate gender and its interaction with attitude

summary(lmer(frequency ~ attitude * gender +  (1 | subject), data = d))

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: frequency ~ attitude * gender + (1 | subject)
##    Data: d
## 
## REML criterion at convergence: 778.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2334 -0.5105 -0.1551  0.4738  3.0875 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subject  (Intercept) 605.4    24.61   
##  Residual             844.1    29.05   
## Number of obs: 83, groups:  subject, 6
## 
## Fixed effects:
##                     Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)          260.686     15.557    4.754  16.757 2.07e-05 ***
## attitudepol          -27.400      8.966   75.009  -3.056  0.00311 ** 
## genderM             -116.195     22.000    4.754  -5.282  0.00376 ** 
## attitudepol:genderM   16.190     12.763   75.018   1.269  0.20852    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) atttdp gendrM
## attitudepol -0.288              
## genderM     -0.707  0.204       
## atttdpl:gnM  0.202 -0.703 -0.286

Investigate scenario

ggplot(aes(1, frequency, col=scenario0), data=d) + 
  geom_boxplot() + 
  geom_jitter() + 
  facet_grid(attitude ~ scenario0)  + 
  theme_bw()

## Warning: Removed 1 rows containing non-finite values (stat_boxplot).

## Warning: Removed 1 rows containing missing values (geom_point).

A larger mixed model

rs_intergen_subjscene_ml = lmer(frequency ~ attitude + gender + (1 | subject) + 
    (1 | scenario), REML = FALSE, data = d)
summary(rs_intergen_subjscene_ml)

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: frequency ~ attitude + gender + (1 | subject) + (1 | scenario)
##    Data: d
## 
##      AIC      BIC   logLik deviance df.resid 
##    807.1    821.6   -397.6    795.1       77 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2958 -0.6456 -0.0777  0.5449  3.5121 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  scenario (Intercept) 205.2    14.32   
##  subject  (Intercept) 417.1    20.42   
##  Residual             637.4    25.25   
## Number of obs: 83, groups:  scenario, 7; subject, 6
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  256.847     13.828    8.497  18.575 3.55e-08 ***
## attitudepol  -19.722      5.547   70.920  -3.555 0.000677 ***
## genderM     -108.517     17.573    5.928  -6.175 0.000868 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) atttdp
## attitudepol -0.201       
## genderM     -0.635  0.004

A larger mixed model

coef(rs_intergen_subjscene_ml)

## $scenario
##   (Intercept) attitudepol   genderM
## 1    243.4865   -19.72206 -108.5173
## 2    263.3589   -19.72206 -108.5173
## 3    268.1317   -19.72206 -108.5173
## 4    277.2536   -19.72206 -108.5173
## 5    254.9320   -19.72206 -108.5173
## 6    244.8021   -19.72206 -108.5173
## 7    245.9623   -19.72206 -108.5173
## 
## $subject
##    (Intercept) attitudepol   genderM
## F1    243.3681   -19.72206 -108.5173
## F2    266.9444   -19.72206 -108.5173
## F3    260.2277   -19.72206 -108.5173
## M3    284.3541   -19.72206 -108.5173
## M4    262.0576   -19.72206 -108.5173
## M7    224.1286   -19.72206 -108.5173
## 
## attr(,"class")
## [1] "coef.mer"

Random Slopes

ggplot(aes(attitude, frequency, color = subject), data=d) + 
  geom_point() + 
  geom_smooth(method="lm", aes(group=1)) + 
  facet_wrap(~subject ) + 
  theme_bw()

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

Random Slopes

ggplot(data=subj_merge, aes(frequency_pol, frequency_inf, col=scenario0)) + 
  geom_point() + 
  geom_smooth(method="lm", se = F)

## Warning: Removed 1 rows containing non-finite values (stat_smooth).

## Warning: Removed 1 rows containing missing values (geom_point).

Random Slopes in R

politeness.model.rs = lmer(frequency ~  attitude + gender + (1 + attitude | subject) + 
    (1 | scenario), REML = FALSE, data = d)

## boundary (singular) fit: see ?isSingular

summary(politeness.model.rs)

## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
##   method [lmerModLmerTest]
## Formula: frequency ~ attitude + gender + (1 + attitude | subject) + (1 |  
##     scenario)
##    Data: d
## 
##      AIC      BIC   logLik deviance df.resid 
##    811.1    830.4   -397.5    795.1       75 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.2800 -0.6555 -0.0803  0.5369  3.5016 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  scenario (Intercept) 205.052  14.320       
##  subject  (Intercept) 395.469  19.886       
##           attitudepol   1.251   1.118   1.00
##  Residual             637.069  25.240       
## Number of obs: 83, groups:  scenario, 7; subject, 6
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  257.802     13.683    7.975  18.842 6.75e-08 ***
## attitudepol  -19.715      5.565   58.506  -3.543 0.000785 ***
## genderM     -110.428     17.531    5.927  -6.299 0.000782 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) atttdp
## attitudepol -0.153       
## genderM     -0.641  0.003
## convergence code: 0
## boundary (singular) fit: see ?isSingular

Random Slopes in R

coef(politeness.model.rs)

## $scenario
##   (Intercept) attitudepol   genderM
## 1    244.4392   -19.71467 -110.4277
## 2    264.3110   -19.71467 -110.4277
## 3    269.0836   -19.71467 -110.4277
## 4    278.2053   -19.71467 -110.4277
## 5    255.8843   -19.71467 -110.4277
## 6    245.7753   -19.71467 -110.4277
## 7    246.9149   -19.71467 -110.4277
## 
## $subject
##    (Intercept) attitudepol   genderM
## F1    243.7394   -20.50555 -110.4277
## F2    266.7317   -19.21246 -110.4277
## F3    260.1367   -19.58336 -110.4277
## M3    285.4172   -18.16159 -110.4277
## M4    263.9094   -19.37119 -110.4277
## M7    226.8772   -21.45387 -110.4277
## 
## attr(,"class")
## [1] "coef.mer"

Is random slope necessary

rs_gen_subjscene_con_reml = lmer(frequency ~ attitude + gender + (1 + attitude | 
    subject) + (1 | scenario), REML = TRUE, data = d)

## boundary (singular) fit: see ?isSingular

rs_gen_subjscene_reml = lmer(frequency ~ attitude + gender + (1 | subject) + (1 | 
    scenario), REML = TRUE, data = d)

anova(rs_gen_subjscene_reml, rs_gen_subjscene_con_reml)

## refitting model(s) with ML (instead of REML)

## Data: d
## Models:
## rs_gen_subjscene_reml: frequency ~ attitude + gender + (1 | subject) + (1 | scenario)
## rs_gen_subjscene_con_reml: frequency ~ attitude + gender + (1 + attitude | subject) + (1 | 
## rs_gen_subjscene_con_reml:     scenario)
##                           Df    AIC    BIC  logLik deviance  Chisq Chi Df
## rs_gen_subjscene_reml      6 807.10 821.61 -397.55   795.10              
## rs_gen_subjscene_con_reml  8 811.07 830.42 -397.54   795.07 0.0275      2
##                           Pr(>Chisq)
## rs_gen_subjscene_reml               
## rs_gen_subjscene_con_reml     0.9863

it appears that we don’t need to include random slope for condition in the model
however, others would argue that we should keep our models maximal

Beyond linear mixed model

If the outcome variable is

Continuous: use a linear regression model with mixed effects
Binary: use a logistic regression model with mixed effects
Count: use a Poisson/negative binomial regression model with mixed effects

Function lmer is used to fit linear mixed models, function glmer is used to fit generalized (non-Gaussian) linear mixed models.

Another example sleep study

sleepStudy <- read.table("https://caleb-huo.github.io/teaching/data/sleep/sleepstudy.txt", header=T, row.names = 1)
sleepStudy$Subject <- as.factor(sleepStudy$Subject)

head(sleepStudy, n=10)

##    Reaction Days Subject
## 1  249.5600    0     308
## 2  258.7047    1     308
## 3  250.8006    2     308
## 4  321.4398    3     308
## 5  356.8519    4     308
## 6  414.6901    5     308
## 7  382.2038    6     308
## 8  290.1486    7     308
## 9  430.5853    8     308
## 10 466.3535    9     308

str(sleepStudy)

## 'data.frame':    180 obs. of  3 variables:
##  $ Reaction: num  250 259 251 321 357 ...
##  $ Days    : int  0 1 2 3 4 5 6 7 8 9 ...
##  $ Subject : Factor w/ 18 levels "308","309","310",..: 1 1 1 1 1 1 1 1 1 1 ...

Sleep deprivation data

This laboratory experiment measured the effect of sleep deprivation on cognitive performance.
There were 18 subjects, chosen from the population of interest (long-distance truck drivers), in the 10 day trial. These subjects were restricted to 3 hours sleep per night during the trial.
On each day of the trial each subject’s reaction time was measured. The reaction time shown here is the average of several measurements.
These data are balanced in that each subject is measured the same number of times and on the same occasions.

Simple longitudinal data

Repeated measures data consist of measurements of a response (and, perhaps, some covariates) on several experimental (or observational) units.
Longitudinal data are repeated measures data in which the observations are taken over time.
We wish to characterize the response over time within subjects and the variation in the time trends between subjects.

Data exploratory

ggplot(aes(Days, Reaction, col=Subject), data = sleepStudy) + geom_point() + geom_smooth(method="lm") + facet_wrap(~Subject)

Fit linear mixed model in R

fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
summary(fm1)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4633  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 611.90   24.737       
##           Days         35.08    5.923   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  251.405      6.824  17.005  36.843  < 2e-16 ***
## Days          10.467      1.546  16.995   6.771 3.27e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138

coef

coef(fm1)

## $Subject
##     (Intercept)       Days
## 308    253.6626 19.6665597
## 309    211.0108  1.8467699
## 310    212.4488  5.0177063
## 330    275.0940  5.6531411
## 331    273.6636  7.3976093
## 332    260.4439 10.1952325
## 333    268.2441 10.2438881
## 334    244.1731 11.5417935
## 335    251.0724 -0.2851939
## 337    286.2916 19.0963068
## 349    226.1971 11.6403856
## 350    238.3357 17.0815045
## 351    255.9828  7.4520035
## 352    272.2666 14.0036922
## 369    254.6802 11.3395736
## 370    225.7940 15.2895377
## 371    252.2122  9.4791130
## 372    263.7185 11.7515240
## 
## attr(,"class")
## [1] "coef.mer"

generate .R file

knitr::purl("mixedModel.rmd", output = "mixedModel.R ", documentation = 2)

## 
## 
## processing file: mixedModel.rmd

## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |.                                                                     |   2%
  |                                                                            
  |..                                                                    |   3%
  |                                                                            
  |....                                                                  |   5%
  |                                                                            
  |.....                                                                 |   7%
  |                                                                            
  |......                                                                |   9%
  |                                                                            
  |.......                                                               |  10%
  |                                                                            
  |........                                                              |  12%
  |                                                                            
  |..........                                                            |  14%
  |                                                                            
  |...........                                                           |  16%
  |                                                                            
  |............                                                          |  17%
  |                                                                            
  |.............                                                         |  19%
  |                                                                            
  |..............                                                        |  21%
  |                                                                            
  |................                                                      |  22%
  |                                                                            
  |.................                                                     |  24%
  |                                                                            
  |..................                                                    |  26%
  |                                                                            
  |...................                                                   |  28%
  |                                                                            
  |.....................                                                 |  29%
  |                                                                            
  |......................                                                |  31%
  |                                                                            
  |.......................                                               |  33%
  |                                                                            
  |........................                                              |  34%
  |                                                                            
  |.........................                                             |  36%
  |                                                                            
  |...........................                                           |  38%
  |                                                                            
  |............................                                          |  40%
  |                                                                            
  |.............................                                         |  41%
  |                                                                            
  |..............................                                        |  43%
  |                                                                            
  |...............................                                       |  45%
  |                                                                            
  |.................................                                     |  47%
  |                                                                            
  |..................................                                    |  48%
  |                                                                            
  |...................................                                   |  50%
  |                                                                            
  |....................................                                  |  52%
  |                                                                            
  |.....................................                                 |  53%
  |                                                                            
  |.......................................                               |  55%
  |                                                                            
  |........................................                              |  57%
  |                                                                            
  |.........................................                             |  59%
  |                                                                            
  |..........................................                            |  60%
  |                                                                            
  |...........................................                           |  62%
  |                                                                            
  |.............................................                         |  64%
  |                                                                            
  |..............................................                        |  66%
  |                                                                            
  |...............................................                       |  67%
  |                                                                            
  |................................................                      |  69%
  |                                                                            
  |.................................................                     |  71%
  |                                                                            
  |...................................................                   |  72%
  |                                                                            
  |....................................................                  |  74%
  |                                                                            
  |.....................................................                 |  76%
  |                                                                            
  |......................................................                |  78%
  |                                                                            
  |........................................................              |  79%
  |                                                                            
  |.........................................................             |  81%
  |                                                                            
  |..........................................................            |  83%
  |                                                                            
  |...........................................................           |  84%
  |                                                                            
  |............................................................          |  86%
  |                                                                            
  |..............................................................        |  88%
  |                                                                            
  |...............................................................       |  90%
  |                                                                            
  |................................................................      |  91%
  |                                                                            
  |.................................................................     |  93%
  |                                                                            
  |..................................................................    |  95%
  |                                                                            
  |....................................................................  |  97%
  |                                                                            
  |..................................................................... |  98%
  |                                                                            
  |......................................................................| 100%

## output file: mixedModel.R

## [1] "mixedModel.R "

Biostatistical Computing, PHC 6068

Linear Mixed Model

References:

From linear model to linear mixed model

voice pitch data

voice pitch data

visualize the data

Fit linear model

Linear model in R

Problem with linear model

data independence

Subject dependence

Subject dependence

scenario independence

scenario independence

Mixed model

Random effect

Random Intercepts

Fit linear mixed model in R

Fit linear mixed model in R

Further explore rs_subj_reml

How to write the linear mixed model mathmatically

Fit linear mixed model in R (REML = FALSE)

Compare to linear model

Getting p-values

add another covariate gender and its interaction with attitude

Investigate scenario

A larger mixed model

A larger mixed model

Random Slopes

Random Slopes

Random Slopes in R

Random Slopes in R

Is random slope necessary

Beyond linear mixed model

Another example sleep study

Sleep deprivation data

Simple longitudinal data

Data exploratory

Fit linear mixed model in R

coef

generate .R file