Model Diagnostics

PSYC 575

Mark Lai

University of Southern California

2021/09/25 (updated: 2022-10-02)

1 / 18

Week Learning Objectives

  • Describe the major assumptions in basic multilevel models

  • Conduct analyses to decide whether cluster means and random slopes should be included

  • Use graphical tools to diagnose assumptions of linearity, homoscedasticity (equal variance), and normality

  • Solve some basic convergence issues

  • Report results of a multilevel analysis based on established guidelines

2 / 18

Multilevel "Model" . . .

What is a model?

It is a set of assumptions about how the data are generated

3 / 18

Two Components of a Parametric Model

Functional Form

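$$E(Y_{ij} \mid X, W) = \gamma_{00} + \gamma_{10} X_{1ij} + \ldots + \gamma_{01} W_{1j} + \ldots$$

Versus:

$$E(Y_{ij} \mid X, W) = \exp(\gamma_{00} + \gamma_{10} X_{1ij} + \ldots + \gamma_{01} W_{1j} + \ldots)$$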

4 / 18

Two Components of a Parametric Model

Random Component

I.e., distribution of random effects/errors

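$$\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_0^2 & \tau_{01} \\ \tau_{01} & \tau_1^2 \end{bmatrix}\right)$$

$$e_{ij} \sim N(0, \sigma)$$

Versus $e_{ij} \sim t_3(0, \sigma)$

Or $e_{ij} \sim N(0, \sigma_j)$, where different clusters $j$ have a different SD $\sigma_j$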
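
Since a model is a set of assumptions about how the data are generated, one way to make the two components concrete is to simulate from them. The sketch below is illustrative only: it assumes a single level-1 predictor, a linear functional form, bivariate normal random effects, and normal level-1 errors. The function name and every parameter value are hypothetical, not taken from the course materials.

```python
import math
import random


def simulate_two_level(n_clusters=50, n_per_cluster=20,
                       gamma00=1.0, gamma10=0.5,
                       tau0=0.8, tau1=0.3, rho01=0.2,
                       sigma=1.0, seed=575):
    """Simulate Y_ij = (gamma00 + u0j) + (gamma10 + u1j) * X_ij + e_ij.

    tau0 and tau1 are the SDs of the random intercepts and slopes,
    rho01 is their correlation (so tau01 = rho01 * tau0 * tau1),
    and sigma is the SD of the level-1 errors. All values are made up.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        # Random component, level 2: correlated (u0j, u1j) built from
        # two independent standard normals (a 2 x 2 Cholesky factor).
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        u0 = tau0 * z0
        u1 = tau1 * (rho01 * z0 + math.sqrt(1 - rho01 ** 2) * z1)
        for _ in range(n_per_cluster):
            x = rng.gauss(0, 1)
            # Random component, level 1: e_ij ~ N(0, sigma).
            e = rng.gauss(0, sigma)
            # Functional form: linear in x.
            y = (gamma00 + u0) + (gamma10 + u1) * x + e
            rows.append((j, x, y))
    return rows


rows = simulate_two_level()
```

Swapping the linear line for an `exp(...)` of the same expression, or drawing `e` from a heavier-tailed distribution, would give a different model for the same variables.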

5 / 18

Assumptions of Basic MLM

6 / 18

Five Assumptions in Normal Linear Models

Linearity

Independence of errors (at the highest level)

Normality

Equal variance of errors (i.e., homoscedasticity)

Correct Specification of the model

Importance: Specification (S), Linearity (L), Independence (I) > Equal variance (E), Normality (N)

7 / 18

Assumptions Are Important

Your result is only as good as the assumptions

  • Garbage in, garbage out

8 / 18

Correct Specification

Fixed effects

  • Cluster means should be included (unless between coefficient = within coefficient)

    • Otherwise, between and within coefficients are conflated
  • Relevant predictors should be included to answer the target research question

    • E.g., Gender gap vs. gender gap adjusting for profession
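
Including cluster means amounts to splitting a level-1 predictor into a between-cluster part (the cluster mean) and a within-cluster part (the deviation from that mean), so that each gets its own coefficient. A minimal sketch in plain Python, with a hypothetical function name and made-up data:

```python
from collections import defaultdict
from statistics import mean


def split_between_within(cluster_ids, x):
    """Return (cluster means, within-cluster deviations), one value per row."""
    by_cluster = defaultdict(list)
    for cid, xi in zip(cluster_ids, x):
        by_cluster[cid].append(xi)
    means = {cid: mean(vals) for cid, vals in by_cluster.items()}
    x_cm = [means[cid] for cid in cluster_ids]                    # between part
    x_cmc = [xi - means[cid] for cid, xi in zip(cluster_ids, x)]  # within part
    return x_cm, x_cmc


# Made-up example: two schools, three students each.
school = ["a", "a", "a", "b", "b", "b"]
ses = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
ses_cm, ses_cmc = split_between_within(school, ses)
# ses_cm  -> [2.0, 2.0, 2.0, 6.0, 6.0, 6.0]
# ses_cmc -> [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]
```

Entering both columns as predictors lets the between and within coefficients differ instead of forcing them into a single conflated estimate.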

Random effects

  • If random slope variance is not zero, omitting it leads to inflated Type I error rates for fixed effects

    • Varying slopes could also be important information from the data
9 / 18

Linearity

Lack of linear association ≠ lack of association
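
A small made-up example shows why a zero linear correlation does not rule out an association: below, y is completely determined by x, yet the Pearson correlation between them is zero. The helper `pearson_r` is written out here only to keep the sketch self-contained.

```python
import math


def pearson_r(x, y):
    """Pearson correlation, which measures linear association only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]          # y is an exact (quadratic) function of x
r_linear = pearson_r(x, y)         # 0.0: no linear association at all
x_squared = [xi ** 2 for xi in x]
r_quadratic = pearson_r(x_squared, y)  # perfect association once the form is right
```

This is why a plot of the outcome (or residuals) against each predictor is worth checking before settling on a linear functional form.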

10 / 18

Independence of Errors

We use MLM because students within the same school are more similar (i.e., not independent)

If schools are from different school districts, they may also not be independent

  • Need a three-level model

Or, student A in school 1 is from the same neighborhood as student B in school 2

  • Cross-classified model

Temporal dependence

  • E.g., Repeated measures closer in time are more similar

    • Autoregressive model
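
One common way to encode "closer in time means more similar" is a first-order autoregressive, AR(1), structure, in which the correlation between errors at occasions t and s is rho raised to the power |t - s|. The sketch below assumes equally spaced occasions; AR(1) is only one possible autoregressive structure, and the value of rho is made up.

```python
def ar1_correlation(n_times, rho):
    """Correlation matrix with entry rho ** |t - s| for occasions t and s."""
    return [[rho ** abs(t - s) for s in range(n_times)]
            for t in range(n_times)]


R = ar1_correlation(4, 0.5)
# First row: [1.0, 0.5, 0.25, 0.125]; the correlation decays with the time lag.
```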
11 / 18

Equal Variance of Errors (Homoscedasticity)

Residual plots

12 / 18

Normality

Quantile-quantile (QQ) plot

  • Whether the 1st, 5th, 10th, ... percentiles of the residuals correspond to the 1st, 5th, 10th, ... percentiles of a normal distribution

Need to check both level 1 ($e$) and level 2 ($u_0$ and $u_1$)
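
The points on a QQ plot can be computed by hand as a check on what the plot shows: sort the standardized residuals and pair each with the matching quantile of a standard normal distribution. A minimal sketch using only the Python standard library; the function name, the plotting positions (k - 0.5) / n, and the residual values are illustrative assumptions.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pair standard normal quantiles with sorted standardized residuals."""
    n = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    observed = sorted((r - m) / s for r in residuals)
    # Plotting positions (k - 0.5) / n stay strictly between 0 and 1,
    # where the normal quantile function is defined.
    theoretical = [NormalDist().inv_cdf((k - 0.5) / n)
                   for k in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Made-up residuals; in practice these would be the level-1 residuals
# or the estimated random effects from a fitted model.
pts = qq_points([0.3, -1.2, 0.8, 2.1, -0.4])
```

If the normality assumption is reasonable, the points fall close to the line y = x; systematic curvature at the ends suggests skewness or heavy tails.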

13 / 18

Examples of data for which a normal model is not good

  • Binary/ordinal outcome with < 5 categories (including the homework)

  • Count data (e.g., # of binge drinking episodes; # of successes in 5 trials)

  • Bounded data with ceiling/floor effects (e.g., depressive symptoms)

  • Reaction time

14 / 18

Additional Issues

  • Outliers/influential observations
    • Check for coding errors
    • Don't drop outliers unless you adjust the standard errors accordingly, or use robust models
  • Reliability (e.g., α coefficient)
15 / 18

Dealing With Convergence Issues

See the R code

16 / 18

Reporting Results

17 / 18

References

  • Chapter by McCoach (2019); Paper by Meteyard & Davies (2020)

Things to report:

  • Sample sizes
  • Model equations
  • Decisions and justifications for including or not including cluster means, centering, and random slopes
  • Estimation methods, software program/package, and version number
  • Intraclass correlation
  • Convergence issues and handling
  • Assumptions
  • Tables of fixed and random effect coefficients
  • Effect size
  • Model comparison criteria and indices
  • Software code
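
For the intraclass correlation item, the value for a two-level random-intercept model is the share of total variance that lies between clusters, τ0² / (τ0² + σ²), with σ treated as the level-1 SD. A minimal sketch with made-up variance components:

```python
def intraclass_correlation(tau0_sq, sigma_sq):
    """ICC = between-cluster variance / (between + within variance)."""
    return tau0_sq / (tau0_sq + sigma_sq)


# Hypothetical estimates: intercept variance 2.0, level-1 error variance 6.0.
icc = intraclass_correlation(tau0_sq=2.0, sigma_sq=6.0)  # 0.25
```

An ICC of 0.25 here would mean that a quarter of the outcome variance is attributable to differences between clusters.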
18 / 18
