Week 14: Prediction

Prediction in MLM

Week Learning Objectives

By the end of this module, you will be able to:

  • Describe the role of prediction in data analysis
  • Describe the problem of overfitting when fitting complex models
  • Use information criteria to compare models

Task List

  1. Review the resources (lecture videos and slides)
  2. Attend the Tuesday session and participate in the bonus class exercise
  3. Attend the Thursday session for class presentations

Lecture

Slides

PDF version

Think more

Think about a prominent theory in your area of research. What predictions does it make? Does it give precise predictions?

Check your learning
In a multilevel model with students nested within schools and with student math achievement as the outcome variable, what is a cluster-specific prediction?



Prediction Error

Some information has been updated since the video was recorded. Check out the updated slides.

Check your learning
Which of the following growth curve models would show the largest degree of overfitting, given a sample of 15 participants measured at 5 time points?



Check your learning
Why shouldn’t we just choose the model with the lowest in-sample prediction error?
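
To explore this question, here is a minimal simulation sketch. It is not taken from the lecture: it uses ordinary polynomial regression rather than a multilevel model, and the data-generating process, the seed, and the helper names `simulate` and `mse` are all assumptions made for illustration. It fits polynomials of increasing degree to a small sample and computes the mean squared error (MSE) both on the fitting data and on fresh data from the same process.

```python
import numpy as np

rng = np.random.default_rng(14)  # arbitrary seed, for reproducibility only


def simulate(n):
    """Draw n points from a hypothetical linear trend plus Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n)
    return x, y


def mse(y, y_hat):
    """Mean squared prediction error."""
    return float(np.mean((y - y_hat) ** 2))


x_train, y_train = simulate(15)   # small sample used to fit the models
x_new, y_new = simulate(1000)     # fresh data from the same process

errors = {}
for degree in (1, 2, 4, 8):
    # Least-squares polynomial fit of the given degree
    coefs = np.polyfit(x_train, y_train, deg=degree)
    in_sample = mse(y_train, np.polyval(coefs, x_train))
    out_of_sample = mse(y_new, np.polyval(coefs, x_new))
    errors[degree] = (in_sample, out_of_sample)
    print(f"degree {degree}: in-sample MSE = {in_sample:.2f}, "
          f"out-of-sample MSE = {out_of_sample:.2f}")
```

Because each polynomial contains the lower-degree ones as special cases, the in-sample MSE cannot increase as the degree goes up. The out-of-sample MSE has no such guarantee and typically gets worse once the model starts fitting noise.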



Cross Validation

Check your learning
Why does cross-validation, compared to in-sample MSE, give a better estimate of the out-of-sample prediction error?
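
A rough sketch of K-fold cross-validation may help when thinking about this question. The code below is an illustration under simplifying assumptions, not the procedure used in the lecture: it treats observations as independent, uses polynomial regression, and the helper names `in_sample_mse` and `kfold_mse` are hypothetical. With clustered data, one might instead hold out whole clusters.

```python
import numpy as np


def in_sample_mse(x, y, degree):
    """MSE when the same data are used for fitting and for evaluation."""
    coefs = np.polyfit(x, y, deg=degree)
    return float(np.mean((y - np.polyval(coefs, x)) ** 2))


def kfold_mse(x, y, degree, k=5, seed=0):
    """K-fold cross-validated MSE: each point is predicted by a model
    that was fit without it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)
    squared_errors = []
    for i, held_out in enumerate(folds):
        # Fit on every fold except the i-th one
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        coefs = np.polyfit(x[train], y[train], deg=degree)
        pred = np.polyval(coefs, x[held_out])
        squared_errors.append((y[held_out] - pred) ** 2)
    return float(np.mean(np.concatenate(squared_errors)))


# Hypothetical data: a linear trend plus noise
rng = np.random.default_rng(2024)
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=60)

cv_results = {}
for degree in (1, 6):
    cv_results[degree] = (in_sample_mse(x, y, degree), kfold_mse(x, y, degree))
    print(f"degree {degree}: in-sample MSE = {cv_results[degree][0]:.2f}, "
          f"5-fold CV MSE = {cv_results[degree][1]:.2f}")
```

The key design choice is that the held-out fold never contributes to the coefficients used to predict it, so the model cannot be rewarded for fitting the noise in those observations.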



Information Criterion

Check your learning

Which two of the following models are nested?

M1: mathach ~ meanses + ses_cmc + sector + (1 | ID)

M2: mathach ~ sector + (1 | ID)

M3: mathach ~ meanses + sector + (ses_cmc | ID)




Check your learning

Which of the following models is the best based on AIC?

M1: mAIC = 1203, cAIC = 1037

M2: mAIC = 1202, cAIC = 1000

M3: mAIC = 1210, cAIC = 1055
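
For reference, the general form AIC = deviance + 2 × (number of parameters) can be sketched as follows. This is a single-level simplification under stated assumptions: it uses an ordinary Gaussian linear regression on simulated data, the function name `gaussian_aic` and the variables `ses` and `noise_predictor` are hypothetical, and it does not compute the marginal or conditional AIC of a multilevel model, which count parameters differently.

```python
import numpy as np


def gaussian_aic(X, y):
    """AIC for a Gaussian linear regression fit by maximum likelihood."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2_ml = np.mean(resid ** 2)                      # ML variance estimate
    deviance = n * (np.log(2 * np.pi * sigma2_ml) + 1)   # -2 * max log-likelihood
    n_params = p + 1                                     # coefficients + variance
    return float(deviance + 2 * n_params)


# Hypothetical data: the outcome depends on `ses` but not on `noise_predictor`
rng = np.random.default_rng(7)
n = 200
ses = rng.normal(size=n)
noise_predictor = rng.normal(size=n)
y = 12.0 + 2.5 * ses + rng.normal(scale=6.0, size=n)

X_small = np.column_stack([np.ones(n), ses])
X_large = np.column_stack([np.ones(n), ses, noise_predictor])

aic_small = gaussian_aic(X_small, y)
aic_large = gaussian_aic(X_large, y)
print(f"AIC (smaller model) = {aic_small:.1f}")
print(f"AIC (larger model)  = {aic_large:.1f}")
```

Adding a predictor can only lower the deviance, so the penalty of 2 per parameter is what keeps the more complex model from being favored automatically.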