Review of Regression Analysis

PSYC 575

Mark Lai

University of Southern California

2020/08/04 (updated: 2022-08-27)

Statistical Model

A set of statistical assumptions describing how data are generated

  • Deterministic/fixed component

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots$$

  • Stochastic/random component

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + e_i$$

$$e_i \sim N(0, \sigma)$$

  • It's only a review, so I won't go deep.
  • You may check out the sections in the book by Gelman et al.
  • Model in OpenBoard
  • Statistical notation
    • Notation for normal distribution
    • Important for MLM
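
The two components of the statistical model can be made concrete by simulating data from it. The following is a minimal sketch, not the course's own code: the sample size, coefficient values, and error standard deviation are hypothetical, and it assumes the second argument of N(0, σ) is a standard deviation.

```r
# Simulate data from Y_i = b0 + b1 * X1_i + b2 * X2_i + e_i, e_i ~ N(0, sigma)
# All numeric values below are hypothetical, chosen only for illustration.
set.seed(575)
n <- 200
b0 <- 1
b1 <- 0.5
b2 <- -0.3
sigma_e <- 2  # assumed to be the standard deviation of e_i

x1 <- rnorm(n)
x2 <- rnorm(n)
mu <- b0 + b1 * x1 + b2 * x2           # deterministic/fixed component
e <- rnorm(n, mean = 0, sd = sigma_e)  # stochastic/random component
y <- mu + e

sim_data <- data.frame(y, x1, x2)
fit <- lm(y ~ x1 + x2, data = sim_data)
coef(fit)   # estimates should be near b0, b1, b2
sigma(fit)  # residual SD should be near sigma_e
```

Rerunning with a different seed gives different estimates, which is the sampling uncertainty that standard errors describe.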

Why Regression?

MLM is an extension of multiple regression to deal with data from multiple levels

Learning Objectives

Refresh your memory on regression

  • Describe the statistical model

  • Write out the model equations

  • Simulate data based on a regression model

  • Plot interactions

R Demonstration

Transition to RStudio

  • Data Import
  • Explain the variables

Salary Data

From Cohen, Cohen, West & Aiken (2003)

Examine factors related to annual salary of faculty in a university department

  • time = years after receiving degree
  • pub = # of publications
  • sex = gender (0 = male, 1 = female)
  • citation = # of citations
  • salary = annual salary

Data Exploration

  • How does the distribution of salary look?

  • Are there more males or females in the data?

  • How would you describe the relationship between number of publications and salary?

Explain what the x-axis, y-axis, and diagonals are

Citation vs salary as an example

Simple Linear Regression

Sample regression line

Confidence intervals

Centering

  • Regression line is only a sample estimate; there is uncertainty
  • Uncertainty measured by standard errors and confidence intervals
    • Show animations on the varying regression slopes
    • A function of sample size
  • Centering: Draw a picture on changing the x-axis
  • Interpretation: a one-unit increase in x is associated with a β-unit increase in y
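
The effect of centering can be checked numerically. This is a small sketch on simulated data (the variables and values are hypothetical, not the salary data): centering the predictor shifts the intercept to the predicted outcome at the mean of x and leaves the slope unchanged.

```r
# Hypothetical data, for illustration only
set.seed(1)
n <- 100
x <- rnorm(n, mean = 10, sd = 3)
y <- 5 + 2 * x + rnorm(n, sd = 4)
x_c <- x - mean(x)  # mean-centered predictor

fit_raw <- lm(y ~ x)
fit_c <- lm(y ~ x_c)

coef(fit_raw)  # intercept = predicted y at x = 0
coef(fit_c)    # intercept = predicted y at x = mean(x); same slope
confint(fit_c) # confidence intervals for the intercept and slope
```

In simple regression the centered intercept equals the sample mean of y, which is usually easier to interpret than a prediction at x = 0.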

Categorical Predictors

Dummy Coding

With $k$ categories, one needs $k - 1$ dummy variables

The coefficients are differences relative to the reference group

Male = 0

$$y = \beta_0 + \beta_1 (0) = \beta_0$$

Female = 1

$$y = \beta_0 + \beta_1 (1) = \beta_0 + \beta_1$$
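
Dummy coding can be inspected directly in R. The sketch below uses a made-up three-category predictor (the data and the names `group` and `y` are hypothetical) to show that R creates k − 1 dummy variables and that each coefficient is a difference from the reference group.

```r
# Hypothetical toy data: a categorical predictor with k = 3 categories
group <- factor(rep(c("a", "b", "c"), each = 4))
y <- c(10, 12, 11, 13, 15, 17, 16, 18, 20, 19, 21, 22)

# Design matrix: an intercept column plus k - 1 = 2 dummy variables
X <- model.matrix(~ group)
colnames(X)

fit <- lm(y ~ group)
group_means <- sapply(split(y, group), mean)

coef(fit)                          # intercept = mean of reference group "a"
group_means - group_means[["a"]]   # the same differences, computed directly
```

The first factor level is the reference group by default; `relevel()` changes it.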

Multiple Regression

Partial Effects

$$\text{salary}_i = \beta_0 + \beta_1 \text{pub}^c_i + \beta_2 \text{time}_i + e_i$$

Interpretations

Every one-unit increase in $X_1$ is associated with a $\beta_1$-unit increase in $Y$ when all other predictors are held constant
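
One way to see why the coefficient is a partial effect is to compare the expected salary of two hypothetical faculty members with the same time but centered publication counts of $p + 1$ and $p$ (the symbols $p$ and $t$ are introduced here only for illustration):

```latex
\begin{aligned}
& E(\text{salary} \mid \text{pub}^c = p + 1, \text{time} = t)
  - E(\text{salary} \mid \text{pub}^c = p, \text{time} = t) \\
&\quad = [\beta_0 + \beta_1 (p + 1) + \beta_2 t] - [\beta_0 + \beta_1 p + \beta_2 t] \\
&\quad = \beta_1
\end{aligned}
```

The time terms cancel because time is the same in both cases, which is what "held constant" means here.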

Transition to R

Interactions

Regression slope of a predictor depends on another predictor

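$$\widehat{\text{salary}} = 54238 + 105 \times \text{pub}_c + 964 \times \text{time}_c + 15 (\text{pub}_c)(\text{time}_c)$$

time = 7 $\rightarrow$ time_c = 0.21

$$\widehat{\text{salary}} = 54238 + 105 \times \text{pub}_c + 964 (0.21) + 15 (\text{pub}_c)(0.21) = 54440 + 108 \times \text{pub}_c$$

time = 15 $\rightarrow$ time_c = 8.21

$$\widehat{\text{salary}} = 54238 + 105 \times \text{pub}_c + 964 (8.21) + 15 (\text{pub}_c)(8.21) = 62152 + 228 \times \text{pub}_c$$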
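
The conditional regression lines can be recomputed from the fitted equation. This sketch plugs in the rounded coefficients shown on the slide; the helper `simple_line()` is a hypothetical name, not part of any package.

```r
# Rounded coefficients from the fitted interaction model
b0 <- 54238
b_pub <- 105
b_time <- 964
b_int <- 15

# Intercept and slope of salary on pub_c at a given value of time_c
simple_line <- function(time_c) {
  c(intercept = b0 + b_time * time_c,
    slope = b_pub + b_int * time_c)
}

simple_line(0.21)  # time = 7:  intercept 54440.44, slope 108.15
simple_line(8.21)  # time = 15: intercept 62152.44, slope 228.15
```

The slope for publications grows by 15 for each additional year since the degree, which is what the interaction coefficient means.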

modelsummary::msummary()

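| Term | M3 + Interaction: estimate (SE) |
|---|---|
| (Intercept) | 54238.1 (1183.0) |
| pub_c | 104.7 (98.4) |
| time_c | 964.2 (339.7) |
| pub_c × time_c | 15.1 (17.3) |
| Num.Obs. | 62 |
| R2 | 0.399 |
| R2 Adj. | 0.368 |
| AIC | 1291.8 |
| BIC | 1302.4 |
| Log.Lik. | −640.895 |
| F | 12.817 |
| RMSE | 7465.67 |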

Summary

Concepts

  • What is a statistical model

  • Linear/Multiple Regression

    • Centering

    • Categorical predictor

    • Interpretations

    • Interactions

Try replicating the examples in the Rmd file
