A Friendly Introduction to Longitudinal Analysis with Latent Variables

emontenegro1@csustan.edu

Esteban Montenegro-Montenegro, PhD

Psychology and Child Development

My aims today

  • Introduce the concept of latent variable.
  • Explain important concepts to jump into Structural Equation Modeling (SEM).
  • Present applied examples in Longitudinal SEM.

What is a latent variable?

Bollen & Hoyle (2012):

  • Hypothetical variables.
  • Traits.
  • Data reduction strategy?
  • Classification strategy?
  • A variable in regression analysis that is, in principle, unmeasurable.
  • Have you thought about how to measure intelligence?

  • What about the concept of “good performance”?

  • What does good health look like? Is it a concept?

Let’s see the graphics convention

What is a latent variable?

But wait… what is a statistical model?

Important

Models are theoretical reductions. They intend to explain how the data are produced by Nature.

What is the name of this type of model?

  • Latent variables like the ones mentioned today belong to a family of models named Structural Equation Modeling (SEM).

  • The word equation is there because we will estimate several equations at the same time.

  • Structural implies a set of equations we have to solve to unveil the relationship between variables.

  • Modeling means we will create models based on theory.

  • SEM was created to “confirm” theory. It is thought of as a confirmatory approach.

The family’s foundation

  • SEM is known as a “covariance-based approach”. This means we will use the concept of variance:
\[\begin{equation} \sigma^2 = \frac{\sum(x_{i}-\bar{x})^2}{n-1} \end{equation}\]
  • Also the concept of covariance is relevant here:
\[\begin{equation} cov(X,Y) = \frac{\sum(x_{i}-\bar{x})(y_{i} -\bar{y})}{n-1} \end{equation}\]
  • And why not, let’s throw correlation in here:
\[\begin{equation} cor(X,Y) = \frac{cov(X,Y)}{\sigma_{x} \sigma_{y}} \end{equation}\]
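
The three quantities above can be computed by hand and checked against NumPy. This is a minimal sketch, not part of the original slides: the scores in `x` and `y` are made-up numbers, and the variance is taken as the sum of squared deviations divided by n - 1.

```python
import numpy as np

# Hypothetical scores on two observed variables (made-up values).
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])
n = len(x)

# Sample variance: sum of squared deviations from the mean over n - 1.
var_x = np.sum((x - x.mean()) ** 2) / (n - 1)
var_y = np.sum((y - y.mean()) ** 2) / (n - 1)

# Sample covariance: sum of cross-products of deviations over n - 1.
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# Correlation: covariance rescaled by the two standard deviations.
cor_xy = cov_xy / (np.sqrt(var_x) * np.sqrt(var_y))

# The hand computations agree with NumPy's built-in estimators.
assert np.isclose(var_x, np.var(x, ddof=1))
assert np.isclose(cov_xy, np.cov(x, y)[0, 1])
assert np.isclose(cor_xy, np.corrcoef(x, y)[0, 1])

print(var_x, cov_xy, cor_xy)
```

The covariance matrix returned by `np.cov` is the kind of summary that a covariance-based SEM tries to reproduce.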

The family’s foundation

SEM relates to theories and ideas

Henseler (2020):

SEM relates to theories and ideas II

Reflective factors // Formative factors

Assumptions

  • Depends on how we estimate the model.
  • Multivariate normal data generating process.
  • Local independence: only the latent factor explains the indicators.
  • Sample size should be large enough to estimate the model.
  • Multicollinearity is a problem.

Let’s formally define the SEM model

\[\begin{equation} \Sigma = \Lambda \Psi \Lambda' + \Theta \end{equation}\]

Where:

  • \(\Sigma\) = The model-implied (estimated) covariance matrix.

  • \(\Psi\) = Matrix of variances and covariances among the latent factors.

  • \(\Lambda\) = Matrix of factor loadings.

  • \(\Theta\) = Matrix of unique (residual) variances of the observed indicators.

  • In SEM we also have another model for the implied means:

\[\begin{equation} y = \tau + \Lambda\eta + \epsilon \end{equation}\]
  • Does it look familiar to you?
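
To make the matrices concrete, here is a small sketch (not from the original slides) that builds the model-implied covariance matrix and the implied means for a hypothetical one-factor model with three indicators. All loadings, variances, intercepts, and the value of `latent_mean` are made-up assumptions chosen for easy arithmetic.

```python
import numpy as np

# Hypothetical one-factor model with three indicators (all values made up).
Lambda = np.array([[0.8], [0.7], [0.6]])  # factor loadings, 3 x 1
Psi = np.array([[1.0]])                   # latent factor variance, 1 x 1
Theta = np.diag([0.36, 0.51, 0.64])       # unique variances, 3 x 3

# Model-implied covariance matrix: Sigma = Lambda Psi Lambda' + Theta.
Sigma = Lambda @ Psi @ Lambda.T + Theta

# Implied means: taking expectations of y = tau + Lambda eta + epsilon,
# with E[epsilon] = 0, gives tau + Lambda E[eta].
tau = np.array([2.0, 2.5, 3.0])           # indicator intercepts (assumed)
latent_mean = np.array([0.5])             # E[eta] (assumed)
mu = tau + Lambda @ latent_mean

print(Sigma)
print(mu)
```

With these values each indicator has an implied variance of 1, so the off-diagonal entries of `Sigma` (products of loadings) can be read as implied correlations. Estimation then amounts to finding the parameter values that bring `Sigma` and `mu` as close as possible to the sample covariance matrix and sample means.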

Let’s formally define the SEM model II

Just a remark about model fit

Positive and Negative example

Longitudinal SEM Models

Longitudinal SEM

  • In Longitudinal SEM we can model changes over time in different ways.

  • It depends on the question the researcher wants to address.

    • Longitudinal Confirmatory Factor Analysis (LCFA)

    • Panel models.

    • Latent growth models.

    • Spline models.

    • Random Intercept Panel Model.

    • Latent Change Score Model.

Longitudinal CFA: Checking assumptions

Little (2013)

Configural Invariance Model

Longitudinal CFA II

Weak Invariance Model

Longitudinal CFA III

Strong Invariance Model

Panel Models

Little (2013)

Panel Model Example

Random Intercept Panel Models

Mulder & Hamaker (2020):

RI Model

Random Intercept Panel Models II

Asebedo et al. (2022)

RI Model Example

Latent Growth Models

  • Latent Growth Models are an extension of Multilevel Growth Models.
  • LGM is useful to predict trajectories. We can add paths that explain the growth, or use the growth (change) itself as a predictor.

Mean structure model:

\[\begin{equation} y_{ti} = \lambda_{1t} \eta_{1i} + \lambda_{2t} \eta_{2i} + \epsilon_{ti} \end{equation}\]

The most relevant part is the latent mean:

\[\begin{align} \eta_{1i} &= \alpha_{1} + \zeta_{1i}\\ \eta_{2i} &= \alpha_{2} + \zeta_{2i} \end{align}\]
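
The equations above can be illustrated by simulating data from a linear growth model. This is only a sketch under stated assumptions, not an analysis from the talk: the intercept loadings are fixed at 1, the slope loadings at 0, 1, 2, 3 (linear growth over four waves), and the latent means in `alpha`, the latent covariance matrix `Psi`, and the residual standard deviation are made-up values.

```python
import numpy as np

rng = np.random.default_rng(2024)
n_people, n_waves = 500, 4

# Loadings: column 1 (intercept) fixed at 1, column 2 (slope) fixed at 0..3.
Lambda = np.column_stack([np.ones(n_waves), np.arange(n_waves)])  # 4 x 2

alpha = np.array([10.0, 2.0])              # latent means: intercept, slope (assumed)
Psi = np.array([[4.0, 0.5],
                [0.5, 1.0]])               # covariance of zeta (assumed)

# eta_i = alpha + zeta_i: each person gets their own intercept and slope.
zeta = rng.multivariate_normal(np.zeros(2), Psi, size=n_people)
eta = alpha + zeta                         # n_people x 2

# y_ti = lambda_1t * eta_1i + lambda_2t * eta_2i + epsilon_ti
eps = rng.normal(0.0, 1.0, size=(n_people, n_waves))
y = eta @ Lambda.T + eps                   # n_people x n_waves

# Model-implied mean trajectory, compared with the simulated sample means.
implied_means = Lambda @ alpha
print(implied_means)
print(y.mean(axis=0))
```

Each row of `y` is one person's trajectory. The latent means in `alpha` describe the average starting point and the average change per wave, while `Psi` captures how much individuals differ around that average trajectory.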

Latent Growth Model

Latent Growth Models II

  • In Growth Models we’ll get a trajectory for each participant or observation. The repeated measures are nested within each participant.

Estimated Trajectories

Latent Growth Models: Another Example

Math Score Growth Model

Math Score Growth Model Trajectory

Why should we use SEM?

  • SEM has good properties:
    • It helps to model several hypotheses at the same time.
    • It has flexibility to estimate multigroup models.
    • You have several estimators to choose from.
    • The model accounts for the error variance in your items.
    • You can model non-linearity (we didn’t talk about it).
    • You can estimate multilevel models and divide the variance (RI panel model).
    • It is fun!

Why should we use SEM?

  • SEM also has some problems:
    • Requires a fair number of observations in most cases (frequentist approach).
    • People are afraid of using it! I don’t know why.
    • There is a chance of misfit in the model. Models with many parameters tend to be prone to misspecification.
    • Large models are more likely to be far from the true data-generating process.

When should we use SEM?

  • We should use SEM when the data-generating process is truly latent. This applies to many variables in psychology, sociology, and other social sciences, and also in the health sciences.

  • SEM is good at treating missing data under the right assumptions.

  • In fact, the missing values can be imputed when treated as a latent variable.

What is left for the next talk?

  • How the estimation of SEM takes place.
  • Bayesian Inference vs. Frequentist Inference.
  • More about different estimators.
  • Advanced modeling approaches.
  • Missing data modeling.
  • Latent Class Analysis.
  • Item Response Theory Models (IRT).
  • More on Dynamic Modeling.

And…more memes for sure.

Thanks for not falling asleep!

References

Asebedo, S. D., Quadria, T. H., Chen, Y., & Montenegro-Montenegro, E. (2022). Individual differences in personality and positive emotion for wealth creation. Personality and Individual Differences, 199, 111854.
Bollen, K. A., & Hoyle, R. H. (2012). Latent variables in structural equation modeling. In Handbook of structural equation modeling (pp. 56–67). The Guilford Press.
Henseler, J. (2020). Composite-based structural equation modeling: Analyzing latent and emergent variables. Guilford Publications.
Little, T. D. (2013). Longitudinal structural equation modeling. Guilford Press.
Mulder, J. D., & Hamaker, E. L. (2020). Three Extensions of the Random Intercept Cross-Lagged Panel Model. Structural Equation Modeling: A Multidisciplinary Journal, 1–11. https://doi.org/10.1080/10705511.2020.1784738