#mixed models

Introduction to Multilevel Models

Examples of multilevel data

  • Effect of school environment on exam results
    • Design: hierarchical, where the examination results of a random sample of students within a random sample of schools are compared
  • Influence of race and sex on fetal heartbeat during pregnancy
    • Design: repeated measurements on different gestational ages during pregnancy, where the gestational ages were not the same for all women
  • Hierarchical Structure: The school environment example involves data nested within two levels (students within schools); a multi-centre trial similarly nests patients within centres. This structure is typical in many research areas and requires analytical techniques that properly account for the dependencies within levels.
  • Repeated Measures: The fetal heartbeat example involves repeated observations over time, another common form of multilevel data. Here the correlation is temporal: measurements taken from the same individual at different times are correlated.

Hierarchical Structure of Data

Multilevel or hierarchical data arise when observations are nested within higher-level units, forming a structure of levels. Common examples include:

  • Children within (classrooms within) schools: Here, the individual children are the lowest level (Level 1), classrooms are an intermediate level, and schools represent a higher level (Level 2 or 3, depending on whether classrooms are considered a separate level).
  • Patients within centers: Patients (Level 1) are nested within treatment or healthcare centers (Level 2).
  • Measurements within patients: When multiple measurements are taken from the same patient over time, the individual measurements (Level 1) are nested within patients (Level 2).

Variation at All Levels

Multilevel data exhibit variation at each level of the hierarchy. For instance, students within the same classroom may perform differently on tests (variation at Level 1), and average classroom performances may vary across schools (variation at Level 2). Understanding and modeling this variation at different levels is crucial for accurate analysis.

Units Within a Level Expected to Be Correlated

Observations within the same group (or level) are expected to be more similar to each other than to observations in different groups due to shared environment, genetics, or other common factors. For example, children in the same school might have similar educational outcomes due to shared teachers, resources, and school culture.
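This within-group similarity can be quantified by the intraclass correlation coefficient (ICC): the share of the total variance that sits at the group level. A minimal numpy sketch, using simulated (hypothetical) school data and a simple one-way ANOVA estimator of the variance components:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate J schools with n students each: score = school effect + noise
J, n = 30, 25
sigma_u, sigma_e = 1.0, 2.0                      # between- and within-school SDs
school_effect = rng.normal(0, sigma_u, J)
scores = school_effect[:, None] + rng.normal(0, sigma_e, (J, n))

# One-way ANOVA estimator of the variance components
school_means = scores.mean(axis=1)
msw = scores.var(axis=1, ddof=1).mean()          # within-school mean square
msb = n * school_means.var(ddof=1)               # between-school mean square
sigma_u2_hat = max((msb - msw) / n, 0.0)         # estimated between-school variance
icc = sigma_u2_hat / (sigma_u2_hat + msw)        # intraclass correlation

print(f"estimated ICC = {icc:.3f}")              # true ICC here is 1 / (1 + 4) = 0.2
```

An ICC near zero means grouping barely matters; the larger it is, the more misleading an analysis that ignores the grouping becomes.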

Variables Can Be Measured at Different Levels

In multilevel data, variables (also called predictors or covariates) can pertain to different levels of the hierarchy, which allows for a rich analysis of how factors at various levels influence the outcome of interest.

  • Level 2 Variables are characteristics of the higher-level units. Examples include:
    • The type of school (mixed vs. single-gender), which is a characteristic of the school (Level 2).
    • Whether a patient is treated at a university hospital or a community hospital, a characteristic of the treatment center (Level 2).
  • Level 1 Variables are attributes of the individual units within the higher-level units. Examples include:
    • The reading ability of a child at intake, a characteristic of the individual child (Level 1).
    • The gender of a patient, an individual characteristic (Level 1).

Example

The London Schools example illustrates a comprehensive study on the relationship between several factors and academic achievement, as measured by exam scores, among students in Inner London schools. This study, conducted by Goldstein, Rasbash, et al. in 1993, covers 4,059 children across 65 schools. It explores complex questions regarding educational outcomes using multilevel data, showcasing how various analytical approaches can be employed to understand the influences on exam scores. Let’s break down the key components of this example:

Study Objectives

  • Main Question: Is there a relationship between exam achievement and various factors such as intake achievement level, pupil gender, school type, and overall exam achievement of the school?
  • Subquestion: Specifically, does the performance differ for girls attending mixed versus all-girls schools?

Variables in the Dataset

  • School ID: Unique identifier for each school.
  • Student ID: Unique identifier for each student.
  • Normalized Exam Score: The primary outcome variable, representing students' exam performance adjusted to a common scale.
  • Standardized LRT (London Reading Test) Score: A standardized measure of reading ability.
  • Student Gender: The gender of the student.
  • School Gender: Indicates whether the school is mixed, boys-only, or girls-only.
  • School Average of Intake Score: The average initial assessment score of all students upon entering the school.
  • Student-Level Verbal Reasoning (VR) Score Category at Intake: Categorization of students based on their verbal reasoning scores at intake.
  • Category of Students’ Intake Score (Averaged): A group measure, likely representing the overall intake score distribution within schools or classrooms.

Analytical Approaches

To analyze the relationship between exam scores and various predictors, several analytical strategies are proposed, each with its implications for understanding the data:

  1. Linear Regression (Aggregated Data): This approach uses the mean exam score and mean LRT score for each school. It treats each school as a single data point, discarding within-school variability.

    Disadvantages:

    • Every school (regardless of sample size) given equal weight
    • School-level variables possible, but not child-level variables
    • We can only make inference at school level, not child level
    • Possibility of Ecological Fallacy: The ecological fallacy occurs when inferences about individual-level relationships are drawn from group-level data; an association observed between school means need not hold for individual students

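The aggregated approach can be sketched in a few lines of numpy (simulated stand-in data, not the actual London dataset): average within each school, then regress one set of means on the other. Note how the effective sample size collapses to the number of schools.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 65 schools, varying numbers of students per school
J = 65
sizes = rng.integers(20, 120, J)
school_id = np.repeat(np.arange(J), sizes)
lrt = rng.normal(0, 1, sizes.sum())                    # student LRT score
school_effect = rng.normal(0, 0.5, J)
exam = 0.6 * lrt + school_effect[school_id] + rng.normal(0, 1, sizes.sum())

# Aggregate: one (mean LRT, mean exam) point per school
mean_lrt = np.array([lrt[school_id == j].mean() for j in range(J)])
mean_exam = np.array([exam[school_id == j].mean() for j in range(J)])

slope, intercept = np.polyfit(mean_lrt, mean_exam, 1)
print(f"fit on n = {J} school means, not {sizes.sum()} students")
```

Every school contributes exactly one point regardless of whether it enrols 20 or 120 students, which is precisely the equal-weighting problem listed above.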

  2. Linear Regression (Disaggregated Data): This method analyzes all individual student scores together, without explicitly accounting for the nesting of students within schools, which may ignore important school-level influences.


    Disadvantages:

    • Inflates Sample Size, Especially for Level-2 Variables

    Disaggregating data means that all individual-level observations are treated as if they were independent. In the context of hierarchical data, this artificially inflates the sample size because it counts each student as a separate data point without accounting for the fact that students are nested within schools. This inflation is especially problematic for variables at Level-2 (e.g., school-level variables) because the actual number of independent observations at this level is the number of schools, not the number of students.

    • Underestimation of Standard Errors (SEs) for Level-2 Variables

    Because the sample size is artificially inflated, the standard errors for variables that describe groups (Level-2 variables, such as school type) tend to be underestimated. This underestimation of standard errors leads to p-values that are too small and confidence intervals that are too narrow. Consequently, there is an increased risk of committing Type I errors, where researchers might falsely conclude that a significant relationship exists when it does not.

    • Standard Errors for Level-1 Variables May Be Over- or Underestimated

    For variables at the individual level (Level-1), the impact of disaggregation on standard errors can be variable. In some cases, standard errors might be underestimated, leading to the same issues as with Level-2 variables. In other cases, they might be overestimated if the model fails to account for the extra variability introduced by ignoring the grouping structure.

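The underestimation of standard errors for Level-2 variables can be made concrete with a small simulation (hypothetical numbers, numpy only): a school-level predictor with no true effect is tested using the naive standard error that treats every student as independent, and the nominal 5% Type I error rate is badly exceeded.

```python
import numpy as np

rng = np.random.default_rng(1)

J, n = 20, 30                    # 20 schools, 30 students each
reps, rejections = 500, 0

for _ in range(reps):
    x_school = np.repeat([0, 1], J // 2)              # school-level binary predictor
    u = rng.normal(0, 1.0, J)                         # school random effects
    y = np.repeat(u, n) + rng.normal(0, 1.0, J * n)   # NO true effect of x on y
    x = np.repeat(x_school, n)

    # Naive two-group comparison treating all J*n students as independent
    diff = y[x == 1].mean() - y[x == 0].mean()
    s2 = (y[x == 1].var(ddof=1) + y[x == 0].var(ddof=1)) / 2
    se_naive = np.sqrt(s2 * (2 / (J * n / 2)))
    if abs(diff / se_naive) > 1.96:
        rejections += 1

rate = rejections / reps
print(f"empirical Type I error rate = {rate:.2f} (nominal 0.05)")
```

The correct yardstick for a school-level contrast is the 20 schools, not the 600 students; the naive standard error is far too small, so null effects are "significant" most of the time.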

  3. Linear Regression per School (Stratified Analysis): Separate regressions are run for each school, allowing for school-specific estimates but complicating the synthesis of results across schools.

    Disadvantages of the Approach:

    • Combining Results: With 65 separate regressions, synthesizing the results into a coherent conclusion about the overall relationship across all schools is challenging. Each school's regression gives a piece of the puzzle, but understanding the full picture requires careful consideration.
    • Equal Weighting: In summarizing across schools, each school's results are given equal weight, regardless of differences in sample sizes or variability within schools. This can skew the overall interpretation if, for instance, smaller schools with more extreme results unduly influence the average.
    • Correctness of Standard Error: There's a question of whether the standard error for parameter estimates (intercepts and slopes) is accurately captured in this analysis. Since each regression is run separately, the standard errors reflect within-school variability but might not adequately account for between-school variability.
    • Variable Levels: This approach allows for the analysis of child-level variables within each school but does not directly accommodate variables at the school level, such as school type or overall school performance. This limitation means potentially important predictors of the outcome variable are not included in the models.

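A stratified analysis can be sketched as one regression per school (again on simulated stand-in data). The output is 65 separate slope estimates, which illustrates the combining problem: each school's fit is a piece of the puzzle, and a naive average weights small and large schools equally.

```python
import numpy as np

rng = np.random.default_rng(2)

J = 65
slopes = []
for j in range(J):
    n_j = rng.integers(20, 120)                  # school sizes differ
    lrt = rng.normal(0, 1, n_j)
    exam = 0.6 * lrt + rng.normal(0, 1, n_j)     # common true slope of 0.6
    slope_j, _ = np.polyfit(lrt, exam, 1)
    slopes.append(slope_j)

slopes = np.array(slopes)
# 65 estimates; an unweighted summary ignores differences in precision
print(f"mean slope = {slopes.mean():.2f}, SD across schools = {slopes.std():.2f}")
```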

  4. Linear Regression with Main Effects and Interactions (Fully Stratified Model): This model includes all students and allows for different intercepts and slopes across schools, offering a nuanced view that accounts for both individual and school-level variability.

    Advantage:

    • Inclusion of Multiple Variable Levels: This model's primary advantage is its capacity to include variables at both the individual (child) level and the group (school) level. This is significant because it enables the analysis to account for the effects of individual characteristics (like a student's verbal reasoning score) and group characteristics (like school type) on the outcome variable simultaneously.
    • Normally Distributed Residuals: The residuals (the differences between observed and predicted values) are assumed to be normally distributed with constant variance around each school's regression line. This assumption of homoscedasticity (constant variance) and normality is crucial for the validity of the regression analysis, as it supports the reliability of the confidence intervals and hypothesis tests on the regression coefficients.

    Disadvantages:

    • Complexity in Interpretation with Multiple Intercepts and Slopes: Although the goal is a single intercept and slope for the entire dataset, this model produces 64 school-specific intercept offsets (relative to a reference school) and 64 interaction terms (differences in slope from the reference school). This vastly increases the complexity of the model and makes interpretation challenging.
    • Reference Category Issues: Selecting a reference category becomes problematic with so many schools. The choice of reference school can affect the interpretation of the interaction terms, as these terms represent differences from the reference category.
    • Generalizability Concerns: Since the model is tailored to the specific characteristics and variances of the 65 schools in the dataset, generalizing the findings beyond these schools is difficult. The model is essentially overfitted to this particular set of schools, limiting its applicability to other contexts.
    • Increased Demand on Degrees of Freedom: The model uses an additional 128 degrees of freedom (df) for the intercepts and slopes (64+64, excluding the reference category). In statistical modeling, each parameter estimated consumes one degree of freedom, and using too many degrees of freedom can lead to overfitting, where the model describes random error or noise rather than the underlying relationship.
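The degrees-of-freedom cost is easy to verify by building the design matrix explicitly (hypothetical simulated data): one overall intercept, one overall LRT slope, 64 school dummies, and 64 school-by-LRT interactions give 130 columns in total.

```python
import numpy as np

J, n = 65, 10
school = np.repeat(np.arange(J), n)
lrt = np.random.default_rng(3).normal(0, 1, J * n)

# Dummy-code schools with school 0 as the reference category
dummies = (school[:, None] == np.arange(1, J)).astype(float)   # 64 columns
interactions = dummies * lrt[:, None]                          # 64 columns

X = np.column_stack([np.ones(J * n), lrt, dummies, interactions])
print(X.shape)   # (650, 130): 2 common params + 64 intercept offsets + 64 slope offsets
```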
  5. Linear Mixed Model: This approach models both fixed effects (e.g., student gender, LRT score) and random effects (e.g., variation across schools), accommodating the hierarchical structure of the data and providing a holistic understanding of the factors influencing exam scores.

    Advantages of Linear Mixed Models:

    • Correct Sample Size Utilization: LMMs correctly handle the data's hierarchical structure, accounting for the correlation among students within the same schools. This approach provides accurate standard errors (SEs), p-values, and confidence intervals (CIs), leading to more reliable statistical inferences.
    • Simplified Model Complexity: Unlike models that require a separate intercept and slope for each school (leading to a very high number of parameters), LMMs can capture the variability across schools using random effects. This significantly reduces the number of parameters needed (e.g., eliminating the need for 64 main effects and interactions for each school beyond the reference).
    • Capturing Variability with Variance Components: The differences between schools are modeled as variance components in random effects. This method allows the model to account for school-level variability without explicitly fitting separate parameters for each school.
    • Inclusion of Multiple Levels of Variables: LMMs can incorporate variables at both the student level (Level 1) and the school level (Level 2) simultaneously, allowing for a comprehensive analysis that considers the influence of factors across levels.
    • Modeling Interactions: The framework supports modeling interactions between child- and school-level variables, providing insights into how the effects of predictors may differ across different contexts.
    • Handling Missing Data: LMMs are well-suited to dealing with missing data, particularly in longitudinal studies where some data points might be missing across time points.

    Components of Mixed Models:

    • Fixed Effects: These are the effects that are consistent across all units of analysis. In this context, fixed effects could include the overall intercept, overall slope for London Reading Test (LRT) scores, gender, type of school, achievement level of the school, etc. The model seeks to estimate coefficients for these fixed effects to understand how they influence the exam scores on average across all schools.
    • Random Effects: These effects allow for variation across higher-level units (schools, in this case). A random intercept for each school allows the starting point (intercept) of the regression line to vary from school to school. Similarly, a random slope for LRT allows the impact of LRT scores on exam performance to vary across schools. The variability in intercepts and slopes across schools is modeled as coming from normal distributions, simplifying the model by estimating variances of these distributions rather than fitting separate intercepts and slopes for each school.
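The random-intercept idea can be sketched without any modelling library: estimate the two variance components, then shrink each school's raw mean toward the grand mean, which is essentially what the predicted random intercept (the BLUP) does. A numpy sketch on simulated data, not a full mixed-model fit:

```python
import numpy as np

rng = np.random.default_rng(4)

J, n = 65, 30
sigma_u, sigma_e = 0.5, 1.0
u = rng.normal(0, sigma_u, J)                        # true school effects
scores = u[:, None] + rng.normal(0, sigma_e, (J, n))

grand_mean = scores.mean()
school_means = scores.mean(axis=1)

# ANOVA-style estimates of the within- and between-school variance components
s2_within = scores.var(axis=1, ddof=1).mean()
s2_between = max(n * school_means.var(ddof=1) - s2_within, 0.0) / n

# Shrinkage factor: the reliability of each school's mean
k = s2_between / (s2_between + s2_within / n)
predicted = grand_mean + k * (school_means - grand_mean)

# The predicted school effects are pulled toward the grand mean
assert predicted.std() < school_means.std()
print(f"shrinkage factor = {k:.2f}")
```

Unreliable school means (small schools, noisy data) are shrunk harder toward the overall mean, which is how the mixed model borrows strength across schools instead of fitting 65 free intercepts.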

    Summary:

    Linear Mixed Models offer a powerful and flexible approach to analyzing hierarchical data, like the London Schools dataset. By efficiently handling the nested structure, incorporating variables at different levels, and reducing the complexity of the model through random effects, LMMs provide nuanced insights into the factors affecting student performance across schools.

    Conclusion

    The linear mixed-effects model provides valuable insights into the factors influencing academic performance in London schools. By considering both individual and school-level variables, the analysis sheds light on the complex interplay of influences shaping student outcomes. This statistical exploration not only contributes to academic discussions but also has practical implications for educators, policymakers, and researchers aiming to enhance educational achievement.