Generalized Estimating Equations: Concept and Application 

Posted by: Kultar Singh

Generalized estimating equations (GEE) can be used to analyze longitudinal data, repeated measurements, or clustered observations. GEE is an extension of generalized linear models (GLM) to correlated data.

Generalized linear models assume that observations are independent, but this assumption rarely holds when data are longitudinal. Observations taken on the same individual are likely to be more similar to each other than to observations taken on different individuals.

So how should we handle this situation? One option is to use a generalized linear mixed model with random intercept and slope terms for every individual. The resulting estimates are subject-specific: they describe the impact of a variable on the outcome for a particular individual. This approach is less helpful if you are interested in the marginal impact, i.e., the effect of a variable on the outcome in the population as a whole.

Hence, if you want to answer questions about the population, you can keep the generalized linear model structure and fit it using generalized estimating equations (GEE). This method produces population-averaged effects while accounting for the fact that observations within an individual are more alike than observations from different individuals.

Advantages and Limitations:  

  • Generalized Estimating Equations is a technique for analyzing clustered or longitudinal data. It is often used with non-normal outcomes, such as count or binary data. The model is defined by a set of equations that must be solved to obtain the parameter estimates.
     
  • A significant advantage of GEE is that it does not require the full multivariate (joint) distribution of the responses to be specified. It is also simpler to compute than maximum likelihood estimation (MLE), particularly for categorical outcomes.
      
  • A limitation is that likelihood-based methods are not available for testing model fit, comparing models, or making inferences about parameters. Because GEE does not fully specify the joint distribution, there is no valid likelihood function. Hence, many view it not as a model but as an estimation technique.

  • Furthermore, the empirically derived standard errors tend to underestimate the true ones unless the sample size, in particular the number of clusters, is large.
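
The "set of equations" mentioned in the first point can be written compactly. The notation below follows the usual textbook presentation and is an assumption of this sketch, since the post itself does not define symbols: K is the number of clusters (or subjects), y_i is the response vector for cluster i, mu_i(beta) is its mean under the model, A_i is a diagonal matrix of variance functions, R_i(alpha) is the working correlation matrix, and phi is a dispersion parameter.

```latex
U(\beta) = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}},
\qquad
V_i = \phi \, A_i^{1/2} R_i(\alpha) \, A_i^{1/2}
```

Only the mean and the working covariance V_i appear in these equations, which is why no joint distribution, and therefore no likelihood, is needed.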

GEE estimation   

When the dependent variable is normally distributed (with an identity link) and the working correlation between responses is assumed to be zero, the GEE coefficient estimates coincide with those of Ordinary Least Squares.
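
This relationship can be checked numerically. The sketch below is a simplified illustration rather than a full GEE implementation: it assumes a normal outcome with an identity link, equal cluster sizes, and a fixed working correlation matrix `R`, in which case the estimating equations have a closed-form solution. The function name `gee_gaussian_identity` and the simulated data are hypothetical.

```python
import numpy as np

def gee_gaussian_identity(X, y, groups, R):
    """Solve sum_i X_i' R^{-1} (y_i - X_i beta) = 0 for beta.

    Assumes an identity link, a fixed working correlation R,
    and clusters that all have the same size as R.
    """
    R_inv = np.linalg.inv(R)
    p = X.shape[1]
    lhs = np.zeros((p, p))
    rhs = np.zeros(p)
    for g in np.unique(groups):
        rows = groups == g
        Xi, yi = X[rows], y[rows]
        lhs += Xi.T @ R_inv @ Xi
        rhs += Xi.T @ R_inv @ yi
    return np.linalg.solve(lhs, rhs)

# Simulated clustered data (hypothetical): 50 clusters of 4 observations.
rng = np.random.default_rng(42)
n_clusters, m = 50, 4
groups = np.repeat(np.arange(n_clusters), m)
x = rng.normal(size=n_clusters * m)
X = np.column_stack([np.ones_like(x), x])
cluster_effect = np.repeat(rng.normal(size=n_clusters), m)
y = 1.0 + 2.0 * x + cluster_effect + rng.normal(size=n_clusters * m)

# Independence working correlation: R is the identity matrix.
beta_gee = gee_gaussian_identity(X, y, groups, R=np.eye(m))

# Ordinary least squares on the pooled data.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_gee)
print(beta_ols)
```

The two coefficient vectors agree up to floating-point error. The standard errors can still differ, because GEE typically reports robust standard errors that allow for within-cluster correlation.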

In terms of interpretation, a GEE coefficient describes the change in the mean response (on the scale of the link function), averaged over the population, for every unit increase in a covariate.

GEE and multilevel model  

The GEE model differs from the multilevel model used in longitudinal or clustered data analysis.  

Its main distinction is that it is a marginal model: the objective is to model the population-average response. Mixed-effects and multilevel models are essentially subject-specific, or conditional, models. They allow us to estimate parameters for each cluster or subject.

As a result, their parameter estimates are interpreted conditional on the cluster or subject. Mixed-effects models can also provide population-level estimates, but these are obtained by averaging over the individual units or subjects.
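
The gap between subject-specific and population-averaged effects can be made concrete with a small simulation. The sketch below assumes a logistic model with a normally distributed random intercept; the subject-specific coefficient (1.0), the random-intercept standard deviation (2.0), and the number of simulated subjects are arbitrary values chosen for illustration.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

rng = np.random.default_rng(1)

beta_conditional = 1.0  # subject-specific log-odds ratio (assumed)
sigma = 2.0             # SD of the random intercepts (assumed)

# One random intercept per simulated subject.
b = rng.normal(0.0, sigma, size=200_000)

# Population-average probabilities at x = 0 and x = 1, obtained by
# averaging each subject's probability over the random intercepts.
p0 = expit(0.0 + b).mean()
p1 = expit(beta_conditional + b).mean()

# Population-averaged (marginal) log-odds ratio.
beta_marginal = logit(p1) - logit(p0)

print(beta_conditional, beta_marginal)
```

In this setup the marginal log-odds ratio comes out smaller than the subject-specific one. Both are valid answers to different questions, so the choice between GEE and a mixed model depends on which effect is of interest.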

Kultar Singh – Chief Executive Officer, Sambodhi
