Generalized Estimating Equations, Second Edition
Inclusion of code for many of the analyses is an excellent feature. Also, the number of exercises increased significantly …. For those who want to use this book in the classroom, including me, having extra exercise sets is certainly a welcome addition. It can serve as supplemental reading in longitudinal data analysis classes as well. Praise for the First Edition: The book contains challenging problems in exercises and is suitable to be a textbook in a graduate-level course on estimating functions. The references are up-to-date and exhaustive.
I find it to be a good reference text for anyone using generalized linear models (GLIM). The authors do a good job of not only presenting the general theory of GEE models, but also giving explicit examples of various correlation structures and link functions, along with a comparison between population-averaged and subject-specific models. Furthermore, there are sections on the analysis of residuals, deletion diagnostics, goodness-of-fit criteria, and hypothesis testing.
Generalized Estimating Equations
Good data-driven examples that give comparisons between different GEE models are provided throughout the book. Perhaps the greatest strength of this book is its completeness. It is a thorough compendium of information from the GEE literature. I believe that it serves as a valuable reference for researchers, teachers, and students who study and practice GLIM methodology. This book is easy to read, and it assumes that the reader has some background in GLM.
Many examples are drawn from biomedical studies and survey studies, and so it provides good guidance for analysing correlated data in these and other areas.
The model-based estimator of the covariance matrix of the estimated regression parameters $\hat{\beta}$ is a consistent estimator if the mean model and the working correlation matrix are correctly specified. The empirical, or robust, estimator has the property of being a consistent estimator of the covariance matrix of $\hat{\beta}$ even if the working correlation matrix is misspecified, that is, if $\mathrm{Cov}(Y_i) \neq V_i$, where $V_i$ is the working covariance matrix of the response vector $Y_i$ for cluster $i$.
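See Zeger, Liang, and Albert; Royall; and White for further information about the robust variance estimate. In computing the robust estimate, the regression parameters $\beta$ and the dispersion parameter $\phi$ are replaced by estimates, and $\mathrm{Cov}(Y_i)$ is replaced by the estimate $(Y_i - \mu_i(\hat{\beta}))(Y_i - \mu_i(\hat{\beta}))'$.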
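The difference between the two covariance estimators can be seen in the simplest possible case. The following is a minimal sketch, assuming an intercept-only mean model with identity link and an independence working correlation; the function name, the symbols `i0` and `i1` (the "bread" and "meat" of the sandwich), and the moment estimate of the dispersion are illustrative assumptions, not PROC GENMOD's implementation.

```python
def intercept_only_gee_variances(clusters):
    """Return (beta_hat, model_based_var, robust_var).

    Assumes an intercept-only mean model, identity link, and an
    independence working correlation, so D_i is a column of ones and
    V_i = phi * I for every cluster i.
    """
    n_total = sum(len(y) for y in clusters)
    beta_hat = sum(sum(y) for y in clusters) / n_total

    # Residuals y_ij - mu_ij(beta_hat), kept grouped by cluster.
    resid = [[v - beta_hat for v in y] for y in clusters]

    # Moment estimate of the dispersion (assumption: divisor N - p, p = 1).
    phi_hat = sum(r * r for rs in resid for r in rs) / (n_total - 1)

    # "Bread": I0 = sum_i D_i' V_i^{-1} D_i = N / phi.
    i0 = n_total / phi_hat

    # "Meat": I1 = sum_i D_i' V_i^{-1} r_i r_i' V_i^{-1} D_i
    #            = sum_i (sum_j r_ij)^2 / phi^2,
    # with Cov(Y_i) replaced by the outer product of cluster residuals.
    i1 = sum(sum(rs) ** 2 for rs in resid) / phi_hat ** 2

    model_based_var = 1.0 / i0      # inverse of I0
    robust_var = i1 / i0 ** 2       # I0^{-1} I1 I0^{-1}
    return beta_hat, model_based_var, robust_var
```

In this special case the dispersion cancels out of the robust estimate, which depends only on the within-cluster sums of residuals; the model-based estimate, by contrast, ignores how residuals cluster together.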
If the responses are binary (that is, they take only two values), then there is an alternative method to account for the association among the measurements. The alternating logistic regressions (ALR) algorithm of Carey, Zeger, and Diggle models the association between pairs of responses with log odds ratios, instead of with correlations, as ordinary GEEs do. For binary data, the correlation between the $j$th and $k$th response in cluster $i$ is, by definition, $\mathrm{Corr}(Y_{ij}, Y_{ik}) = \dfrac{\Pr(Y_{ij}=1, Y_{ik}=1) - \mu_{ij}\mu_{ik}}{\sqrt{\mu_{ij}(1-\mu_{ij})\,\mu_{ik}(1-\mu_{ik})}}$, where $\mu_{ij} = \Pr(Y_{ij}=1)$.
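The joint probability in the numerator satisfies the following bounds, by elementary properties of probability, since $\mu_{ij} = \Pr(Y_{ij}=1)$: $\max(0, \mu_{ij}+\mu_{ik}-1) \le \Pr(Y_{ij}=1, Y_{ik}=1) \le \min(\mu_{ij}, \mu_{ik})$. The correlation, therefore, is constrained to be within limits that depend in a complicated way on the means of the data. The ALR algorithm seeks to model the logarithm of the odds ratio, $\gamma_{ijk} = \log \mathrm{OR}(Y_{ij}, Y_{ik})$, as $\gamma_{ijk} = z_{ijk}'\alpha$, where $\alpha$ is a vector of regression parameters and $z_{ijk}$ is a fixed, specified vector of coefficients.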
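The constraint on the correlation can be made concrete by pushing the bounds on the joint probability through the definition of the correlation. The sketch below assumes two marginal means strictly between 0 and 1; the function name is hypothetical.

```python
import math


def binary_correlation_bounds(mu_j, mu_k):
    """Return (lower, upper) limits on Corr(Y_j, Y_k) for binary responses
    with means mu_j = Pr(Y_j = 1) and mu_k = Pr(Y_k = 1)."""
    if not (0.0 < mu_j < 1.0 and 0.0 < mu_k < 1.0):
        raise ValueError("means must lie strictly between 0 and 1")

    # Bounds on the joint probability Pr(Y_j = 1, Y_k = 1).
    p_low = max(0.0, mu_j + mu_k - 1.0)
    p_high = min(mu_j, mu_k)

    # Denominator of the correlation: product of the standard deviations.
    scale = math.sqrt(mu_j * (1.0 - mu_j) * mu_k * (1.0 - mu_k))

    lower = (p_low - mu_j * mu_k) / scale
    upper = (p_high - mu_j * mu_k) / scale
    return lower, upper
```

With equal means of 0.5 the full range from -1 to 1 is attainable, but with means of 0.1 and 0.9 the upper limit falls well below 1, which is the dependence on the means that motivates modeling log odds ratios instead.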
The log odds ratio parameter $\gamma_{ijk}$ can take any value in $(-\infty, \infty)$, with $\gamma_{ijk} = 0$ corresponding to no association. The log odds ratio, when modeled in this way with a regression model, can take different values in subgroups defined by the vector $z_{ijk}$. For example, $z_{ijk}$ can define subgroups within clusters, or it can define "block effects" between clusters. You specify a GEE model for binary data that uses log odds ratios by specifying a model for the mean, as in ordinary GEEs, and a model for the log odds ratios.
"Generalized Estimating Equations, Second Edition" by James W. Hardin
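The log odds ratio for a pair of binary responses can be computed from the joint probability and the two marginal means. The following is a minimal sketch with a hypothetical function name; it fills in the 2 x 2 table of joint probabilities and returns the log of the cross-product ratio.

```python
import math


def log_odds_ratio(p11, mu_j, mu_k):
    """Log odds ratio of two binary responses.

    p11  : Pr(Y_j = 1, Y_k = 1)
    mu_j : Pr(Y_j = 1)
    mu_k : Pr(Y_k = 1)
    """
    # Remaining cells of the 2 x 2 table follow from the margins.
    p10 = mu_j - p11
    p01 = mu_k - p11
    p00 = 1.0 - mu_j - mu_k + p11
    if min(p11, p10, p01, p00) <= 0.0:
        raise ValueError("all four cell probabilities must be positive")
    return math.log((p11 * p00) / (p10 * p01))
```

When the joint probability equals the product of the means (independence), the function returns a value of zero up to rounding, matching the statement that a log odds ratio of zero corresponds to no association.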
You can use any of the link functions appropriate for binary data in the model for the mean, such as logistic, probit, or complementary log-log. The ALR algorithm alternates between a GEE step to update the model for the mean and a logistic regression step to update the log odds ratio model. Upon convergence, the ALR algorithm provides estimates of the regression parameters for the mean, $\beta$, the regression parameters for the log odds ratios, $\alpha$, their standard errors, and their covariances. Specifying a regression model for the log odds ratio requires you to specify rows of the z-matrix for each cluster $i$ and each unique within-cluster pair $(j, k)$.
The supported keywords and the resulting log odds ratio models are described as follows. In the exchangeable model, the log odds ratio is a constant for all clusters $i$ and pairs $(j, k)$. The parameter $\alpha$ is the common log odds ratio. In the fully parameterized cluster model, each cluster is parameterized in the same way, and there is a parameter for each unique pair within clusters.
If a complete cluster is of size $n$, then there are $n(n-1)/2$ parameters in the vector $\alpha$. For example, if a full cluster is of size 4, then there are $4 \cdot 3/2 = 6$ parameters, and the z-matrix is the $6 \times 6$ identity matrix. The elements of $\alpha$ correspond to log odds ratios for cluster pairs in the following order: $(1,2)$, $(1,3)$, $(1,4)$, $(2,3)$, $(2,4)$, $(3,4)$. In the model with log odds ratios by cluster, the argument variable is a variable name that defines the "block effects" between clusters. The log odds ratios are constant within clusters, but they take a different value for each different value of the variable.
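For the model with one parameter per unique within-cluster pair, the pair ordering and the resulting z-matrix can be generated mechanically. This is a small sketch using hypothetical helper names; it assumes cluster members are numbered from 1 and relies on `itertools.combinations`, which yields pairs in lexicographic order.

```python
from itertools import combinations


def cluster_pairs(n):
    """Unique within-cluster pairs (j, k), j < k, for a complete cluster
    of size n, in the order (1,2), (1,3), ..., (n-1,n)."""
    return list(combinations(range(1, n + 1), 2))


def full_cluster_z_matrix(n):
    """z-matrix with one log odds ratio parameter per unique pair:
    row m has a 1 in column m and 0 elsewhere (an identity matrix)."""
    q = len(cluster_pairs(n))  # q = n * (n - 1) / 2 parameters
    return [[1 if col == row else 0 for col in range(q)] for row in range(q)]
```

Calling `cluster_pairs(4)` gives the six pairs for a cluster of size 4, and `full_cluster_z_matrix(4)` gives the corresponding 6 by 6 matrix in which each row selects exactly one element of the parameter vector.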
In the k-nested model, within each cluster, PROC GENMOD computes one log odds ratio parameter for pairs having the same value of variable for both members of the pair and one log odds ratio parameter for each unique combination of different values of variable. In the 1-nested model, there are two log odds ratio parameters. Pairs having the same value of variable correspond to one parameter; pairs having different values of variable correspond to the other parameter.
For example, if clusters are hospitals and subclusters are wards within hospitals, then patients within the same ward have one log odds ratio parameter, and patients from different wards have the other parameter. When you supply the full z-matrix yourself, each observation in the ZDATA data set corresponds to one row of the z-matrix. You must specify the ZDATA data set as if all clusters are complete, that is, as if all clusters are the same size and there are no missing observations. The ZDATA data set has $K\,n_{\max}(n_{\max}-1)/2$ observations, where $K$ is the number of clusters and $n_{\max}$ is the maximum cluster size.
If the members of cluster $i$ are ordered as $1, 2, \ldots, n$, then the rows of the z-matrix must be specified for pairs in the order $(1,2), (1,3), \ldots, (1,n), (2,3), \ldots, (2,n), \ldots, (n-1,n)$. If there are $q$ columns (variables in variable-list), then there are $q$ log odds ratio parameters. If you also specify variables that identify the members of each pair (variable1 and variable2), the data from the ZDATA data set are sorted within each cluster by variable1 and variable2. For a replicated z-matrix, you specify z-matrix data exactly as you do for the ZFULL case, except that you specify only one complete cluster.
The z-matrix for the one cluster is replicated for each cluster. The number of observations in the ZDATA data set is $n_{\max}(n_{\max}-1)/2$, where $n_{\max}$ is the size of a complete cluster (a cluster with no missing observations). The number of rows specified is likewise $n_{\max}(n_{\max}-1)/2$.
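The row counts and pair ordering for a user-supplied z-matrix can be checked with a short sketch. The example below builds the rows for one complete cluster and then replicates them across clusters. The tuple layout and the two indicator columns (same subcluster versus different subcluster, mirroring the hospital and ward example) are illustrative assumptions, not the actual ZDATA data set format.

```python
from itertools import combinations


def one_cluster_rows(subcluster_of_member):
    """z-matrix rows for one complete cluster.

    subcluster_of_member[j] is the subcluster label (for example, a ward)
    of member j + 1. Each row is (j, k, z_same, z_diff) for the pair
    (j, k), listed in the order (1,2), (1,3), ..., (n-1,n).
    """
    rows = []
    for j, k in combinations(range(len(subcluster_of_member)), 2):
        same = 1 if subcluster_of_member[j] == subcluster_of_member[k] else 0
        rows.append((j + 1, k + 1, same, 1 - same))
    return rows


def replicate_rows(rows, n_clusters):
    """Repeat one cluster's rows for clusters 1..n_clusters, prefixing each
    row with the cluster index, as if every cluster were complete."""
    return [(c,) + row for c in range(1, n_clusters + 1) for row in rows]
```

For a complete cluster of size 4, `one_cluster_rows` returns 6 rows; replicating them over 3 clusters gives 18 rows, in line with the counts of $n_{\max}(n_{\max}-1)/2$ rows for one cluster and $K\,n_{\max}(n_{\max}-1)/2$ rows for all clusters.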
Define the quasi-likelihood under the independence working correlation assumption, evaluated with the parameter estimates under the working correlation of interest, as $Q(\hat{\beta}(R), \phi) = \sum_{i=1}^{K} \sum_{j=1}^{n_i} Q(\hat{\beta}(R), \phi; Y_{ij})$. Pan notes that QIC is appropriate for selecting regression models and working correlations, whereas $\mathrm{QIC}_u$ is appropriate only for selecting regression models. See McCullagh and Nelder and Hardin and Hilbe for discussions of quasi-likelihood functions. The contribution of observation $j$ in cluster $i$ to the quasi-likelihood function evaluated at the regression parameters $\beta$ is given by $Q(\beta, \phi; Y_{ij}) = Q_{ij}/\phi$, where $Q_{ij}$ depends on the response distribution.
These contributions are used in the computation of the quasi-likelihood information criteria (QIC) for goodness of fit of models fit with GEEs. Note that the definition of the quasi-likelihood for the negative binomial differs from that given in McCullagh and Nelder.
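For orientation, the two criteria are usually written as follows in the literature following Pan. The notation is an assumption here, since the defining equations are not reproduced in this section: $Q(\hat{\beta}(R), \phi)$ is the quasi-likelihood under independence evaluated at the estimates obtained under working correlation $R$, $\hat{\Omega}_I$ is the inverse of the model-based covariance estimate under independence, $\hat{V}_R$ is the robust covariance estimate under $R$, and $p$ is the number of regression parameters.

```latex
\begin{aligned}
\mathrm{QIC}(R)   &= -2\,Q(\hat{\beta}(R), \phi) + 2\,\mathrm{trace}\!\left(\hat{\Omega}_I \hat{V}_R\right), \\
\mathrm{QIC}_u(R) &= -2\,Q(\hat{\beta}(R), \phi) + 2p .
\end{aligned}
```

The penalty in $\mathrm{QIC}_u$ does not involve the working correlation, which is consistent with its use for comparing regression models only.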