General data analysis using generalized linear models - a 3-day workshop
25, 26, 27 July 2012
Basement Computer Lab, Humanities Bridgeford Street building, University of Manchester
(Lunch and refreshments will be served in the foyer)
Speaker: Graeme Hutcheson, School of Education
Generalized Linear Models (GLMs) offer great advantages to students as they provide a relatively simple but powerful method for analyzing a great variety of data. These methods replace the outdated and piecemeal approach often taught in basic statistics courses with one based on solid theoretical foundations that can be extended to many different situations.
This course introduces the GLM as a technique for analyzing a wide variety of data collected using a number of designs. The course explains how to describe research questions in the form of simple models that can be directly input into a statistical analysis package. These models are used to analyze regression-type problems and also to reproduce (or replace) the "standard" hypothesis tests for parametric and nonparametric data (for example, ANOVA, ANCOVA, Kruskal-Wallis, Friedman, etc.). Issues surrounding model diagnostics, transformation and variable selection are also dealt with.
Programme
Day 1: An introduction to model-based statistics and how they can be applied to a wide variety of research designs and analytical problems. The basic methods are illustrated using models for continuous data. The use of the techniques for "grouped designs" (ANOVA, ANCOVA etc.) is dealt with through the application of dummy-variable coding techniques. The lectures are complemented by exercises in the computer lab using the statistical package R and the R-commander GUI. Users do not need any prior knowledge of R or the R-commander.
9.00 Coffee and registration (Foyer)
9.30 GLMs: an introduction to analysis
11.00 Coffee
11.30 Computer exercise: R and Rcmdr
1.00 Lunch
2.00 GLMs and OLS regression
3.30 Tea
4.00 Computer exercise: regression
5.30 End of day one
Day 2: This day deals with extending the models for continuous data to categorical data. The use of logits is explained using binary categorical variables and this is then extended to multi-category ordered and unordered variables. The use of the techniques for "grouped designs" (Mann-Whitney, Kruskal-Wallis, Friedman etc.) is dealt with through the application of dummy-variable coding techniques. These sessions are complemented by exercises in the computer lab using the statistical package R and the R-commander GUI.
9.30 Analysing categorical data: logit models
11.00 Coffee
11.30 Computer exercise: logistic regression
12.00 Lunch
1.00 Computer exercise: logistic regression (continued)
2.00 Modelling ordered and unordered data
3.30 Tea
4.00 Computer exercise: PO and MNL models
5.30 End of day two
Day 3: This day deals with methods for checking model assumptions using diagnostics and with ways in which models might be improved using data transformation techniques. The problem of variable selection is also dealt with, and a procedure using multiple models selected from a restricted set of candidates is proposed to address some of the problems associated with multicollinear datasets. The course finishes with a discussion where participants are encouraged to reflect on their use of statistics and identify areas that may require further work.
9.30 Model diagnostics and data transformation
11.00 Coffee
11.30 Computer exercise: diagnostics/transformations
1.00 Lunch
2.00 Model selection
3.30 Tea
4.00 Conclusion and discussion
5.30 End
A full set of documentation (including exercises and datasets) is now available.