methods@manchester: research methods in the social sciences

General data analysis using generalized linear models - a 3-day workshop

25, 26, 27 July 2012

Basement Computer Lab, Humanities Bridgeford Street building, University of Manchester

(Lunch and refreshments will be served in the foyer)

Speaker: Graeme Hutcheson, School of Education

Generalized Linear Models (GLMs) offer great advantages to students as they provide a relatively simple but powerful method for analyzing a great variety of data. These methods replace the outdated and piecemeal approach often taught in basic statistics courses with one based on solid theoretical foundations that can be extended to many different situations.

This course introduces the GLM as a technique for analyzing a wide variety of data, collected using a number of designs. The course explains how to describe research questions in the form of simple models that can be directly input into a statistical analysis package. These models are used to analyze regression-type problems and also to reproduce (or replace) the "standard" hypothesis tests for parametric and nonparametric data (for example, ANOVA, ANCOVA, Kruskal-Wallis, Friedman, etc.). Issues surrounding model diagnostics, transformation and variable selection are also dealt with.

Programme

Day 1: An introduction to model-based statistics and how they can be applied to a wide variety of research designs and analytical problems. The basic methods are illustrated using models for continuous data. The use of the techniques for "grouped designs" (ANOVA, ANCOVA etc.) is dealt with through the application of dummy-variable coding techniques. The lectures are complemented by exercises in the computer lab using the statistical package R and the R-commander GUI. Participants do not need any prior knowledge of R or the R-commander.

9.00 Coffee and registration (Foyer)
9.30 GLMs: an introduction to analysis
11.00 Coffee
11.30 Computer exercise: R and Rcmdr
1.00 Lunch
2.00 GLMs and OLS regression
3.30 Tea
4.00 Computer exercise: regression
5.30 End of day one
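
As a rough illustration of the Day 1 idea that a grouped design can be analysed as a regression with dummy-coded variables, the sketch below fits an ordinary least-squares model to a three-group outcome. It is not part of the course materials (the workshop itself uses R and the R-commander); the data, the function names and the choice of Python are all assumptions made for this example. With group "a" as the reference category, the intercept recovers the mean of group "a" and each dummy coefficient recovers the difference between a group mean and that reference mean, which is the comparison a one-way ANOVA is built on.

```python
# Hypothetical data: an outcome score measured in three groups.
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
scores = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1.0, 2.0, 3.0]


def dummy_code(labels, reference):
    """Return the non-reference levels and a design matrix with an intercept column."""
    levels = sorted(set(labels) - {reference})
    rows = [[1.0] + [1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]
    return levels, rows


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(x_rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Group "a" is the reference category; "b" and "c" each get one dummy variable.
levels, design = dummy_code(groups, reference="a")
coefficients = ols(design, scores)

for name, value in zip(["intercept"] + levels, coefficients):
    print(f"{name}: {value:.3f}")
```

Changing the reference category changes which comparisons the coefficients express, but not the fitted group means.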

Day 2: This day deals with extending the models for continuous data to categorical data. The use of logits is explained using binary categorical variables and this is then extended to multi-category ordered and unordered variables. The use of the techniques for "grouped designs" (Mann-Whitney, Kruskal-Wallis, Friedman etc.) is dealt with through the application of dummy-variable coding techniques. These sessions are complemented by exercises in the computer lab using the statistical package R and the R-commander GUI.

9.30 Analysing categorical data: logit models
11.00 Coffee
11.30 Computer exercise: logistic regression
12.00 Lunch
1.00 Computer exercise: logistic regression (continued)
2.00 Modelling ordered and unordered data
3.30 Tea
4.00 Computer exercise: PO and MNL models
5.30 End of day two
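
To give a flavour of the logit scale used in the Day 2 sessions, the sketch below works through a binary outcome with a single two-level predictor. It is an illustration only, not course material: the counts, the group names and the use of Python (rather than R) are assumptions made for this example. For this simple case the logit-model coefficients can be read straight from the observed proportions: the intercept is the log odds in the reference group and the slope is the log of the odds ratio between the two groups.

```python
import math


def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical counts: number of successes out of 50 in each of two groups.
successes = {"control": 20, "treated": 30}
totals = {"control": 50, "treated": 50}

p_control = successes["control"] / totals["control"]
p_treated = successes["treated"] / totals["treated"]

# With "control" as the reference group, a logit model with one dummy
# variable for "treated" has these coefficients:
intercept = logit(p_control)          # log odds of success in the control group
slope = logit(p_treated) - intercept  # log odds ratio, treated versus control

odds_ratio = math.exp(slope)
fitted_treated = inv_logit(intercept + slope)

print(f"intercept (log odds, control): {intercept:.3f}")
print(f"slope (log odds ratio): {slope:.3f}")
print(f"odds ratio: {odds_ratio:.3f}")
print(f"fitted probability, treated: {fitted_treated:.3f}")
```

Working on the logit scale keeps the model linear in its coefficients, while `inv_logit` converts any fitted value back to a probability between 0 and 1.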

Day 3: This day deals with methods for checking model assumptions using diagnostics and with ways in which models might be improved using data transformation techniques. The problem of variable selection is also dealt with, and a procedure using multiple models selected from a restricted set of candidates is proposed to address some of the problems associated with multicollinear datasets. The course finishes with a discussion where participants are encouraged to reflect on their use of statistics and identify areas that may require further work.

9.30 Model diagnostics and data transformation
11.00 Coffee
11.30 Computer exercise: diagnostics/transformations
1.00 Lunch
2.00 Model selection
3.30 Tea
4.00 Conclusion and discussion
5.30 End

A full set of documentation (including exercises and datasets) is now available.