Summer School 2018
The sixth methods@manchester Summer School took place at The University of Manchester from 2 to 13 July 2018. The following courses were taught:
Generalized Linear Models: a comprehensive system of analysis and graphics using R and the Rcommander
2 - 6 July 2018
This is a general course in statistics based on generalized linear models and is designed to provide a relatively complete course in data analysis for postgraduate students. A number of analytical techniques are covered, including OLS and logistic regression, Poisson, proportional-odds and multinomial logit models, enabling a wide range of data to be modelled. Graphical displays are used extensively, making the task of interpretation much simpler.
A general approach is used which deals with data (coding and manipulation), the formulation of research hypotheses, the analysis process and the interpretation of results. Participants will also learn about the use of contrast coding for categorical variables, interpreting and visualising interactions, regression diagnostics, data transformation, and issues related to multicollinearity and variable selection.
The software package R is used in conjunction with R Commander and RStudio. These packages provide a simple yet powerful system for data analysis. No previous experience of using R is required for this course, nor is any previous experience of coding or using other statistical packages.
This course provides a number of practical sessions where participants are encouraged to analyse a variety of data and produce their own analyses. Analyses may be conducted on the networked computers provided, or participants may use their own computers; the initial sessions cover setting up the software on laptops (all operating systems are supported).
The main objective of this course is to provide a general method for modelling a wide range of data using regression-based techniques. Participants will be able to select, run and interpret models for continuous, ordered and unordered data using modern graphical techniques.
Afternoon - Introduction: A system of analysis; Software: R, RStudio and R Commander
Morning - Data coding, manipulation and management; defining models: representing research questions
Afternoon - Analysis: An introduction to generalized linear models; Interpretation: using effect displays
Morning - Modelling continuous data; Contrast coding: dealing with categorical explanatory variables
Afternoon - Modelling count data; Including and interpreting interactions
Morning - Modelling categories (using logit models); Modelling ordered categorical variables (proportional odds models)
Afternoon - Modelling unordered categorical variables (multinomial logit models); Exercises modelling categorical variables
Morning - Model diagnostics and data transformations (Box-Cox and Box-Tidwell); Variable selection (strategies for dealing with collinearity using limited variable models and multimodel presentations)
The course will be presented by Graeme Hutcheson.
Graeme Hutcheson is a lecturer in the Manchester Institute of Education and has published extensively in the field of regression models and the analysis of social science data.
Prior or recommended knowledge/reading/skills
There are no pre-requisites for this course as instruction is provided for all techniques. However, it will be of most use to those who are interested in modelling social science datasets (survey and quasi-experimental) and applying graphics to interpret these.
This course is suitable for PGR students, academics and researchers in all social science fields.
Agresti, A. (1996). An Introduction to Categorical Data Analysis. Wiley.
Fox, J. & Weisberg, S. (2011). An R Companion to Applied Regression (second edition). Sage Publications.
Harrell, F. E. (2001). Regression Modeling Strategies. Springer.
Hutcheson, G. & Sofroniou, N. (1999). The multivariate social scientist. Sage Publications.
Hutcheson, G. & Moutinho, L. (2008). Statistical modelling for management. Sage Publications.
2 - 6 July 2018
R is an open source programming language and software environment for performing statistical calculations and creating data visualisations. It is rapidly becoming the tool of choice for data analysts with a growing number of employers seeking candidates with R programming skills.
This course will provide you with all the tools you need to get started analysing data in R. We will introduce the tidyverse, a collection of R packages created by Hadley Wickham and others which provides an intuitive framework for using R for data analysis. Students will learn the basics of R programming and how to use R for effective data analysis. Practical examples of data analysis on social science topics will be provided.
1. R and the 'tidyverse'
This session will introduce R & RStudio and cover the basics of R programming and good coding practice. We will also discuss R packages and how to use them, with a particular focus on those that make up the 'tidyverse'. We also introduce R Markdown which will be used to report our analyses throughout the course.
2. Import and Tidy
Data scientists spend about 60% of their time cleaning and organizing data (CrowdFlower Data Science Report 2016: 6). This session will show you how to 'tidy' your data ready for analysis in R. In particular, we'll show you how to take data stored in a flat file, database, or web API, and load it into a data frame in R. We will also talk about consistent data structures, and how to achieve them.
Together with importing and tidying, transforming data is one of the key elements of data analysis. We will cover subsetting your data (to narrow your focus), creating new variables from existing ones, and calculating summary statistics.
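The three transformation steps named above (subsetting, creating a new variable, and summarising) can be sketched in a few lines. This is only an illustration in plain Python, not the tidyverse code taught on the course; the records, variable names and the helper `mean_income_by_region` are invented for the example.

```python
from statistics import mean

# Hypothetical survey records, invented for illustration only.
respondents = [
    {"id": 1, "region": "North", "income": 24000, "household_size": 2},
    {"id": 2, "region": "South", "income": 31000, "household_size": 4},
    {"id": 3, "region": "North", "income": 28000, "household_size": 1},
    {"id": 4, "region": "South", "income": 45000, "household_size": 3},
]

# Subsetting: keep only the respondents who live in the North.
north = [r for r in respondents if r["region"] == "North"]

# New variable: income per household member, derived from two existing ones.
for r in respondents:
    r["income_per_head"] = r["income"] / r["household_size"]


def mean_income_by_region(rows):
    """Summary statistic: mean income within each region."""
    groups = {}
    for r in rows:
        groups.setdefault(r["region"], []).append(r["income"])
    return {region: mean(values) for region, values in groups.items()}


summary = mean_income_by_region(respondents)
```

In this toy data set, `north` holds two respondents and `summary` gives one mean income per region.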
Data visualisation is what brings your data to life. This session will provide you with the skills and tools to create the perfect (static and interactive) visualisation for your data.
5. Bringing it all together
In this last session we review all we have learned on this course, and think about how we can bring it all together in dynamic outputs, such as interactive documents, plots, and Shiny applications.
After this course, users should be able to:
- implement the basic operations of R;
- read data in multiple forms;
- clean, manipulate, explore and visualise data in R.
The course will be taught by Dr Reka Solymosi and Dr Henry Partridge.
Dr Reka Solymosi is a lecturer in quantitative criminology at the University of Manchester in the United Kingdom. Before that she was a data analyst researching issues around transport crime and policing at Transport for London. Her research interests are around crowdsourced data collection, transport crime, and perception of crime and place. She uses R in both teaching and research, and co-runs the R at University of Manchester (RUM) group.
Dr Henry Partridge is the Manager of the Trafford Data Lab which supports decision-making in Trafford, Greater Manchester by revealing patterns in data through visualisation. Henry is currently involved in a Horizon 2020 project which promotes the use of open linked statistical data to improve the delivery of public services. Henry has strong research and analytical skills with particular expertise in R programming, data visualisation, and spatial analysis.
2 - 6 July 2018
This is an introductory course, covering the concepts, methods and data analysis techniques of social network analysis. The course is based on the book "Analyzing Social Networks" by Borgatti et al. (Sage) and all participants will be issued with a copy of the book. The course begins with a general introduction to the distinct goals and perspectives of social network analysis, followed by a practical discussion of network data, covering issues of collection, validity, visualization, and mathematical/computer representation. We then take up the methods of detection and description of structural properties, such as centrality, cohesion, subgroups and positional analysis techniques. This is a hands-on course largely based around the use of UCINET software, and will give participants experience of analyzing real social network data using the techniques covered in the workshop. No prior knowledge of social network analysis is assumed for this course.
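As a rough illustration of what a computer representation of network data and a structural measure such as centrality look like, here is a minimal sketch in plain Python. It is not UCINET and not the course material; the names, the ties and the helper functions are hypothetical, and degree centrality and density are chosen only because they are among the simplest measures of this kind.

```python
# Hypothetical undirected friendship ties between five people.
ties = [("Ann", "Bob"), ("Ann", "Cat"), ("Ann", "Dan"), ("Bob", "Cat"), ("Dan", "Eve")]


def adjacency(edges):
    """Build a node -> set-of-neighbours mapping from an undirected edge list."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return neighbours


def degree_centrality(neighbours):
    """Number of ties per node, divided by the maximum possible (n - 1)."""
    n = len(neighbours)
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def density(neighbours):
    """Share of all possible undirected ties that are actually present."""
    n = len(neighbours)
    ties_present = sum(len(adj) for adj in neighbours.values()) / 2
    return ties_present / (n * (n - 1) / 2)


network = adjacency(ties)
centrality = degree_centrality(network)
most_central = max(centrality, key=centrality.get)
```

Here Ann is tied to three of the four other people, so she has the highest degree centrality in this toy network.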
The course will:
- Introduce the idea of Social Network Analysis
- Explain how to describe and visualise networks using specialist software (UCINET)
- Explain key concepts of Social Network Analysis (e.g. Cohesion, Brokerage)
- Provide hands-on training to use software to investigate social network structure
Introduction to Social Network Analysis, terminology and the software UCINET/Netdraw. Chapters 1 and 2
Morning - Collecting social network data and research design. Chapters 3 and 4
Afternoon - Data management and visualisation. Chapters 5 and 7
Morning - Multivariate techniques and whole networks. Chapters 6 and 9.
Afternoon - Centrality and ego networks. Chapters 10 and 15.
Morning - Equivalence and core-periphery. Chapter 12
Afternoon - Subgroups and two-mode networks. Chapters 11 and 13
Morning - Testing hypotheses and large networks. Chapters 8 and 14.
Chapter numbers refer to the book "Analyzing Social Networks" by Borgatti et al. (Sage). Timetable is subject to change.
Elisa Bellotti is a Senior Lecturer at The University of Manchester. Along with being part of the Sociology department, she is a member of the Mitchell Centre for Social Network Analysis, where she organises the weekly seminar series. She teaches introductory and advanced workshops in social network analysis and egonetworks, and in mixed methods in SNA. Before arriving in Manchester in 2008, she worked as research fellow at University of Turin and University of Bozen, Italy. She completed her PhD in Sociology and Methodology of Social Research in 2006 at Catholic University of Milan. Her research interests mainly focus on relational sociology and its link with other mainstream sociological theories; and on social network analysis and mixed methods. She has taken this approach in several sociological substantive areas, such as the study of intimacy and personal relationships, sociology of science, criminal networks, inter and intra organisational ties, and sociology of consumption.
Nick Crossley is Professor of Sociology at The University of Manchester. His main work using social network analysis has focused upon music worlds, social movements and covert networks. He has also written extensively about 'relational sociology', a theoretical position which advocates a focus upon networks in sociology. His most recent book is Networks of Sound, Style and Subversion: the Punk and Post-Punk Worlds of Manchester, London, Liverpool and Sheffield, 1975-80 (Manchester University Press).
Prior or recommended knowledge/reading/skills
None required, but it would be useful to read Scott, J. (2000). Social Network Analysis: A Handbook. Sage.
Software to be used
UCINET and Netdraw. It is useful for participants to bring their own laptops running Windows (Macs will need a PC emulator) and to have downloaded the software in advance. The software can be downloaded for a free trial period from the Analytictech website.
9 - 13 July 2018
This is an introduction to the statistical analysis of networks. While no strict prerequisites are assumed, you might find it helpful to have some basic knowledge of social network analysis beforehand. To benefit fully from the course, you will need a basic knowledge of standard statistical methods, such as regression analysis. The course aims to give a basic understanding of, and a working handle on, drawing inference for structure and attributes, both cross-sectionally and longitudinally. A fundamental notion of the course will be how the structure of observed graphs relates to various forms of random graphs. This will be developed in the context of non-parametric approaches and elaborated to the analysis of networks using exponential random graph models (ERGM) and stochastic actor-oriented models (SAOM). The main focus will be on explaining structure, but an outlook to explaining individual-level outcomes will be provided.
Participants will be provided with several hands-on exercises, applying the approaches to a suite of real-world data sets. We will use the stand-alone graphical user interface package MPNet and R. In R we will learn how to use the packages ‘sna’, ‘statnet’, and ‘RSiena’. No familiarity with R is assumed but preparatory exercises will be provided ahead of the course.
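The idea of comparing an observed graph with random graphs can be sketched as a simple simulation: count a subgraph (here, triangles) in the observed network, then count it again in many random graphs with the same number of nodes and ties. The sketch below is a plain Python illustration under those assumptions, not the MPNet, statnet or RSiena workflow used on the course; the network, the function names and the choice of triangles as the statistic are all hypothetical.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Count sets of three nodes in which every pair is tied."""
    edge_set = {frozenset(e) for e in edges}
    nodes = sorted({v for e in edge_set for v in e})
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if frozenset((a, b)) in edge_set
        and frozenset((a, c)) in edge_set
        and frozenset((b, c)) in edge_set
    )


def random_graph(nodes, n_edges, rng):
    """Draw uniformly from all graphs with these nodes and this many ties."""
    return rng.sample(list(combinations(nodes, 2)), n_edges)


def triangle_test(nodes, edges, n_sims=1000, seed=1):
    """Return the observed triangle count and the share of simulated
    random graphs that contain at least as many triangles."""
    rng = random.Random(seed)
    observed = count_triangles(edges)
    simulated = [
        count_triangles(random_graph(nodes, len(edges), rng))
        for _ in range(n_sims)
    ]
    share = sum(s >= observed for s in simulated) / n_sims
    return observed, share


# Hypothetical observed network: two fully connected groups of four people,
# joined by a single tie between person 3 and person 4.
nodes = list(range(8))
observed_edges = [
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
    (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
    (3, 4),
]
observed, share = triangle_test(nodes, observed_edges)
```

A small share suggests that the observed network has more triangles than would be expected if ties formed at random, given its size and number of ties.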
Literature we will draw on includes:
Lusher, D., Koskinen, J., & Robins, G. (2013). Exponential Random Graph Models for Social Networks: Theory, Methods and Applications. Cambridge University Press, NY.
Snijders, Tom A. B., Gerhard G. van de Bunt, and Christian E.G. Steglich. 2010. “Introduction to stochastic actor-based models for network dynamics.” Social Networks 32:44-60.
MPNet can be downloaded from MelNet
The course will:
- Introduce how statistical evidence relates to social networks
- Explain how to draw inference about key network mechanisms from observations
- Provide hands-on training to use software to investigate:
  - social network structure
  - tie-formation in cross-sectional data
  - tie-formation in longitudinal data
- Take into account network dependencies between individuals
Introduction to working with networks in R
Morning – Subgraphs and null distributions and ERGM rationale
Afternoon – ERGMs and dependence
Morning – ERGM: Issues and technicalities
Afternoon – SAOM: introduction to longitudinal modelling
Morning – SAOM: introduction to longitudinal modelling
Afternoon – Extensions and further issues
Morning – Influence, contagion, and outlook to further issues.
Timetable is subject to change.
The course will be taught by Dr Johan Koskinen.
9 - 13 July 2018
(formerly Quantitative Longitudinal Data Analysis)
Longitudinal data (data collected multiple times from the same cases) is becoming increasingly popular because of the important insights it can provide. For example, it can be used to track how individuals change over time and what causes that change. It can also be used to understand causal relationships or as part of impact evaluation. Unfortunately, traditional models such as ordinary least squares regression are not appropriate, because multiple time points are nested within individuals and the observations are therefore not independent. For this reason, specialised statistical models need to be learned.
In this course, you will learn the most important skills needed to prepare and analyse longitudinal data. We will cover statistical methods used in multiple research fields such as economics, sociology, psychology, developmental studies, marketing and biology. At the end of the course, you will be able to answer a number of different types of questions using longitudinal data: questions about causality and causal order, about changes over time and what explains them, and about the occurrence of events and their timing.
Throughout the week, we will use a combination of lecturing and applied sessions. For the applied sessions, we will use the statistical package R. R is becoming one of the leading statistical software packages because it is free and open source. In this course, you will learn how to use it effectively to answer longitudinal questions. We will cover both data management and cleaning, as well as different statistical methodologies such as regression analysis, multilevel analysis, structural equation modelling and survival analysis.
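To make the structure of such data concrete, the sketch below stores repeated measurements in "long" format (one row per person per wave) and groups them by individual. It is a plain Python illustration rather than the R code used on the course; the panel, the `wellbeing` variable and the helper functions are invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format panel: one row per person per wave.
panel = [
    {"person": "A", "wave": 1, "wellbeing": 6.0},
    {"person": "A", "wave": 2, "wellbeing": 7.0},
    {"person": "A", "wave": 3, "wellbeing": 8.0},
    {"person": "B", "wave": 1, "wellbeing": 4.0},
    {"person": "B", "wave": 2, "wellbeing": 4.0},
    {"person": "B", "wave": 3, "wellbeing": 3.0},
]


def group_by_person(rows):
    """Collect each person's observations, ordered by wave."""
    grouped = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["person"], r["wave"])):
        grouped[row["person"]].append(row)
    return dict(grouped)


def change_first_to_last(rows):
    """Within-person change between the first and the last wave."""
    return {
        person: obs[-1]["wellbeing"] - obs[0]["wellbeing"]
        for person, obs in group_by_person(rows).items()
    }


def person_means(rows):
    """Each person's average across waves (between-person differences)."""
    return {
        person: mean(o["wellbeing"] for o in obs)
        for person, obs in group_by_person(rows).items()
    }


change = change_first_to_last(panel)
means = person_means(panel)
```

The three rows for each person are not independent of one another, which is the feature that the specialised models on this course are designed to handle.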
The aims of the course are:
- to gain competence in the concepts, designs and terms of longitudinal research;
- to be able to apply a range of different methods for longitudinal data analysis;
- to have a general understanding of how each method represents different kinds of longitudinal processes;
- to be able to choose a design, a plausible model and an appropriate method of analysis for a range of research questions.
Day one (half day)
Afternoon - introduction to longitudinal data; introduction to R and data cleaning.
Fixed effects and random effects.
In-depth introduction to the multilevel model of change.
The latent growth model.
Event history analysis (survival analysis).
We will cover both discrete-time event history models and Cox models.
NB This course is delivered over 4.5 days and will finish no later than 5pm on Friday. The longer duration of this course, compared to other methods@manchester Summer School courses, is reflected in the price.
This course will be presented by Dr Alexandru Cernat.
Dr Alexandru Cernat is a lecturer in Social Statistics at The University of Manchester. Previously, he was a Research Associate at the National Centre for Research Methods. He was awarded a PhD in survey methodology by the University of Essex, where he investigated data quality in longitudinal studies. His research interests cover latent variable modelling, measurement error, missing data, survey methodology, and methods for longitudinal data collection and analysis.
- Knowledge of linear regression analysis.
- No prior knowledge of R is required.
- Singer, J., & Willett, J. (2003). Applied longitudinal data analysis: modeling change and event occurrence. Oxford University Press.
- Newsom, J. T. (2015). Longitudinal Structural Equation Modeling: A Comprehensive Introduction. Routledge.
- Long, J. D. (2011). Longitudinal Data Analysis for the Behavioral Sciences Using R. Thousand Oaks, Calif: SAGE Publications, Inc.
- Wickham, H., & Grolemund, G. (2017). R for Data Science. O’Reilly UK Ltd. (available free online: http://r4ds.had.co.nz/)