For many humanities scholars, digital approaches continue to seem antithetical to the nuanced close reading of individual texts that characterises the discipline. Literary scholars have expressed particular anxieties about what is lost when digital approaches are applied to textual analysis; they have worried – not without justification – that when we focus on the corpus rather than the individual text, we lose fundamental understandings of how literature works and, perhaps more seriously, how it feels. A danger with using a technology like Geographical Information Systems (GIS) to read the spatial data provided by written texts is that it risks diminishing the significance of the individual text in favour of large corpora, or that it privileges distant reading over close. What this paper will explore, however, is how a geohumanist approach to GIS might be used productively to negotiate a multi-scalar analysis of literary data. It offers an overview of the challenges faced in using GIS for literary study, and explains how the ‘Geospatial Innovation’ project has tried to address some of these issues.
“You’ve got much more media now. On your iPhone, on your Facebook you know”: The Future of Using Oral History to Understand the Cold War in Britain.
This paper seeks to explore the future of using oral history to understand the Cold War in Britain in the present day. It will explore a vast array of topics within the digital humanities, considering how modern technological developments such as the internet, social media, and video gaming (Virtual Reality (VR) in particular) have changed, and will continue to change, how we understand and consider the past when listening to people and their memories. The paper shall also consider how technology has both advanced and hindered the use of oral history, through computing advancements on the one hand and privacy anxieties on the other. Because the Cold War is still the recent past, many people who lived through it now also use many new technologies, which have altered and shifted their memory recall and the way they present information and knowledge in an oral history interview.
Thus, this paper shall argue that when we conduct and use oral history interviews, historians must consider the new channels of knowledge that exist in the 21st century and the ways in which these inform and influence the interview. Through Google, participants are able to ‘fact check’ themselves; through VR, participants are able to visit places and experience the Cold War even though they weren’t physically present; and through new privacy anxieties, the processes and ethics of oral history must change to ensure the protection and confidentiality of participants. In short, this paper shall cover a broad scope of topics to discuss the future of oral history interviewing in the present day, using the Cold War as a historical case study.
What technologies should we consider when utilising oral history as a methodology? What is the future of oral history in 2018? How can we understand perceptions and memories of the British Cold War in the present day, and what digital cultures have influenced these?
Tens of thousands of literary works are available in digital formats on public domain websites such as Project Gutenberg. By using linguistic tagging software, every word of a text can be marked for items such as part-of-speech and semantic meaning. This metadata can be useful in the quantitative analysis of literary texts to identify elements such as genre and register, frequent keywords and sentiment analysis. The systematic identification of language and stylistic features in such a collection can provide for a richer analysis of linguistic and stylistic patterns in heritage works. Additionally, marked literary texts can be used in various disciplines such as historical linguistics and second-language learning.
This paper will explore a methodological approach to creating a literary heritage corpus of prose fiction. To illustrate the practical application of corpus-based analysis, a 9.7 million-word corpus of 95 English novels published between 1911 and 1928 will be presented. Three linguistic tagging software packages will be discussed to explain how the descriptive metadata output can be used in investigating variability amongst a collection of heritage literary works.
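At its simplest, the quantitative analysis such a corpus supports starts from token frequencies. The sketch below is an illustrative toy, not one of the tagging packages discussed in the paper (those add part-of-speech and semantic tags); it merely shows the kind of keyword counting that tagged metadata enables at scale.

```python
# Toy keyword-frequency counter for a heritage text. Real projects
# would use a dedicated tagger; the stopword list here is a small
# illustrative assumption, not a standard resource.
import re
from collections import Counter

def keyword_frequencies(text,
                        stopwords=frozenset({"the", "a", "an", "and",
                                             "of", "to", "in"})):
    """Count content-word tokens in a text, ignoring stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)
```

Run over a 9.7 million-word corpus, counts like these feed directly into keyword, genre and register comparisons between novels.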
Natalie Hall - Social media as more than a source of big data: Participant-based ethnography on Facebook
The internet, and social media in particular, is increasingly being utilised in social science research. Many studies have used publicly available Twitter data, due to the relative ease of data collection and freedom from ethical constraints. However, a focus on Twitter is problematic given its small and unrepresentative user base, in terms of age, political leanings and social class, compared to more popular platforms like Facebook. Moreover, social scientists have recognised the embeddedness of web-based technology in daily life, leading to the need to treat these technologies not simply as media for data collection, but as objects of research themselves. Content analysis of tweets, for example, not only fails to capture social media use more broadly, but answers only the question of ‘what’ and not ‘why’. Thus my research takes a more in-depth qualitative approach to answer the question of how Brexiteers in northwest England use Facebook and other forms of social media to consume, produce and circulate political information, and how this interacts with their offline views and their broader social world. This presentation will outline the original methodology I developed to this end, as well as discuss the myriad practical and ethical issues that arose throughout this process.
The digital world has been a wilderness in regard to ethical methodologies, including studies of the dating app Grindr. Studies of Grindr often recruit through the app itself or use profiles as a source of analysis. Addressing LGBT+ identity within the context of smartphone technologies presents unique challenges around outing, anonymity, and consent, particularly for those in homophobic regions.
Grindr’s geolocative features make it a unique tool for tourists to interact with local LGBT+ people and spaces. The project discussed examines how Grindr reconfigures practices of space specifically within tourist-local interactions in Tel Aviv. It employs a multi-method qualitative sociological approach. 20 self-selected tourists and locals in Tel Aviv were interviewed. Prior to the interview, some chose to complete audio diaries recording their daily Grindr practices. Participants were recruited using snowball sampling with multiple entry points: online in public forums, email, and via posters around Tel Aviv.
This work addresses the ethical and methodological challenges of studying Grindr, especially in Tel Aviv. What is limited or lost when using “conservative” methods to study dating app technologies? What is gained? The ongoing investigation speaks to Grindr’s potential as a digital fieldwork site with alternative boundaries and regimes, but also alternative possibilities.
Political astroturfing, i.e. hidden propaganda on social media, has the potential to influence electoral outcomes and other forms of political behavior. A prime example is the alleged Russian interference in the 2016 U.S. presidential election. A common tool in such campaigns is the automated account (bot), used to flood platforms either in an attempt to boost the popularity of candidates or to discredit others. Although bots are becoming increasingly sophisticated, they are still an easy target for carefully devised detection algorithms. An often neglected factor, though, is the human component in these campaigns: paid agents pretending to be ordinary citizens yet posting in favor of an astroturfing campaign. Common methods for bot detection mostly fail to uncover these accounts due to their apparently human-like nature. In this talk, I present some methodological advancements in detecting such accounts. The tools are used to analyze one of the first large-scale astroturfing campaigns, during the South Korean presidential election in 2012.
Improving data reliability of lengthy online questionnaires through planned missing data designs: An example from personality psychology
Utilising online questionnaires is an efficient and low-cost solution for data collection, but sufficient sample size and accurate instruments are important for producing reliable results. Sufficient sample sizes reduce sampling error and accurate instruments reduce measurement error. In personality psychology, making instruments reliable often translates into lengthy questionnaires; therefore, the problem lies in finding a very large number of respondents who will respond to a very long questionnaire. Long questionnaires can be daunting to participants and may create unreliability in the responses.
Reducing questionnaire length while continuing to measure all variables of interest is the classic application for a planned missing data design. Many researchers tend to consider missing data as a nuisance to be avoided. However, using a planned missing data design reduces the number of items presented to each respondent; this improves response rates and reliability in the data. A properly planned missing data design will meet the statistical assumptions required for advanced statistical analysis.
An example of a 3-form planned missing data design will be presented, from a single study investigating 71 personality traits (measured by 213 items) and 9 job performance outcomes (measured by 40 items). Implications for reverse-engineering existing datasets are also discussed.
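The logic of a 3-form design can be sketched in a few lines of code. This is an illustrative sketch only, not the design used in the study above: items are split into a common block X and rotating blocks A, B and C, and each form omits exactly one rotating block, so every respondent answers a shorter questionnaire while every item is still seen by two of the three forms.

```python
import random

def three_form_design(items, seed=0):
    """Illustrative 3-form planned missing data design.

    Items are split into a common block X plus blocks A, B, C.
    Form 1 = X+A+B, Form 2 = X+A+C, Form 3 = X+B+C: each respondent
    skips one block, yet every item is answered by two of the three
    forms. The one-quarter block sizes are a common convention, not
    a requirement.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    quarter = len(shuffled) // 4
    X = shuffled[:quarter]               # common block, seen by everyone
    A = shuffled[quarter:2 * quarter]
    B = shuffled[2 * quarter:3 * quarter]
    C = shuffled[3 * quarter:]
    return [X + A + B, X + A + C, X + B + C]
```

Because the missingness is determined by form assignment rather than by respondent behaviour, it is missing completely at random by design, which is what licenses the standard maximum-likelihood and multiple-imputation analyses.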
Positive deviance is a growing approach in international development that identifies those within a population who are outperforming their peers in some way; e.g. children in low-income families who are well-nourished when those around them are not. Analysing and then disseminating the behaviours and other factors underpinning positive deviance is demonstrably effective in delivering development results. However, positive deviance faces a number of challenges that are restricting its diffusion. In this research, I explore the potential for big data to address these challenges and evaluate the promise of “big data-based positive deviance”: this would analyse typical sources of big data in developing countries – mobile phone records, social media, satellite imaging, sensor data, etc. – along with other sources of data to identify both positive deviants and the factors underpinning their superior performance.
Big data definitions are often given in terms of words beginning with ‘V’ – Google them if you must! A more pragmatic approach would be to say that big data is any data which is too large to conveniently process on your desktop machine. So, how do we stop the talk title being an oxymoron? We cheat – in two different ways (always have a plan B!). Firstly, we will find ways to cut the big data down to size and, if that is not enough, we will make our desktop look bigger than it really is.
In the process of doing this we will look at sources of big data and how you might go about collecting it; different storage formats commonly used for structured and unstructured data, as well as some specific big data storage formats; big data processing environments and their advantages over a simple desktop; and finally some of the software tools available for processing big data.
The overall aim is to demonstrate that processing big data can in fact be very similar to processing ‘small’ data.
My research utilises a corpus of 1,097,756 posts retrieved from Popheads, a Reddit-based online music community, to analyse the innovation and diffusion of words through digital social networks. However, how can one efficiently identify the words which are diffusing through a network of this magnitude?
This investigation tackles this dilemma by employing methodology pioneered by Grieve et al. (2017). The method involves calculating the daily relative frequencies (normalised per hundred words) of each of the 150,000 unique words in the corpus. The daily frequency of each word is correlated with the passage of time, and the Spearman rank correlation coefficient test is subsequently used to determine whether a word has significantly increased in frequency over the lifespan of the network. Using this method, I have identified over twenty words which are successfully diffusing through Popheads, including bop (‘a catchy song’), salty (‘rude’), and tea (‘gossip’). The method is also adapted to calculate the speed and the extent to which each word has diffused into the vocabularies of individual community members.
This method is adaptable to any time-stamped dataset and could be used to identify emerging trends more generally when applied in other disciplines.
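The core of the Grieve et al.-style trend test is straightforward to sketch. The following is an illustrative implementation, not code from the study: for each word, compute its per-day relative frequency (per hundred words), then test whether that series rises significantly with time using the Spearman rank correlation.

```python
# Sketch of a Spearman-based lexical trend detector for a time-stamped
# corpus. Function and parameter names are illustrative assumptions.
from collections import Counter, defaultdict
from scipy.stats import spearmanr

def rising_words(posts, alpha=0.01, min_days=30):
    """posts: iterable of (day_index, list_of_tokens) pairs.
    Returns (word, rho) pairs for words whose relative frequency
    increases significantly over time, sorted by correlation strength."""
    day_totals = Counter()            # total words per day
    day_word = defaultdict(Counter)   # per-day counts for each word
    for day, tokens in posts:
        day_totals[day] += len(tokens)
        day_word[day].update(tokens)

    days = sorted(day_totals)
    if len(days) < min_days:          # too short a series to test
        return []

    vocab = set().union(*(day_word[d] for d in days))
    rising = []
    for w in vocab:
        # relative frequency per hundred words, one value per day
        freqs = [100.0 * day_word[d][w] / day_totals[d] for d in days]
        rho, p = spearmanr(days, freqs)
        if rho > 0 and p < alpha:     # significant upward trend
            rising.append((w, rho))
    return sorted(rising, key=lambda t: -t[1])
```

Because the test only needs a timestamp and a token stream, the same few lines transfer directly to any time-stamped dataset, as the abstract notes.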
Netnography is a research method used in the online world and is necessary for understanding the social life of the digital age. The term was coined by Robert Kozinets after the revolution of the internet and, more generally, the digital world. Researching, collecting data and participating in digital worlds creates a chance for social scientists to observe and be members of the field themselves. This study is about musicians, employers/employees of the entertainment industry, and audience behaviour in the digital age. It explores the relationships between them in terms of social capital principles such as trust, socialisation and reciprocity. An online survey was distributed to 200 audience members to understand their user habits on social media. I am also a musician with 3,000 social media followers and I have collected further data from my own fan page to compare the results. In this presentation, I reflect on my insider status as a researcher.
Settings where people have to operate in extreme, demanding, and high-risk environments pose challenges for researchers. Hazardous occupational and recreational environments are not uncommon, and include those experienced by the military, Search and Rescue operators, anti-poaching patrols, deep sea saturation divers, humanitarian response teams, and expeditioners. In these contexts, people are exposed to a combination of environmental, psychological, and interpersonal stressors that are rarely present in mainstream life, including physical danger, inhospitable climates, monotony, lack of privacy, and limited social contact. Studies that examine how people adapt and cope with these stressors are critical for informing evidence-based decisions and understanding how to support the safety, performance and health of personnel. Despite the potential value, researchers have consistently faced obstacles when collecting and analysing psychological data from people in extreme contexts. In this presentation, we will discuss how digital approaches to the collection, analysis and extrapolation of data may help overcome the challenges faced by extreme environment researchers. Examples will be drawn from an existing research programme that applies innovative diary-study methods and contemporary statistical approaches to understand the situational experiences of people exposed to extreme stress.
Introduction. Questionnaire-based assessments are associated with challenges such as recall bias and low response and retention rates. Real-time, mobile-based approaches in behavioural, psychological and social research can present a new way forward. One area that is difficult to assess with traditional instruments, and where mobile-based approaches can be valuable, is bedtime routines, which are associated with child wellbeing and development.
Methods. An interactive, user-friendly, real time text-survey assessment of bedtime routines was developed and administered to 50 families with preschool age children. The assessment was delivered for 5 consecutive nights and it involved open-ended and closed questions about that night’s routine. Anonymised feedback, response and retention rates and other insight information were collected.
Results. The text survey was perceived positively, with an average score of 4.5 out of 5 for overall experience. There was an overall response rate of 87%, much higher than conventional questionnaire-based assessments. Finally, retention rates were good, with every participant replying to the text survey on at least 3 out of 5 nights, resulting in an average of 40 unique data points per participant throughout the duration of the study. There were no dropouts during the study. Conclusion. The text-survey assessment delivered to participants’ mobile phones was successful for assessing bedtime routines; it was perceived positively and caused minimal disruption.
As digital technologies become increasingly entangled with society and space, digital methods for researching social lives, spatial processes and contemporary cultures are growing in popularity. These include social media methods, digital ethnography, data science, geographic information science and digital humanities research. Yet the data generated through digital methods often interfaces with, supplements or complements the critical concerns of well-established non-digital methods such as ethnography or interviews, visual methods and textual analysis. This dialogue between different kinds of data, methods and tools can be at once a productive tool for critical research and difficult to navigate into a coherent analysis.
This presentation discusses the capacities and pitfalls of combining digital and non-digital data for social and spatial research. It explores some of the techniques and tools that can be used to cross the digital/non-digital divide. It considers how the interrelationship between theory and method in research design and analysis is of paramount importance, and how experimentation can be a useful tool for charting paths across the divide. Finally, it discusses different strategies for presenting mixed data in research, including textual and visual strategies for explanation.
Sofia Eleftheriadou - 'Measuring students' collaborative problem solving skills: a computer-based approach'
Students’ cognitive and non-cognitive skills are being assessed – formally and informally – every day at local, national and international levels. The Programme for International Student Assessment (PISA) survey measures the performance of about half a million 15-year-old students in science, maths and reading from about 70 countries and economies worldwide. In 2015, taking advantage of the rapid development of digital technologies, PISA developed an assessment of collaborative problem solving (CPS) skills for the first time. In contrast to paper-and-pencil tests, this is an online, computer-based assessment in which students face problem-solving scenarios where they have to collaborate with other students. The latter are not human but computer agents pre-programmed to simulate human interaction. This study aims to explore, using a mixed-methods approach, how reliable and valid the PISA CPS assessment is, how it is associated with students’ performance in science, maths and reading, and how students actually complete this assessment.