Big Behavioural Data - PSY00123M
- Department: Psychology
- Credit value: 20 credits
- Credit level: M
- Academic year of delivery: 2025-26
Module summary
When an individual uses social media, plays a video game, or makes an internet search, a digital trace of this interaction is created.
Some of this data is locked away by gatekeeper corporations and guarded tightly in proprietary data lakes and warehouses. However, other data remains accessible through open-source intelligence (OSINT) and data donation. Such ‘big’ data can be used to generate transformative knowledge in both academic and policy contexts.
This module aims to develop skills in understanding and utilising this large-scale secondary human data. This includes employing both traditional processes of statistical inference and more recent AI-powered analytic techniques such as in-context learning.
Module will run
Occurrence | Teaching period |
---|---|
A | Semester 2 2025-26 |
Module aims
Students will learn how to work with a variety of kinds of structured and unstructured large-scale human data drawn from digital environments.
They will learn how to wrangle large and complex human datasets in ways that make them tractable for useful analysis. Students will gain experience of both traditional statistical modelling techniques and the application of recent developments in generative AI to obtain insights from large-scale data.
The module will culminate in an open assessment in which students will use a real big human dataset to answer a real research question.
Module learning outcomes
- Critically evaluate the use of large-scale behavioural datasets when addressing a research question.
- Select and synthesise large-scale behavioural datasets in order to address a research question.
- Manipulate large human datasets to create novel data features that are relevant to a question of interest.
- Determine an analytic strategy to answer a research question using big human data.
- Evaluate how successful an analytical approach has been in addressing a research question using big human data.
- Use relevant statistical and/or AI techniques, together with the R programming language, to analyse large-scale secondary human data.
- Identify and mitigate the impact of bias and error in the analysis of large human datasets.
Module content
- Data Transformation and AI: Techniques for cleaning, transforming, and structuring large datasets to make them suitable for analysis. Reliably transforming unstructured / inappropriate data using AI. Data quality issues at scale.
- Big Data Infrastructure: Overview of the technologies and platforms that enable the capture of big data, including open-source intelligence and data donation.
- Modelling Big Data: Introduction to various models and algorithms that are particularly useful in the context of large datasets.
- Practical Application of R for Big Data: Hands-on training in using the R programming language, along with relevant packages, for manipulation, visualisation, and analysis of large-scale human data.
Indicative assessment
Task | % of module mark |
---|---|
Online Exam - less than 24hrs (Centrally scheduled) | 100 |
Special assessment rules
None
Indicative reassessment
Task | % of module mark |
---|---|
Online Exam - less than 24hrs (Centrally scheduled) | 100 |
Module feedback
Marks will be available on e:vision.
Indicative reading
Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science. O'Reilly Media, Inc.
El-Nasr, M. S., Nguyen, T. H. D., Canossa, A., & Drachen, A. (2021). Game data science. Oxford University Press.