English Corpus Linguistics - LAN00032H

Department: Language and Linguistic Science
Credit value: 20 credits
Credit level: H
Academic year of delivery: 2025-26

Module summary

Corpus linguistics uses real-world data for the analysis of language. In this module you'll learn how to use a corpus (a body of texts) to discover patterns of usage in English and to investigate variation and change. Assuming no prior knowledge, we'll teach you the skills you need to use standard text analysis software, check your own ideas and discover patterns yourself.

Related modules

Pre-requisite modules

Syntactic Structures (LAN00011C)

Additional information

With respect to pre-requisites, the following modules are equivalent:

First year modules

Introduction to Syntax, Morphology and Syntax, and Syntactic Structures

Module will run

Occurrence	Teaching period
A	Semester 1 2025-26

Module aims

The aim of this module is to introduce you to corpus linguistics and the use of corpora in studying the English language. The first half of the spring term will introduce the theory and practice of corpus linguistics, and the second half will explore how corpora are currently used in linguistic research on English. The summer term is devoted to workshops related to individual project work. This module is largely practical and skills-driven, and the assessment is designed to test your skills in accessing the primary literature, data collection and analysis, descriptive adequacy, critical thinking, argumentation, and written presentation.

Module learning outcomes

On completion of this module a student should be able to:

- understand and discuss the main issues and methodologies of corpus linguistics
- understand the role of corpus data in linguistic research
- carry out linguistic investigations using a variety of corpora and corpus methodologies
- perform simple statistical tests

A student will develop competence in the following skills:

- recognising and explaining complex patterns in linguistic data
- forming valid generalisations about language from corpus data
- expressing grammatical concepts clearly and concisely
- designing and carrying out a small research project using corpus data
- summarising and presenting findings in a style appropriate to the norms of the discipline
- understanding and applying basic statistical concepts relevant to linguistic analysis

Employability skills:

This module will allow a student to develop IT and numeracy skills in particular:
- to extract linguistic data from large electronic corpora
- to organise, manipulate, analyse and quantify this data electronically in order to investigate questions about real-world language use
- to learn some simple statistics in order to validate results
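
As a rough illustration of what extracting and quantifying corpus data can involve, the sketch below counts word frequencies in a tiny invented text and normalises a count to a rate per million words. The sample text and the helper names (tokenize, per_million) are assumptions made for this example; they are not taken from the module materials or from any particular corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def per_million(count, corpus_size):
    """Normalise a raw count to a frequency per million words."""
    return count / corpus_size * 1_000_000


# Toy "corpus" invented for illustration; a real corpus would be far larger.
sample = "The cat sat on the mat. The dog sat on the log."

tokens = tokenize(sample)   # list of word tokens
counts = Counter(tokens)    # raw frequency of each word form

# Raw and normalised frequency of the word "the".
the_raw = counts["the"]
the_pm = per_million(the_raw, len(tokens))
print(the_raw, round(the_pm))
```

Normalised frequencies of this kind make counts comparable across corpora of different sizes, which raw counts alone do not.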

Module content

This module introduces the concept of the corpus as an authentic language sample. We consider how corpora are designed and put together and how information about linguistic structure (annotation) is added to a corpus. Students develop knowledge of the questions that corpus linguistics helps us to answer, and as part of this we deepen our understanding of variables and hypotheses, including how to test them.
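
To give a sense of what testing a hypothesis with corpus data can look like, here is a minimal sketch of a Pearson chi-square test on a 2x2 table, comparing how often a word occurs in two corpora. The counts are hypothetical, the function name chi_square_2x2 is an assumption for this example, and the module itself may teach different tests or software.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    No continuity correction is applied.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: the word occurs 30 times in corpus A and 10 times
# in corpus B, and each corpus contains 1,000 tokens in total.
word_a, other_a = 30, 970
word_b, other_b = 10, 990

chi2 = chi_square_2x2(word_a, other_a, word_b, other_b)

# 3.841 is the critical value for p = 0.05 with one degree of freedom.
significant = chi2 > 3.841
print(round(chi2, 3), significant)
```

With these invented counts the statistic exceeds the critical value, so the null hypothesis that the word is equally frequent in both corpora would be rejected at the 5% level.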

Indicative assessment

Task	% of module mark
Essay/coursework	100.0

Special assessment rules

None

Indicative reassessment

Task	% of module mark
Essay/coursework	100.0

Module feedback

Feedback on the first formative is given by the end of week 7, on the second by the end of week 10, and on the third by the end of week 11. Feedback on the summative is given within the University-mandated turnaround time.

Indicative reading

Biber, D. (1993). Representativeness in Corpus Design. Literary and Linguistic Computing, 8(4).

Gries, S. Th. (2009). What is Corpus Linguistics? Language and Linguistics Compass, 3(5), 1225–1241. https://doi.org/10.1111/j.1749-818X.2009.00149.x

McEnery, T., & Hardie, A. (2011). Corpus Linguistics. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511981395.002

Taylor, C. (2007). What is Corpus Linguistics? What the data says. ICAME Journal, 32, 179–200.