English Corpus Linguistics - LAN00032H
Module summary
Corpus linguistics uses real-world data for the analysis of language. In this module you'll learn how to use a corpus (a body of texts) to discover patterns of usage in English and to investigate variation and change. Assuming no prior knowledge, we'll teach the valuable skills you need to use standard text analysis software, check your own ideas and discover patterns yourself.
Related modules
Additional information
With respect to pre-requisites, the following modules are equivalent.
First year modules
- Introduction to Syntax, Morphology and Syntax, and Syntactic Structures
Module will run
Occurrence | Teaching period |
---|---|
A | Semester 1 2025-26 |
Module aims
The aim of this module is to introduce you to corpus linguistics and the use of corpora in studying the English language. The first half of the spring term will introduce the theory and practice of corpus linguistics, and the second half will explore how corpora are currently used in linguistic research on English. The summer term is devoted to workshops related to individual project work. This module is largely practical and skills-driven, and the assessment is designed to test your skills in accessing the primary literature, data collection and analysis, descriptive adequacy, critical thinking, argumentation, and written presentation.
Module learning outcomes
On completion of this module a student should be able to:
- understand and discuss the main issues and methodologies of corpus linguistics
- understand the role of corpus data in linguistic research
- carry out linguistic investigations using a variety of corpora and corpus methodologies
- perform simple statistical tests
A student will develop competence in the following skills:
- recognising and explaining complex patterns in linguistic data
- forming valid generalisations about language from corpus data
- expressing grammatical concepts clearly and concisely
- designing and carrying out a small research project using corpus data
- summarising and presenting findings in a style appropriate to the norms of the discipline
- understanding and applying basic statistical concepts relevant to linguistic analysis
Employability skills:
This module will particularly allow a student to develop IT and numeracy skills. A student will learn:
- to extract linguistic data from large electronic corpora
- to organise, manipulate, analyse and quantify this data electronically in order to investigate questions about real-world language use
- to use some simple statistics in order to validate results
Module content
This module introduces the concept of the corpus as an authentic language sample. We consider how corpora are designed and put together and how information about linguistic structure (annotation) is added to a corpus. Students develop knowledge of the questions that corpus linguistics helps us to answer, and as part of this we deepen our understanding of variables and hypotheses, including how to test them.
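As a rough illustration of what testing a hypothesis against corpus data can look like, the sketch below compares how often one word occurs in two corpora. It is not taken from the module materials: the two toy "corpora", the crude tokenizer and all function names are assumptions made for the example, and the statistic shown is the simplified two-term form of log-likelihood that is often used for corpus comparison.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequency(count, corpus_size, per=1000):
    """Normalise a raw count to a rate per `per` tokens, so corpora of different sizes can be compared."""
    return count / corpus_size * per


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-term log-likelihood for one word's frequency in corpus A versus corpus B."""
    total = count_a + count_b
    # Expected counts if the word were equally frequent in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    score = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical toy corpora; a real study would use far larger samples.
corpus_a = "I shall go and we shall see what shall be done"
corpus_b = "I will go and we will see what will be done and it will be fine"

tokens_a, tokens_b = tokenize(corpus_a), tokenize(corpus_b)
counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)

word = "shall"
print(word, "per 1,000 tokens in A:", relative_frequency(counts_a[word], len(tokens_a)))
print(word, "per 1,000 tokens in B:", relative_frequency(counts_b[word], len(tokens_b)))
print("log-likelihood:", log_likelihood(counts_a[word], len(tokens_a), counts_b[word], len(tokens_b)))
```

Here the variable is the choice of word and the hypothesis is that its rate of use differs between the two samples. A higher log-likelihood value indicates a larger departure from equal relative frequency; with samples this small the result is purely illustrative.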
Indicative assessment
Task | % of module mark |
---|---|
Essay/coursework | 100 |
Special assessment rules
None
Indicative reassessment
Task | % of module mark |
---|---|
Essay/coursework | 100 |
Module feedback
Feedback on the first formative is given by the end of week 7, on the second formative by the end of week 10, and on the third formative by the end of week 11. Feedback on the summative is given within the University-mandated turnaround time.
Indicative reading
Biber, D. (1993). Representativeness in Corpus Design. Literary and Linguistic Computing, 8(4).
Gries, S. Th. (2009). What is Corpus Linguistics? Language and Linguistics Compass, 3(5), 1225–1241. https://doi.org/10.1111/j.1749-818X.2009.00149.x
McEnery, T., & Hardie, A. (2011). Corpus Linguistics. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511981395.002
Taylor, C. (2007). What is Corpus Linguistics? What the data says. ICAME Journal, 32, 179–200.