Centre for Linguistic History and Diversity

The Centre for Linguistic History and Diversity is an international research centre. Our aim is to understand current linguistic diversity, history and pre-history by focusing on:

Diverse combinatoric systems across the world’s languages
Cross-linguistic analysis of morphological and morphosyntactic systems (variation in the grammatical patterns of words)
Empirical and theoretical approaches to language change over time
The mechanisms of variation, particularly in relation to linguistic pre-history
Place-names as a window on linguistic history and pre-history

Current partners beyond York are:

Charles University, Prague
Czech Language Institute, Czech Academy of Sciences
Institute of Croatian Language and Linguistics
University of Eastern Finland
University of Iceland
University of Konstanz
University of Pennsylvania
University of Sheffield
University of Tartu
University of Zagreb
The Yiddish Book Centre (Amherst, MA)

The initial core research programme began in January 2014.

Major projects affiliated with the Centre

Feast & Famine: Confronting overabundance and defectivity in language

Co-I: Dunstan Brown

Sep 2020 - Aug 2024

Project page

How do people acquire and make sense of ‘messy’ linguistic data, when there are too many or too few forms available?

Our international team is examining this question from multiple angles using data from the languages of central and eastern Europe. Funded by the UK’s Arts and Humanities Research Council, this project will investigate two puzzling language phenomena, overabundance (too many forms) and defectivity (too few forms), as reflected in a variety of linguistic data, and how we describe them when writing reference works for public use.

Combining Gender and Classifiers in Natural Language

Co-I: Dunstan Brown

Apr 2013 - May 2016

Project page

An AHRC project on gender and classifiers, involving collaboration with the University of Surrey. Genders and classifiers are two different types of system which do a similar thing: they categorize nouns. It is therefore reasonable to assume that they would be mutually exclusive. If a language has a classifier system, we don't normally expect it to have a gender system, and similarly if it has a gender system, we don't normally expect it to have a classifier system. However, a few languages have both ('dual categorization'). This project investigates what happens when languages have such dual systems and compares them with languages which have only one such system, or none.

Endangered Complexity

Co-I: Dunstan Brown

Mar 2012 - Feb 2015

Project page

A joint AHRC/ESRC project on the Oto-Manguean languages of Mexico, involving collaboration with the University of Surrey. There are about 200 of these languages, and many of them are severely threatened or endangered. The Oto-Manguean languages have complex inflectional morphology (the system for encoding grammatical information on words). They combine suffixes, prefixes, complex tonal patterns and stem alternations into many different inflectional classes. Understanding how the Oto-Manguean languages work provides important evidence as to the possible limits of inflectional complexity.

From Competing Theories to Fieldwork: The Challenge of an Extreme Agreement System

Co-I: Dunstan Brown

Jan 2012 - Jun 2015

Project page

An AHRC-funded project on the Archi agreement system, involving collaboration with Essex, Harvard, and Surrey. The Nakh-Daghestanian language Archi provides a rich source of data on the interaction between morphology and syntax, particularly in relation to the role of both components in agreement. A wide variety of domains and constructions in Archi manifest agreement. This makes Archi a particularly valuable language for investigating the mechanisms of, and constraints on, this important part of the grammatical system.

LanGeLin

PI: Giuseppe Longobardi

RA: Guido Cordoni, Shin-Sook Kim, Dimitris Michelioudakis, Nina Radkevich

Consultant: Giuseppina Silvestri

Dec 2012 - Nov 2018

ERC Advanced Grant Project

Project page

LanGeLin (Language and Gene Lineages) is the acronym for the ERC-funded research project 'Meeting Darwin's last challenge: toward a global tree of human languages and genes', running from December 2012 to November 2018.

Matches and Mismatches in Nominal Morphology and Agreement: Learning from the Acquisition of Eegimaa

Investigators: Dunstan Brown, Serge Sagna, Marilyn Vihman, Virve-Anneli Vihman

Apr 2017 - Mar 2020

Funded by the Economic and Social Research Council

Project page

Theoretical accounts of the strategies used by children to learn the structures of words and grammatical features of languages differ considerably, but our knowledge of what is possible is limited by the existing focus on a relatively small number of languages associated with industrialised nations. Here, we investigate grammatical features and structures that may be expressed in a variety of different ways. Examples of grammatical features include number (eg the distinction between singular and plural) and gender (eg distinguishing masculine and feminine in languages like French), features expressed within the shape of the word and associated items. Grammatical structure may be manifested in agreement across the separate words of a noun phrase. This project investigates the acquisition of inflectional morphology, ie grammatical features and structures as reflected in the word forms and associated agreement, in Gújjolaay Eegimaa, a language of the Atlantic family of the Niger-Congo phylum spoken in southern Senegal. This language has a gender system of the type traditionally known as a noun class system. Noun class systems with complex gender agreement are characteristic of the Niger-Congo languages.

Morphological Complexity: Typology as a Tool for Delineating Cognitive Organization

Co-I: Dunstan Brown

Feb 2009 - Jan 2015

Project page

This ERC-funded project is a comprehensive typological investigation of morphological complexity and involves collaboration with colleagues at Surrey and Brighton. Work at York focuses on two research strands. The first strand, Discovering Complexity, with Roger Evans (Brighton), concentrates on the machine learning of inflectional classes, where we investigate how much can be learned without building language-specific knowledge into the system. The second strand, with colleagues at Surrey, uses the Network Morphology theoretical framework to investigate defaults and irregularity in morphological systems.

The Oxford Corpus of Old Japanese

Co-I: Peter Sells

2012-2015

British Academy Research Project

Link to the corpus

The Oxford Corpus of Old Japanese (OCOJ) project developed a comprehensive annotated digital corpus of all extant texts from the Old Japanese period, with an associated dictionary and translations. Old Japanese is the earliest attested stage of Japanese, dating from the Asuka and Nara periods of Japanese history (7th-8th centuries AD), the formative literate period of Japan. These texts are therefore of paramount importance for the study and understanding of the origins and development of civilization in Japan, including language, writing, literature, religion, history, and culture. The corpus is now maintained and hosted by NINJAL in Japan, and is designed to support research in any of these areas.