Describing the archive: Identifying offensive language
Event details
Institute for the Public Understanding of the Past Lecture
This event will take place in person in The Treehouse at the Humanities Research Centre, Covid-19 conditions permitting. It will also be live-streamed via Zoom. Joining information for Zoom will be circulated shortly before the event.
This talk will present the results of a three-month research project that aimed to develop methods for automatically identifying bias and offensive language in legacy archive descriptions. Anecdotal evidence suggests that offensive language and bias exist in catalogue descriptions. In light of conversations taking place within the heritage sector on appropriate language in catalogue descriptions, the project has built a proof-of-concept methodology from work conducted on descriptions from the Brotherton Special Collections in Leeds. Using corpus analysis of sets of legacy descriptions, we have developed computational methods that will aid the discovery of bias and offensive language for revision purposes.
Along with presenting the methods developed by the project, we will outline broader discussions on descriptions that have informed our work. We will also present findings from a data collection exercise conducted during the project that sought to gather experiences with legacy descriptions from archive professionals and researchers. We will outline how the methods developed in Leeds could have wider application in revising legacy descriptions from the National Archives and other major collections.
__________________
This talk is part of the spring seminar programme of the Institute for the Public Understanding of the Past. All are welcome. Talks will last 45-50 minutes, followed by a Q&A with the speakers.
About the speakers
Dr. Vic Clarke is a Lecturer in Modern British History at the University of York, working on 19th-century political movements. She gained her PhD from the University of Leeds in 2020, exploring the role of the Northern Star newspaper in community-building and communication in the Chartist movement, including a corpus linguistics analysis of the ‘Readers and Correspondents’ column. She was previously a postdoctoral fellow at the Leeds Arts and Humanities Research Institute (LAHRI).
Dr. Kevin Matthew Jones is a Research Fellow at the National Archives, developing methods to digitally represent nationwide archive statistics c. 2007-2020. His research interests are in the history of medical data, specialising in mental health and the use of psychiatric classification in public health reports and medical records. A monograph based upon Kevin’s PhD research, Counting and Classification: Psychiatric Nosology and Diagnosis 1845 - 1960, will be published with Palgrave Macmillan in January 2023.
During their LAHRI fellowships, Kevin and Vic worked on a National Archives-funded ‘TestBed’ project on finding problematic and offensive language in legacy descriptions in the Brotherton Special Collections archives. It is this work they will be presenting to IPUP.