Corpus Linguistics and the Cline of Close and Distant Reading of Literary Texts
Michaela Mahlberg, University of Birmingham, UK
Since the early days of corpus linguistics, developments in computing power and the increasing availability of data have pushed the boundaries of the field. Corpus linguistic methods are now increasingly used across a range of disciplines and are often subsumed under the heading of the Digital Humanities. In my talk, I will use the study of literary texts (especially 19th-century novels) as an example to show how corpus linguistics can create links across disciplines. I will also argue that such links should encourage us to (re-)consider the contribution that corpus linguistics can make to the study of language. In particular, the increasing popularity of the term ‘distant reading’ is a good starting point for such a critical discussion. While there are similarities between corpus linguistics and the kind of distant reading advocated by literary scholars, corpus linguistic methods also make it possible to approach the analysis of texts from a more qualitative point of view, enabling links to methods of close reading. I will argue that crucial factors on the cline of close and distant reading include innovative approaches to the exploitation of concordance data (as illustrated by the web app CLiC), the contextualisation of literary texts in their socio-historical settings (enabled through the study of appropriate reference corpora) and the testing of corpus linguistic claims with complementary methods (such as eye-tracking reading studies).
Corpus insights into community and identity in academic writing
Ken Hyland, The University of Hong Kong
To many outsiders, corpus linguistics is seen as a dreary quantitative method for sad IT geeks, but I want to argue here that it can contribute to our understanding of two of the most controversial concepts in the social sciences: community and identity. With the emergence of community-oriented views of literacy in recent years, greater attention has been given to the specific contexts of language use, and we have learnt that texts are successful only when they employ conventions that other members of the community find familiar and convincing. Because of this, corpus studies have become invaluable in revealing how language choices help construct both arguments and disciplines. Moreover, because writers negotiate representations of themselves through the discourses of their communities, corpus studies also contribute to a new way of conceptualising identity. Essentially, the study of academic discourse shows how we choose our words to connect with others and present ideas in ways that make most sense to them. By privileging certain ways of making meanings, repeated uses of language help to perpetuate the norms and thinking of disciplinary communities and so encourage the performance of certain kinds of professional identities. Communities thus constrain identity choices, but they also indicate the ways we relate independent beliefs to shared experience. In this way, the production of texts is always the production of community and of self.
Using Big Data to Map Language Structure and Use
Jack Grieve, Aston University
In this paper, I present an analysis of regional lexical variation in an 8.9-billion-word corpus of geocoded American tweets collected between 2013 and 2014. I first map the relative frequencies of the 10,000 most frequent words in the corpus across 3,000+ American counties. This analysis reveals that the frequencies of most words exhibit regional patterns in this variety of language. I then identify common patterns of regional variation in both function words and content words through a multivariate spatial analysis. In addition to identifying the main patterns of regional lexical variation on Twitter, this analysis shows that grammatical variation and topical variation are closely aligned, a result that challenges standard assumptions about the causes of dialect variation.
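As a rough illustration of the first step described in this abstract, the sketch below computes per-county relative word frequencies from geocoded text. It is not the author's pipeline: the function name `relative_frequencies`, the whitespace tokenisation, and the toy tweets and county labels are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def relative_frequencies(tweets, vocabulary):
    """Return {county: {word: occurrences / total tokens in that county}}.

    `tweets` is an iterable of (county, text) pairs; `vocabulary` is the set
    of words to track (in the abstract, the 10,000 most frequent words).
    """
    counts = defaultdict(Counter)  # per-county counts of tracked words
    totals = Counter()             # per-county count of all tokens
    for county, text in tweets:
        tokens = text.lower().split()  # naive tokenisation (assumption)
        totals[county] += len(tokens)
        counts[county].update(t for t in tokens if t in vocabulary)
    return {
        county: {w: counts[county][w] / totals[county] for w in vocabulary}
        for county in totals
        if totals[county] > 0  # skip counties with no tokens at all
    }


# Hypothetical toy data standing in for the geocoded corpus.
tweets = [
    ("Story County, IA", "y'all want pop or soda"),
    ("Story County, IA", "pop is fine"),
    ("Harris County, TX", "y'all coming or not"),
]
freqs = relative_frequencies(tweets, {"pop", "y'all"})
print(freqs["Story County, IA"]["pop"])  # share of tokens that are "pop"
```

Dividing by each county's total token count, not by its tweet count, keeps counties with very different volumes of data comparable. The resulting values are what would then be mapped and passed to any later spatial analysis.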
Follow us on Twitter: @apling_iastate
Tweet about the conference and connect with other attendees by using the hashtag: #AACL2016