The corpus and Oxford Dictionaries
Language research based on real evidence
Oxford Dictionaries are continually monitoring and researching how language is evolving. Corpus analysis is central to this process, and provides real evidence on which to base our language research.
What is a corpus?
A corpus is a collection of texts of written (or spoken) language presented in electronic form. It provides evidence of how language is used in real situations, which allows our editors to write accurate and meaningful entries.
Read more about what makes a corpus.
The Oxford English Corpus and Oxford New Words Corpus ensure that we can track and record the very latest developments in language today. By analysing these corpora and using special software, we can see words in context and find out how new words and senses are emerging, as well as spotting other trends in usage, spelling, world English, and so on.
Big box in the Corpus
'Big' and 'box' are both everyday words – but only in the recent past has 'big box' come to greater prominence. Find out how our corpus helped us discover this.
Cause in the Corpus
How has ‘cause’ changed over time? The corpus helped us investigate how the use of ‘cause’ has become more negative – and influenced how we revised our definition.
Edgy in the Corpus
How did the Oxford English Corpus help us add a new sense to our entry for 'edgy'?
Patterns of word formation
What patterns can be spotted for the new words being added to English? We take a look at four very prolific suffixes and the words they’ve given rise to.
Some day or someday?
Someday or some day? A number of common English words started life as two-word phrases – but has ‘some day’ made the jump to ‘someday’ yet?
Technical information about the corpus
How does the Oxford English Corpus work? Here is the technical information which explains how we go about gathering and using the data in the corpus.
The OEC: Composition and structure
The Oxford English Corpus is based mainly on material collected from pages on the World Wide Web: this page explains how it is composed and structured.
The Oxford English Corpus
What is the Oxford English Corpus? We give an introduction to how the corpus is built, and the countries from which it gathers varieties of English.
The Oxford New Words Corpus (New Monitor Corpus)
What is the Oxford New Words Corpus? Also known as the New Monitor Corpus, this corpus has gathered words since 2012 and stands at over 7 billion words.