The corpus and Oxford Dictionaries

Language research based on real evidence

Oxford Dictionaries are continually monitoring and researching how language is evolving. Corpus analysis is central to this process, and provides real evidence on which to base our language research.

What is a corpus?

A corpus is a collection of texts of written (or spoken) language presented in electronic form. It provides evidence of how language is used in real situations, which allows our editors to write accurate and meaningful entries.

Read more about what makes a corpus.

The Oxford English Corpus and Oxford New Words Corpus allow us to track and record the very latest developments in the language. By analysing these corpora with specialist software, we can see words in context, find out how new words and senses are emerging, and spot other trends in usage, spelling, and world English.
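To give a rough idea of what seeing words in context involves, here is a minimal keyword-in-context (concordance) sketch in Python. It is an illustration only and not the software used for the Oxford corpora: the function name `kwic`, the `width` parameter, and the sample sentences are all invented for this example.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, match, right) tuples for each occurrence of keyword.

    Hypothetical helper for illustration; real corpus tools work on
    tokenised, annotated text rather than a raw string.
    """
    # Very simple tokeniser: runs of letters and apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            # Context is taken by token position, so it may cross
            # sentence boundaries.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Invented sample text, not taken from any Oxford corpus.
sample = (
    "The heavy rain may cause flooding. "
    "Smoking can cause serious illness. "
    "The cause of the delay is unknown."
)

for left, word, right in kwic(sample, "cause"):
    print(f"{left:>25} | {word} | {right}")
```

Lining up each occurrence with its neighbouring words in this way makes it easier to notice which words tend to appear alongside the keyword.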

All in a word: eccentric or quirky?

Eccentric and quirky are similar, but our corpus shows that they are used very differently.

Big box in the Corpus

'Big' and 'box' are both everyday words – but only in the recent past has 'big box' come to greater prominence. Find out how our corpus helped us discover this.

Cause in the Corpus

How has ‘cause’ changed over time? The corpus helped us investigate how use of ‘cause’ has become more negative, and prompted us to revise our definition.

Edgy in the Corpus

How did the Oxford English Corpus help us add a new sense to our entry for 'edgy'?

Patterns of word formation

What patterns can be spotted for the new words being added to English? We take a look at four very prolific suffixes and the words they’ve given rise to.

Some day or someday?

Someday or some day? A number of common English words started life as two-word phrases – but has ‘some day’ made the jump to ‘someday’ yet?

Technical information about the corpus

How does the Oxford English Corpus work? Here is the technical information which explains how we go about gathering and using the data in the corpus.

The OEC: Composition and structure

The Oxford English Corpus is based mainly on material collected from pages on the World Wide Web: this page explains how it is composed and structured.

The Oxford English Corpus

What is the Oxford English Corpus? We give an introduction to how the corpus is built, and the countries from which it gathers varieties of English.

The Oxford New Words Corpus (New Monitor Corpus)

What is the Oxford New Words Corpus? Also known as the New Monitor Corpus, this corpus has gathered words since 2012 and stands at over 7 billion words.