What is a corpus?
A corpus is a large collection of texts stored electronically so that it can be searched by computer. A corpus usually represents a particular language variety or register. For example, it could be composed of millions of words from American newspapers, or it could be a collection of academic texts in biology.
Why is a corpus useful for language teachers?
We teachers often rely on intuition when it comes to language. When a student asks which word or structure is used most commonly, we default to our own experience, our own language variety, or our best guess. But this can be quite misleading. That's where a corpus comes in. Using technology to search large samples of text within a specific category allows us to check our intuitions against real usage. A student might ask which word most commonly follows the verb "get" in spoken English, whether the active or passive voice is more common in academic writing, or whether the words "big" and "large" are truly synonymous. With a corpus, you can give them an answer based on data.
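For readers who like to see the idea in miniature, here is a rough Python sketch of how the "which word follows get" question could be answered by counting. It is only an illustration: the sample sentences are invented for this example, the function name following_words is hypothetical, and a real corpus tool works with millions of words and far more careful text processing.

```python
import re
from collections import Counter

def following_words(text, target):
    """Count the words that appear immediately after `target` in `text`."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Pair each token with the one after it and keep pairs starting with `target`.
    return Counter(nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == target)

# Invented sample sentences standing in for a real corpus (an assumption).
sample = (
    "I need to get a ticket. Did you get the message? "
    "We should get a taxi. Let's get going. I get the idea."
)

counts = following_words(sample, "get")
for word, n in counts.most_common(3):
    print(word, n)
```

Note that this simple version ignores sentence boundaries and punctuation, so it would also count a word that begins the next sentence. With only five sentences the numbers mean nothing, but the same counting applied to a large, well-chosen corpus is what turns a guess into an answer based on data.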
Getting started
A good place to start working with corpora is the Corpus of Contemporary American English (COCA). It can be accessed here: http://corpus.byu.edu/coca/
Try it out and see what you find. Experiment by searching for different words and collocations that could help you prepare materials for your students. The short video below will help you get started.