Data of interest to academic research

Although the lexicon of a language contains a relatively large number of words, most spoken and written discourse is made up of a relatively small set of words and their repetitions. Extensive corpus analyses indicate that, in English, the 2,000 most frequent words account for 80-85% of the running words in non-specialized written texts and about 90-95% of the running words in colloquial speech (informal spoken language).
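As a rough illustration of what "coverage of running words" means, the sketch below counts the tokens in a text and reports the share accounted for by its n most frequent word forms. The function name `top_n_coverage`, the simple regular-expression tokenizer, and the sample sentence are assumptions made for this example; corpus studies typically count word families or lemmas rather than raw word forms.

```python
import re
from collections import Counter


def top_n_coverage(text: str, n: int) -> float:
    """Return the share of running words covered by the n most frequent word forms."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(tokens)


# Invented sample text, used only to exercise the function.
sample = "the cat sat on the mat and the dog sat on the rug"
print(f"Top 3 word forms cover {top_n_coverage(sample, 3):.1%} of running words")
```

Applied to a large corpus with n = 2000, a calculation of this kind is what lies behind figures such as those quoted above, although the exact values depend on the tokenization and on the unit of counting.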

The identification of these subsets of the lexicon is relevant to various research endeavors as well as to central aspects of curriculum and course design. Of special interest are the General Service List (GSL) and the Academic Word List (AWL), which together provide approximately 90% coverage of most written texts.
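The combined coverage of two lists can be estimated by mapping every family member to its headword and counting how many tokens of a text fall under either list. The following is a minimal sketch under stated assumptions: the dictionaries `general` and `academic` are tiny invented stand-ins, not the contents or the file format of the GSL and AWL sets, and `build_lookup` and `list_coverage` are hypothetical helper names.

```python
import re


def build_lookup(families: dict[str, list[str]]) -> dict[str, str]:
    """Map each headword and each of its family members to the headword."""
    lookup = {}
    for headword, members in families.items():
        lookup[headword] = headword
        for member in members:
            lookup[member] = headword
    return lookup


def list_coverage(text: str, lookup: dict[str, str]) -> float:
    """Return the share of running words that belong to any family in lookup."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in lookup) / len(tokens)


# Toy word families, invented for illustration only.
general = {"the": [], "we": [], "a": [], "to": [], "use": ["uses", "used", "using"]}
academic = {"analyse": ["analysed", "analysis"], "method": ["methods"]}

combined = {**build_lookup(general), **build_lookup(academic)}
text = "We used a method to analyse the corpus"
print(f"General list alone: {list_coverage(text, build_lookup(general)):.1%}")
print(f"General + academic: {list_coverage(text, combined):.1%}")
```

Working with headword lookups rather than flat word lists keeps the family structure available, so the same mapping can also be used to count how many distinct families a text draws on.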

General Service List

The General Service List (GSL) by West (A General Service List of English Words, 1953, London: Longmans, Green & Co.) is a well-known list that has withstood the test of time. It comprises around 2,000 word families drawn from among the most frequent in the English language.

Download the GSL set

Size 82.1 KB
Files Included: Headwords, Family words, Word families by headword, Minimal pairs, Headwords by part of speech

Academic Word List

The Academic Word List (AWL) by Coxhead (A New Academic Word List. TESOL Quarterly, 34(2), 2000: 213-238) contains some 570 word families specific to academic texts, where they account for about 9-10% of running words.

Download the AWL set

Size 30.7 KB
Files Included: Headwords, Family words, Word families by headword, Headwords by part of speech

ICE-CORE word lists

The ICE-CORE lists by Gilner and Morales (The ICE-CORE word list: The lexical foundation of 7 varieties of English. Asian Englishes, 14(1), 2011: 4-21) are compilations of the most frequent lemmas (inflections) and headwords (word families) shared across 7 varieties of English, namely those of Canada, East Africa, Hong Kong, India, Jamaica, the Philippines, and Singapore, as represented by corpora in the International Corpus of English.
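The idea of a vocabulary "shared across varieties" can be pictured as a set intersection over per-variety frequency lists. The sketch below is only an illustration of that idea and is not the procedure used by the authors of the ICE-CORE lists; the function name `shared_core`, the variety labels, and the frequency counts are all invented.

```python
from collections import Counter


def shared_core(freqs_by_variety: dict[str, Counter], top_n: int) -> set[str]:
    """Return the items that rank among the top_n most frequent in every variety."""
    top_sets = [
        {item for item, _ in counts.most_common(top_n)}
        for counts in freqs_by_variety.values()
    ]
    return set.intersection(*top_sets) if top_sets else set()


# Invented frequency counts for two hypothetical varieties.
frequencies = {
    "variety_a": Counter({"the": 50, "of": 30, "lah": 10, "go": 5}),
    "variety_b": Counter({"the": 45, "of": 25, "go": 12, "eh": 3}),
}
print(sorted(shared_core(frequencies, top_n=3)))
```

Items that are frequent in only one variety drop out of the intersection, which is why a shared core is smaller than any single variety's frequency list of the same length.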

Download the ICE-CORE v2.1 list

Size 9.8 KB
Files Included: Lemmas, Headwords

Function word lists

Function words are characterized by their ambiguous lexical meaning and by their capacity to organize grammatical relationships between words within a sentence. There is a relatively small and fixed number of function words (as opposed to verbs, nouns, adjectives, and adverbs, which are limited but expandable sets). Prepositions, conjunctions, determiners, pronouns, and auxiliary verbs are all considered function words. Most of these words are uninflected, although a few are inflected and may take affixes.
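Because function words form a small set, a text can be split into function and content words by simple lookup. The sketch below assumes a deliberately tiny, incomplete sample of each class; the dictionary `FUNCTION_WORDS` and the function `split_function_content` are hypothetical names for this example and do not reflect the contents of the downloadable set.

```python
import re

# A few sample members per class, chosen for illustration; far from exhaustive.
FUNCTION_WORDS = {
    "prepositions": {"in", "on", "of", "to"},
    "conjunctions": {"and", "but", "or"},
    "determiners": {"the", "a", "an"},
    "pronouns": {"she", "it", "they"},
    "auxiliary verbs": {"is", "has", "will"},
}


def split_function_content(text: str) -> tuple[list[str], list[str]]:
    """Split the tokens of text into (function words, content words)."""
    all_function = set().union(*FUNCTION_WORDS.values())
    tokens = re.findall(r"[a-z']+", text.lower())
    function = [token for token in tokens if token in all_function]
    content = [token for token in tokens if token not in all_function]
    return function, content


function, content = split_function_content("She has put the book on the table")
print("Function words:", function)
print("Content words:", content)
```

A lookup of this kind ignores context, so a form that belongs to more than one class (for example "to" as a preposition and as an infinitive marker) is simply counted as a function word without being disambiguated.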

In principle, it should be possible to list all function words, since they form a closed class, but in practice this is surprisingly difficult. Nonetheless, our objective is to provide exhaustive lists of the function words in the English language. Contributions are welcome.

Download the English function words set

Size 4.08 KB
Files Included: Auxiliary Verbs, Conjunctions, Determiners, Prepositions, Pronouns, Quantifiers

For reference purposes, a succinct description of each class of function words follows.