The Standard Corpus

Gerald Ferguson

A 60-page reproduction of Gerald Ferguson's "The Standard Corpus of Present Day English Language Usage arranged by word length and alphabetized within word length." Gerald Ferguson's description of the book follows:

The idea for a dictionary arranged by word length grew out of a series of pages of single letters of the alphabet that I typed as graphics in 1968, and which represented efforts to extend ideas, then current, relating to modular composition, objectively determined forms and the material status of printed letters.

The composing of such a work by hand was much too impractical, and I therefore investigated computer sources and came upon the Brown University Million Word Corpus. This Corpus consists of 1,014,232 words, or tokens, obtained by taking five hundred 2,000-word samples distributed across the 15 subject categories listed below (the number to the right of each category indicates the number of 2,000-word samples in that area of written language):

Press: Reportage 44

Press: Editorial 27

Press: Reviews 17

Religion: 17

Skills and Hobbies: 36

Popular Lore: 48

Belles Lettres, Biography, etc.: 75

Miscellaneous: 30

Learned and Scientific Writing: 80

Fiction: General 29

Fiction: Mystery and Detective 24

Fiction: Science 6

Fiction: Adventure and Western 29

Fiction: Romance and Love Story 29

Humor: 9

After collation for coincidence of appearance, the material obtained was limited to lengths 1 through 20, and those few types with computational symbols were edited out, leaving just under 50,000 words. I typed the entire work on stencils and finally ran off a bound first edition of 300 copies in the summer of 1970, using as a title "The Standard Corpus of Present Day English Language Usage, arranged by word length and alphabetized within word length". "The Standard Corpus of Present Day English Language Usage" was Brown University's study title; I added "arranged by word length and alphabetized within word length". In the original edition, various human and machine-generated inaccuracies were welcomed, as were structural variations resulting from the self-imposed rule not to violate the right-hand margin of each page. The same structural principle has been retained here.

Once completed, the Corpus became a source for other works. Among those were "Equivalents" (1971, inside front cover) and the choral reading for 26 voices (1972), with the Corpus serving as a score. Included on the inside back cover of this second edition is a photograph of the reading performed at The Art Gallery of Ontario in November 1977.

-Gerald Ferguson, 1978
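
The arrangement Ferguson describes (collapse repeated tokens into unique word types, keep lengths 1 through 20, drop types containing symbols, then order by word length and alphabetically within each length) can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in 1970: the function name `arrange_by_length`, the lowercasing step, and the use of `str.isalpha` as a stand-in for "types with computational symbols" are all assumptions.

```python
def arrange_by_length(tokens, min_len=1, max_len=20):
    """Return unique word types ordered by length, then alphabetically.

    Assumptions (not from the source): tokens are lowercased before
    collation, and any type containing a non-letter character is
    treated as one of the "types with computational symbols".
    """
    # Collation: repeated tokens collapse into a single type.
    types = {token.lower() for token in tokens}

    # Keep purely alphabetic types within the length limits.
    kept = [t for t in types if t.isalpha() and min_len <= len(t) <= max_len]

    # Primary key: word length. Secondary key: alphabetical order.
    return sorted(kept, key=lambda word: (len(word), word))


if __name__ == "__main__":
    sample = "the idea for a dictionary arranged by word length".split()
    for word in arrange_by_length(sample):
        print(len(word), word)
```

Sorting on the pair `(len(word), word)` gives both orderings in one pass: all one-letter types come first, then two-letter types, and so on, each group in alphabetical order.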