http://languagelog.ldc.upenn.edu/nll/?p=2
In "A quantitative history of which-hunting", I reproduced a plot due to (an anonymous colleague of) Jonathan Owen, showing that texts from the last half of the 20th century saw a decrease in the relative frequency of NOUN which VERB, and an increase in the relative frequency of NOUN that VERB. Jonathan took this to indicate the success of (usage guides like) Strunk & White's The Elements of Style in persuading writers and copy-editors to avoid which in "restrictive" (AKA "defining" or "integrated") relative clauses.
Here are some plots showing the effect, for data (without smoothing) from the Google Books ngram corpus. The "British English" dataset shows about the same increase in NOUN that as the "American English" collection does, but somewhat less decrease in NOUN which:
[Figure: plots for American English and British English]
Note that I have NOT simply plotted the frequency of the forms, but rather the proportional change over the course of the century, relative to the mean during that period. Thus if WHICH is the vector of frequency values for the pattern NOUN which from 1900 to 2000, the red curves represent \(WHICH/mean(WHICH)\).
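As a minimal sketch of that normalization (not the code used to make the plots), the function below divides a yearly frequency series by its own mean over the period. The function name and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def proportional_change(freqs):
    """Scale a yearly frequency series by its own mean over the period."""
    m = mean(freqs)
    if m == 0:
        raise ValueError("mean frequency is zero; the ratio is undefined")
    return [f / m for f in freqs]


# Hypothetical yearly relative frequencies (illustrative values, not real data).
which = [4.0, 3.0, 2.0, 1.0, 0.0]
print(proportional_change(which))  # [2.0, 1.5, 1.0, 0.5, 0.0]
```

A value of 1.0 marks a year in which the pattern occurred at its average rate for the period, so curves for patterns with very different raw frequencies can share one axis.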
I should also note that both patterns will cover some things that are not relative clauses at all — complement-clauses with that (e.g. "the way that", "the idea that", etc.), and question-word uses of which (e.g. "ask John which he prefers", "asked in a quiet voice which road to take"). But it's striking that Strunk & White's publication date of 1959 corresponds so neatly to an apparent inflection point in the plots.
However, if we look at the trends for more specific patterns, the simple which-hunting story is not so clear. At least, it interacts with other trends that may obscure or overwhelm it in particular cases. For example, we find some evidence for an overall decline in the frequency of (some types of) relative clauses, at least from 1900 to 1970 or 1980.
The plots below show the proportional changes in four sets (also merging upper and lower case):
| Pattern set | Line style |
|---|---|
| the thing that, the things that | blue, solid line |
| the thing which, the things which | blue, dashed line |
| the man that, the men that, the woman that, the women that | red, solid line |
| the man who, the men who, the woman who, the women who | red, dashed line |
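The merging of case variants and the pooling of each set's patterns might be sketched as follows. This is only an illustration under assumptions: the in-memory layout (a dict mapping each n-gram to a list of yearly counts, all covering the same years) and the function names `merge_case` and `pattern_set_total` are hypothetical, and the counts are made up.

```python
def merge_case(series_by_ngram):
    """Sum yearly series for n-grams that differ only in letter case."""
    merged = {}
    for ngram, series in series_by_ngram.items():
        key = ngram.lower()
        if key in merged:
            # Assumes both series cover the same years in the same order.
            merged[key] = [a + b for a, b in zip(merged[key], series)]
        else:
            merged[key] = list(series)
    return merged


def pattern_set_total(merged, patterns):
    """Add up the merged yearly series for every pattern in one set."""
    series = [merged[p] for p in patterns]
    return [sum(year_values) for year_values in zip(*series)]


# Made-up yearly counts for three years (illustrative values, not real data).
counts = {
    "the thing that": [1, 2, 3],
    "The thing that": [1, 0, 1],
    "the things that": [2, 2, 2],
    "The things that": [0, 1, 0],
}
merged = merge_case(counts)
print(pattern_set_total(merged, ["the thing that", "the things that"]))  # [4, 5, 6]
```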
[Figure: plots for Google Ngrams "English" and COHA]
And if we look across a range of subject pronouns in a similar set of relative clause structures, we see some other effects as well. For some of the pronouns (I, you, we, she), the decline in relative-clause frequencies sharply reverses about 1965, while others (they, he) level off similarly to the overall pattern shown above. At least, that's what happens for relative clauses with that and those that start with the bare pronoun:
[Figure: plots for "the things PRO" and "the things that PRO"]
The same structures with which just keep on declining:
This all seems to mean that at least the following things are going on:
- An overall 20th-century decline in relative-clause frequency, probably correlated with declining sentence length and complexity (see e.g. "Real trends in word and sentence length", 10/31/2011; "Inaugural embedding", 9/9/2005; "The evolution of disornamentation", 2/21/2005).
- A change since the 1960s (in overall writing style, or in the Google Books ngram corpus sample, or both) towards increased use of first- and second-person pronouns.
- A change since the 1960s (in overall writing style, or in the Google Books ngram corpus sample, or both) towards increased discussion of women.
- Which-hunting — which may have started before 1959, due to the influence of the Fowlers, but seems to have been strongly boosted by The Elements of Style.
I'm sure that further investigation would uncover additional complexities.
Note: If you still think that E.B. White's conversion to the which-hunting faith was a recognition of the Truth, see Geoff Pullum's essay "A Rule Which Will Live in Infamy", Lingua Franca 12/7/2012.






