
Brown

Brown Corpus Frequency Data (zipfR)


Description

Brown.tfl, Brown.spc and Brown.emp.vgc are zipfR objects of classes tfl (type frequency list), spc (frequency spectrum) and vgc (vocabulary growth curve), respectively.

These data were extracted from the Brown corpus (see Kucera and Francis 1967).
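
As a rough illustration of how the type frequency list and the frequency spectrum relate, the sketch below derives a spectrum from Brown.tfl with tfl2spc() and prints its sample size N and vocabulary size V next to those of the shipped Brown.spc. The names derived.spc and comparison are made up for this example, and the sketch assumes the zipfR package is installed; it does not claim that the values must agree exactly.

```r
library(zipfR)

data(Brown.tfl)
data(Brown.spc)

# Derive a frequency spectrum from the type frequency list
# (derived.spc is a name chosen for this illustration)
derived.spc <- tfl2spc(Brown.tfl)

# Put sample size N and vocabulary size V of the three objects side by side
comparison <- data.frame(
  object = c("Brown.tfl", "tfl2spc(Brown.tfl)", "Brown.spc"),
  N = c(N(Brown.tfl), N(derived.spc), N(Brown.spc)),
  V = c(V(Brown.tfl), V(derived.spc), V(Brown.spc))
)
print(comparison)

# Number of hapax legomena (types occurring exactly once) in the spectrum
Vm(Brown.spc, 1)
```

The accessors N(), V() and Vm() are used instead of reading attributes or columns directly, so the same calls work on all three object classes.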

Details

Brown.emp.vgc is the empirical vocabulary growth curve: it records how the vocabulary size V and the number of hapax legomena V(1) develop as the corpus is read in its original, non-randomized order.
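
A minimal sketch of how the growth curve might be inspected, assuming zipfR is installed: it extracts the sample sizes, V and V(1) with the package accessors, looks at the share of hapax legomena at the final sample size, and plots both curves. The variable names n, v and v1 are chosen here for illustration only.

```r
library(zipfR)

data(Brown.emp.vgc)

# A vgc object is a data frame with one row per sample size
head(Brown.emp.vgc)

# Extract sample sizes, vocabulary sizes and hapax counts
n  <- N(Brown.emp.vgc)
v  <- V(Brown.emp.vgc)
v1 <- Vm(Brown.emp.vgc, 1)

# Share of types that are hapax legomena at the largest sample size
tail(v1 / v, 1)

# Plot V against N and add the V(1) curve to the same plot
plot(Brown.emp.vgc, add.m = 1)
```

Because the curve follows the original text order rather than a random permutation of the tokens, local bumps in the plot can reflect changes of topic or genre within the corpus.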

We removed numbers and other forms of non-linguistic material before collecting word counts from the Brown corpus.

References

Kucera, H. and Francis, W.N. (1967). Computational analysis of present-day American English. Brown University Press, Providence.

See Also

The datasets documented in BrownSubsets cover various subsets of the Brown corpus (e.g., informative prose or adjectives only).

Examples

library(zipfR)

# type frequency list
data(Brown.tfl)
summary(Brown.tfl)

# frequency spectrum
data(Brown.spc)
summary(Brown.spc)

# empirical vocabulary growth curve
data(Brown.emp.vgc)
summary(Brown.emp.vgc)

zipfR

Statistical Models for Word Frequency Distributions

Version: 0.6-70
License: GPL-3
Authors: Stefan Evert <stefan.evert@fau.de>, Marco Baroni <marco.baroni@unitn.it>
Initial release: 2020-10-10
