zipfR: Brown – R documentation


Brown

Brown Corpus Frequency Data (zipfR)

Brown.tfl, Brown.spc and Brown.emp.vgc are zipfR objects of classes tfl, spc and vgc, respectively.
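As a rough illustration of how the first two classes relate, the sketch below derives a frequency spectrum from the type frequency list and prints the basic sizes side by side. It assumes the zipfR package is installed; spc.from.tfl is a name chosen here for illustration only.

```r
library(zipfR)

data(Brown.tfl)
data(Brown.spc)

# Sample size N (tokens) and vocabulary size V (types) of the type frequency list
N(Brown.tfl)
V(Brown.tfl)

# Derive a frequency spectrum from the type frequency list
spc.from.tfl <- tfl2spc(Brown.tfl)

# Print the sizes of the derived and the packaged spectrum for comparison
N(spc.from.tfl); N(Brown.spc)
V(spc.from.tfl); V(Brown.spc)
```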

These data were extracted from the Brown corpus (see Kucera and Francis 1967).

Brown.emp.vgc is the empirical vocabulary growth curve. It records how the vocabulary size V and the number of hapax legomena V(1) develop as the corpus is read in its original, non-randomized order.
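One way to see what "non-randomized" means in practice is to compare the empirical curve with the growth expected under random sampling. The following is a minimal sketch, assuming the zipfR package is installed and that the sample sizes of Brown.emp.vgc do not exceed the sample size of Brown.spc; n.points and Brown.bin.vgc are names chosen here for illustration.

```r
library(zipfR)

data(Brown.spc)
data(Brown.emp.vgc)

# Sample sizes at which the empirical curve was recorded
n.points <- N(Brown.emp.vgc)

# Expected V and V(1) at the same sample sizes under random sampling
# (binomial interpolation from the frequency spectrum)
Brown.bin.vgc <- vgc.interp(Brown.spc, n.points, m.max = 1)

# Plot both curves; add.m = 1 also draws the V(1) development
plot(Brown.emp.vgc, Brown.bin.vgc, add.m = 1,
     legend = c("empirical", "interpolated"))
```

Any gap between the two curves reflects the fact that words are not randomly distributed across the texts of a corpus.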

Numbers and other non-linguistic material were removed before word counts were collected from the Brown corpus.

Kucera, H. and Francis, W.N. (1967). Computational analysis of present-day American English. Brown University Press, Providence.

library(zipfR)

# Type frequency list
data(Brown.tfl)
summary(Brown.tfl)

# Frequency spectrum
data(Brown.spc)
summary(Brown.spc)

# Empirical vocabulary growth curve
data(Brown.emp.vgc)
summary(Brown.emp.vgc)

Package: zipfR (Statistical Models for Word Frequency Distributions)

Version: 0.6-70

License: GPL-3

Authors: Stefan Evert <stefan.evert@fau.de>, Marco Baroni <marco.baroni@unitn.it>

Initial release: 2020-10-10