
sample.tfl

Incremental Samples from a Type Frequency List (zipfR)


Description

Compute incremental random samples from a type frequency list (an object of class tfl).

Usage

sample.tfl(obj, N, force.list=FALSE)

Arguments

obj

an object of class tfl, representing a type frequency list

N

a vector of non-negative integers in increasing order, the sample sizes for which incremental samples will be generated

force.list

if TRUE, the return value will always be a list of tfl objects, even if N is just a single integer

Details

The current implementation is reasonably efficient, but will be rather slow when applied to very large type frequency lists.

Value

If N is a single integer (and the force.list flag is not set), a tfl object representing a random sample of size N from the type frequency list obj.

If N is a vector of length greater than one, or if force.list=TRUE, a list of tfl objects representing incremental random samples of the specified sizes N. Incremental means that each sample is a superset of the preceding sample.

See Also

tfl for more information about type frequency lists

sample.spc is an analogous function for frequency spectra (objects of class spc)

Examples

## load the zipfR package and the Brown tfl
library(zipfR)
data(Brown.tfl)
summary(Brown.tfl)

## sample a tfl of 100k tokens
MiniBrown.tfl <- sample.tfl(Brown.tfl,1e+5)
summary(MiniBrown.tfl)

## if we repeat, we get a different sample
MiniBrown.tfl <- sample.tfl(Brown.tfl,1e+5)
summary(MiniBrown.tfl)
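
The examples above only draw a single sample. The following sketch illustrates the incremental case described under Value, where N is a vector and the result is a list of tfl objects. It uses only sample.tfl and the zipfR accessors N() and V(); the sample sizes and the seed are arbitrary choices for illustration, and the call to set.seed assumes that sample.tfl draws from R's standard random number generator.

```r
library(zipfR)
data(Brown.tfl)

## fix the random seed so the samples can be reproduced
## (assumption: sample.tfl uses R's standard random number generator)
set.seed(42)

## incremental samples of 10k, 50k and 100k tokens
sizes <- c(1e+4, 5e+4, 1e+5)
samples <- sample.tfl(Brown.tfl, sizes)

## one tfl object per requested sample size
length(samples)

## token counts (N) should match the requested sizes; type counts (V)
## cannot decrease, because each sample contains the preceding one
sapply(samples, N)
sapply(samples, V)

## force.list=TRUE returns a list even for a single sample size
single <- sample.tfl(Brown.tfl, 1e+4, force.list=TRUE)
length(single)
```

Because the samples are nested, the resulting list can be read as successive snapshots of one growing sample rather than as independent draws.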

zipfR

Statistical Models for Word Frequency Distributions

Version
0.6-70
License
GPL-3
Authors
Stefan Evert <stefan.evert@fau.de>, Marco Baroni <marco.baroni@unitn.it>
Initial release
2020-10-10
