Loading and Saving Vocabulary Growth Curves (zipfR)
read.vgc loads vocabulary growth data from a .vgc file.
write.vgc saves vocabulary growth data in a .vgc file.
read.vgc(file)
write.vgc(vgc, file)
file
character string specifying the pathname of a disk file. Files with extension .gz, .bz2 or .xz are decompressed or compressed automatically.
vgc
a vocabulary growth curve, i.e. an object of class vgc
A TAB-delimited text file with column headers but no row names (suitable for reading with read.delim). The file must contain at least the following two columns:
N
increasing integer vector of sample sizes N
V
corresponding observed vocabulary sizes V(N) or expected vocabulary sizes E[V(N)]
Optionally, columns V1, ..., V9 can be added to specify the number of hapaxes (V_1(N)), dis legomena (V_2(N)), and further spectrum elements up to V_9(N). It is not necessary to include all 9 columns, but for any V_m(N) in the data set, all "lower" spectrum elements V_{m'}(N) (for m' < m) must also be present. For example, it is valid to have columns V1 V2 V3, but not V1 V3 V5 or V2 V3 V4.
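The rule for spectrum columns can be sketched as a small check in base R. The helper valid_spectrum_columns is hypothetical and not part of zipfR; it only illustrates the requirement that the Vm columns present must be exactly V1, ..., Vk for some k.

```r
# Hypothetical helper (NOT part of zipfR): returns TRUE if the spectrum
# columns found in `cols` are exactly V1, ..., Vk for some k from 0 to 9.
valid_spectrum_columns <- function(cols) {
  # pick out V1 ... V9 (this pattern does not match N, V, VV or VV1 ... VV9)
  m <- as.integer(sub("^V", "", grep("^V[1-9]$", cols, value = TRUE)))
  # the frequency classes present must be 1, 2, ..., length(m) with no gaps
  setequal(m, seq_len(length(m)))
}

valid_spectrum_columns(c("N", "V", "V1", "V2", "V3"))  # TRUE
valid_spectrum_columns(c("N", "V", "V1", "V3", "V5"))  # FALSE: V2 and V4 missing
valid_spectrum_columns(c("N", "V", "V2", "V3", "V4"))  # FALSE: V1 missing
```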
Variances for expected vocabulary sizes and spectrum elements can be given in further columns VV (for Var[V(N)]) and VV1, ..., VV9 (for Var[V_m(N)]). VV is mandatory in this case, and columns VVm must be specified for exactly the same frequency classes m as the Vm above.
These columns may appear in any order in the text file. All other columns will be silently ignored.
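To make the layout concrete, here is a minimal sketch that writes and re-reads a file in this format using base R only. The numbers are invented for illustration, and the column note is a made-up extra column of the kind described above as ignored; the sketch does not call read.vgc or write.vgc.

```r
# Toy values, invented purely to illustrate the file layout
toy <- data.frame(
  N    = c(100L, 200L, 300L),   # increasing sample sizes
  V    = c(60L, 105L, 140L),    # vocabulary sizes V(N)
  V1   = c(45L, 75L, 98L),      # hapax counts V_1(N)
  note = c("a", "b", "c")       # extra column, not part of the format
)

path <- tempfile(fileext = ".vgc")

# TAB-delimited, column headers, no row names
write.table(toy, file = path, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim parses exactly this kind of file
back <- read.delim(path)
names(back)
```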
If the filename file ends in the extension .gz, .bz2 or .xz, the disk file will automatically be decompressed (read.vgc) or compressed (write.vgc).
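For comparison, the same kind of transparent compression can be sketched with base R connections. This is only an illustration of a compressed TAB-delimited file, with invented values; it makes no claim about how zipfR handles compression internally.

```r
# Toy values, invented for illustration
toy <- data.frame(N = c(100L, 200L), V = c(60L, 105L))

gz_path <- tempfile(fileext = ".vgc.gz")

# write through a gzip connection so the file on disk is compressed
con <- gzfile(gz_path, open = "wt")
write.table(toy, file = con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

# base R detects the gzip compression when the file is opened for reading
back <- read.delim(gz_path)
```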
read.vgc returns an object of class vgc (see the vgc manpage for details).
library(zipfR)  # provides read.vgc, write.vgc and the ItaUltra.emp.vgc data set

## save Italian ultra- prefix VGC to external text file
fname <- tempfile(fileext=".vgc")
write.vgc(ItaUltra.emp.vgc, fname)
## now <fname> is a TAB-delimited text file with columns N, V and V1

## we read it back in
New.vgc <- read.vgc(fname)

## same vgc as ItaUltra.emp.vgc, compare:
summary(New.vgc)
summary(ItaUltra.emp.vgc)
head(New.vgc)
head(ItaUltra.emp.vgc)
stopifnot(isTRUE(all.equal(New.vgc, ItaUltra.emp.vgc))) # should be identical