Loading and Saving Type Frequency Lists (zipfR)
read.tfl loads a type frequency list from a .tfl file.
write.tfl saves a type frequency list object in a .tfl file.
read.tfl(file, encoding=getOption("encoding"))
write.tfl(tfl, file, encoding=getOption("encoding"))
file |
character string specifying the pathname of a disk file.
Files with extension |
tfl |
a type frequency list, i.e. an object of class |
encoding |
specifies the character encoding of the disk file to be read or written to. See |
A TAB-delimited text file with column headers but no row names (suitable for reading with read.delim), containing the following columns:
f
type frequencies f_k
k
optional: the corresponding type IDs k. If missing, increasing non-negative integers are automatically assigned as IDs.
type
optional: type representations (such as word forms or lemmas)
These columns may appear in any order in the text file. Only the f column is mandatory, and all unrecognized columns will be silently ignored.
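As a rough illustration of this layout, the following base R sketch writes a small table in the described format and reads it back with read.delim. The data values, the file name and the extra column note are made up for the example. The sketch does not call read.tfl or write.tfl, and the final column selection only imitates the "ignore unrecognized columns" behaviour described above.

```r
## hypothetical toy data: three types with IDs, frequencies and word forms
toy <- data.frame(
  type = c("the", "of", "and"),
  f    = c(14L, 12L, 31L),
  k    = 1:3,
  note = c("a", "b", "c"),   # extra column, not part of the format
  stringsAsFactors = FALSE
)

## TAB-delimited, with column headers but no row names
toy_file <- tempfile(fileext = ".tfl")
write.table(toy, toy_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = TRUE)

## read the table back; columns are picked by name, not by position
back <- read.delim(toy_file, stringsAsFactors = FALSE)
known <- back[, intersect(c("k", "f", "type"), names(back))]
print(known)
```

Because the columns are selected by name, their order in the file does not matter, and the note column simply drops out.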
If the filename file ends in the extension .gz, .bz2 or .xz, the disk file will automatically be decompressed (read.tfl) or compressed (write.tfl).
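Extension-based compression of this kind can be sketched in base R with the connection functions gzfile, bzfile and xzfile. The helper open_by_extension below is hypothetical and is not taken from zipfR. It only shows one plausible way to choose a connection from the file name.

```r
## hypothetical helper: pick a connection constructor from the file extension
open_by_extension <- function(path, mode) {
  if (grepl("\\.gz$", path)) {
    gzfile(path, mode)
  } else if (grepl("\\.bz2$", path)) {
    bzfile(path, mode)
  } else if (grepl("\\.xz$", path)) {
    xzfile(path, mode)
  } else {
    file(path, mode)
  }
}

## made-up frequency data
freqs <- data.frame(k = 1:3, f = c(31L, 14L, 12L))
gz_path <- tempfile(fileext = ".tfl.gz")

## write through a compressing connection
con <- open_by_extension(gz_path, "wt")
write.table(freqs, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

## read through a decompressing connection
con <- open_by_extension(gz_path, "rt")
roundtrip <- read.delim(con)
close(con)
```

The table on disk is gzip-compressed, yet the calling code reads and writes it like any plain TAB-delimited file.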
The .tfl file format stores neither the values of N and V nor the range of type frequencies explicitly. Therefore, incomplete type frequency lists cannot be fully reconstructed from disk files (and will not even be recognized as such). An attempt to save such a list will trigger a corresponding warning.
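A small base R sketch with made-up frequencies shows why this matters. It assumes that N is the sample size (total token count) and V the vocabulary size (number of distinct types). It uses plain data frames rather than tfl objects, so it illustrates the information loss, not the behaviour of read.tfl.

```r
## hypothetical complete list: frequencies of all five types in a sample
full <- data.frame(k = 1:5, f = c(10L, 4L, 2L, 1L, 1L))
N_full <- sum(full$f)    # sample size of the complete list
V_full <- nrow(full)     # vocabulary size of the complete list

## an incomplete list keeps only the types with f >= 2
partial <- full[full$f >= 2, ]

## a reader that sees only the saved table can just sum and count its rows
N_seen <- sum(partial$f)
V_seen <- nrow(partial)

cat("complete list:                N =", N_full, " V =", V_full, "\n")
cat("recovered from partial table: N =", N_seen, " V =", V_seen, "\n")
```

Nothing in the saved table indicates that the low-frequency types were dropped, so the recomputed values describe only the listed rows.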
read.tfl returns an object of class tfl (see the tfl manpage for details).
library(zipfR)

## save type-frequency list for Brown corpus to external file
fname <- tempfile(fileext=".tfl.gz") # automatically compresses file
write.tfl(Brown.tfl, fname)
## file <fname> contains a compressed TAB-delimited table with fields
##   k ... type ID (usually Zipf rank)
##   f ... frequency of type
##   type ... the type itself (here a word form)

## read it back in
New.tfl <- read.tfl(fname)
## same as Brown.tfl
summary(New.tfl)
summary(Brown.tfl)
print(New.tfl)
print(Brown.tfl)
head(New.tfl)
head(Brown.tfl)
stopifnot(isTRUE(all.equal(New.tfl, Brown.tfl))) # should be identical

## Not run:
## suppose you have a text file with a frequency list, one f per line, e.g.:
##   f
##   14
##   12
##   31
##   ...
## you can import this with read.tfl
MyData.tfl <- read.tfl("mylist.txt")
summary(MyData.tfl)
print(MyData.tfl) # ids in column k added by zipfR
## from this you can generate a spectrum with tfl2spc
MyData.spc <- tfl2spc(MyData.tfl)
summary(MyData.spc)
## End(Not run)