Import documents from a directory into Mallet format
This function takes a directory path as its only argument and returns a data.frame with two columns, id and text, which can be passed to the mallet.import function. The data.frame has one row for each file in Dir.
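The behaviour described above can be approximated in a few lines of base R. This is only a minimal sketch, not the RMallet source: the function name read_dir_sketch, the choice of the file path as the id value, and joining lines with a newline are all assumptions made for illustration.

```r
# Hypothetical helper (not the RMallet implementation): read every file in a
# directory into a data.frame with one row per file.
read_dir_sketch <- function(dir) {
  # Non-recursive list.files() also returns subdirectory names,
  # so keep regular files only.
  paths <- list.files(dir, full.names = TRUE)
  paths <- paths[!file.info(paths)$isdir]

  # Collapse each file's lines into a single string (assumed convention).
  texts <- vapply(
    paths,
    function(p) paste(readLines(p, warn = FALSE), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )

  # Assumption: the file path serves as the document id.
  data.frame(id = paths, text = texts, stringsAsFactors = FALSE)
}

# Usage with a throwaway directory holding two small documents.
demo_dir <- file.path(tempdir(), "read_dir_sketch_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("first document", file.path(demo_dir, "a.txt"))
writeLines(c("second document", "with two lines"), file.path(demo_dir, "b.txt"))

docs <- read_dir_sketch(demo_dir)
```

The resulting docs object has the id and text columns that mallet.import expects as its first two arguments.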
mallet.read.dir(Dir)
Dir | The path to a directory containing one document per file.
This function was contributed to RMallet by Dan Bowen.
## Not run:
documents <- mallet.read.dir(Dir)
mallet.instances <- mallet.import(documents$id, documents$text, "en.txt",
                                  token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")
## End(Not run)