Create a new data.frame for predict

Generate a new data.frame or matrix from another with column(s) selected by x adopting n values in range(data[,x]) and all other columns constant.
If canbeNumeric(x) is TRUE, the output has x adopting n values in range(x), all other numeric variables at their median, and all remaining variables at their most common values.

If canbeNumeric(x) is FALSE, the output has x adopting all possible values of x, with all other variables at the same constant values as when canbeNumeric(x) is TRUE (and n is ignored). If x has a levels attribute, the possible values are defined by that levels attribute. Otherwise, they are defined by unique(x).
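The rule for the possible values of a non-numeric x can be illustrated with a small sketch. The helper name possible_values is hypothetical and is not part of the package; it only mirrors the levels-or-unique rule stated above.

```r
# Hypothetical helper (not from the package): possible values of a
# non-numeric column are its levels if it has any, otherwise unique().
possible_values <- function(v) {
  if (!is.null(levels(v))) levels(v) else unique(v)
}

sex <- ordered(c("M", "F", "M", "F"))  # has a levels attribute
huh <- c("a", "b", "c", "c")           # plain character, no levels

sex_values <- possible_values(sex)  # taken from levels(sex)
huh_values <- possible_values(huh)  # falls back to unique(huh)
```

Note that levels() returns the level labels in level order, which need not be the order in which the values first appear in the data.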
This is designed to create a new data.frame to be used as newdata for predict.
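As a rough illustration of the intended use, the sketch below builds such a newdata frame by hand in base R and passes it to predict. It does not call Newdata itself; the model, the built-in mtcars data and the variable names are assumptions chosen only for the example.

```r
# Fit a simple linear model on the built-in mtcars data (illustrative choice).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hand-built newdata: wt takes n values across its range,
# while the other predictor, hp, is held constant at its median.
n <- 5
nd <- data.frame(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = n),
  hp = median(mtcars$hp)
)

# One prediction per row of nd, showing the effect of wt alone.
pred <- predict(fit, newdata = nd)
```

Holding the other columns constant means the differences between predictions reflect only the change in the selected column.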
Newdata(data, x, n, na.rm=TRUE)
data | a
x | name of a column of
n | an Default is 2 if If If
na.rm |
1. Check data, x.
2. If canbeNumeric(x) is TRUE, let xNew be n values spanning range(x). Else, let xNew <- levels(x).
4. Let newDat <- data[rep(1, n), ], and replace x by xNew.
5. otherVars <- setdiff(colnames(data), x)
6. for(x2 in otherVars) replace newDat[, x2]: If canbeNumeric(x2) is TRUE, use median(x2). Otherwise, use its (first) most common value.
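The steps above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: is_numeric_like is a hypothetical stand-in for canbeNumeric, first_mode breaks ties in sorted order, and newdata_sketch handles only a single column name and repeats the first row length(xNew) times so that the non-numeric case also works.

```r
# Hypothetical stand-in for canbeNumeric(); the real function may differ.
is_numeric_like <- function(v) {
  is.numeric(v) || inherits(v, c("Date", "POSIXct"))
}

# First most common value; ties are resolved in sorted order (an assumption).
first_mode <- function(v) {
  u <- sort(unique(v))
  u[which.max(tabulate(match(v, u), nbins = length(u)))]
}

# Sketch of the documented steps for one column name x.
newdata_sketch <- function(data, x, n = 2) {
  xv <- data[[x]]
  if (is_numeric_like(xv)) {
    r <- range(xv, na.rm = TRUE)
    xNew <- seq(r[1], r[2], length.out = n)   # n values spanning range(x)
  } else if (!is.null(levels(xv))) {
    xNew <- factor(levels(xv), levels = levels(xv), ordered = is.ordered(xv))
  } else {
    xNew <- unique(xv)
  }
  # Repeat the first row once per new value, then overwrite x.
  newDat <- data[rep(1, length(xNew)), , drop = FALSE]
  rownames(newDat) <- NULL
  newDat[[x]] <- xNew
  # Hold every other column at its median or most common value.
  for (x2 in setdiff(colnames(data), x)) {
    v <- data[[x2]]
    newDat[[x2]] <- if (is_numeric_like(v)) median(v, na.rm = TRUE) else first_mode(v)
  }
  newDat
}

xDate <- as.Date("2001-02-03") + 1:4
tstDF <- data.frame(x1 = 1:4, xDate = xDate,
                    sex = ordered(c("M", "F", "M", "F")),
                    huh = letters[c(1:3, 3)], stringsAsFactors = FALSE)
out <- newdata_sketch(tstDF, "xDate", n = 5)
outSex <- newdata_sketch(tstDF, "sex")
```

Here out has five rows with xDate spread evenly over its range, and outSex has one row per level of sex.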
A data.frame with n rows and columns matching those of data, as described above.
Spencer Graves
##
## 1.  A reasonable test with numerics, dates,
##     an ordered factor and character variables
##
xDate <- as.Date('2001-02-03')+1:4
tstDF <- data.frame(x1=1:4, xDate=xDate,
      xD2=as.POSIXct(xDate),
      sex=ordered(c('M', 'F', 'M', 'F')),
      huh=letters[c(1:3, 3)], stringsAsFactors=FALSE)

newDat <- Newdata(tstDF, 'xDate', n=5)
# check
newD <- data.frame(x1=2.5,
      xDate=xDate[1]+seq(0, 3, length=5),
      xD2=as.POSIXct(xDate[2]+0.5),
      sex=ordered(c('M', 'F', 'M', 'F'))[2],
      huh=letters[3], stringsAsFactors=FALSE)
attr(newD, 'out.attrs') <- attr(newDat, 'out.attrs')

all.equal(newDat, newD)

##
## 2.  Test with only one column
##
newDat1 <- Newdata(tstDF[, 2, drop=FALSE], 'xDate', n=5)
# check
newDat1. <- newD[, 2, drop=FALSE]
attr(newDat1., 'out.attrs') <- attr(newDat1, 'out.attrs')

all.equal(newDat1, newDat1.)

##
## 3.  Test with a factor
##
newSex <- Newdata(tstDF, 'sex')
# check
newS <- with(tstDF, data.frame(
      x1=2.5, xDate=xDate[1]+1.5,
      xD2=as.POSIXct(xDate[1]+1.5),
      sex=ordered(c('M', 'F'))[2:1],
      huh=letters[3], stringsAsFactors=FALSE) )
attr(newS, 'out.attrs') <- attr(newSex, 'out.attrs')

all.equal(newSex, newS)

##
## 4.  Test with an integer column number
##
newDat2 <- Newdata(tstDF, 2, n=5)
# check
all.equal(newDat2, newD)

##
## 5.  Test with all
##
NewAll <- Newdata(tstDF)
# check
tstLvls <- as.list(tstDF[c(1, 4), ])
tstLvls$sex <- tstDF$sex[2:1]
tstLvls$huh <- letters[c(3, 1)]
tstLvls$stringsAsFactors <- FALSE
NewA. <- do.call(expand.grid, tstLvls)
attr(NewA., 'out.attrs') <- attr(NewAll, 'out.attrs')

all.equal(NewAll, NewA.)