Fast (Weighted) F-test for Linear Models (with Factors)
fFtest computes an R-squared based F-test for the exclusion of the variables in exc, where the full (unrestricted) model is defined by variables supplied to both exc and X. The test is efficient and designed for cases where both exc and X may contain multiple factors and continuous variables.
fFtest(y, exc, X = NULL, w = NULL, full.df = TRUE, ...)
y | a numeric vector: The dependent variable.
exc | a numeric vector, factor, numeric matrix or list / data frame of numeric vectors and/or factors: Variables to test / exclude.
X | a numeric vector, factor, numeric matrix or list / data frame of numeric vectors and/or factors: Covariates to include in both the restricted (without exc) and unrestricted (with exc) models.
w | numeric. A vector of (frequency) weights.
full.df | logical. If TRUE, a degree of freedom is subtracted for each used factor level; if FALSE, only one degree of freedom is used per factor.
... | other arguments passed to fHDwithin.
Factors and continuous regressors are efficiently projected out using fHDwithin, and the option full.df regulates whether a degree of freedom is subtracted for each used factor level (equivalent to dummy-variable estimator / expanding factors), or only one degree of freedom per factor (treating factors as variables). The test automatically removes missing values and considers only the complete cases of y, exc and X. Unused factor levels in exc and X are dropped.
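The data preparation and the two ways of counting degrees of freedom described above can be illustrated in base R. This is only a sketch on hypothetical toy data (the data frame d and the names k_full and k_vars are made up for illustration); it mimics the behaviour described here with complete.cases, droplevels and model.matrix, and is not the code fFtest runs internally.

```r
# Hypothetical toy data: two missing values and one factor level ("d") that never occurs
d <- data.frame(
  y = c(1.2, 2.3, NA, 4.1, 5.0, 6.2, 7.1, 8.3),
  g = factor(c("a", "b", "a", "b", "a", "b", "a", "c"),
             levels = c("a", "b", "c", "d")),
  x = c(0.5, 1.0, 1.5, NA, 2.5, 3.0, 3.5, 4.0)
)

# Keep only complete cases of y, g and x, then drop unused factor levels
d <- droplevels(d[complete.cases(d), ])
nrow(d)       # rows left after removing the two incomplete cases
nlevels(d$g)  # factor levels actually used

# Counting regressors as lm() would, with the factor expanded to dummies
# (the intercept column is not counted)
k_full <- ncol(model.matrix(~ g + x, data = d)) - 1L

# Counting one degree of freedom per supplied variable, factor or not
k_vars <- ncol(d) - 1L

c(dummy_expanded = k_full, per_variable = k_vars)
```

With many factor levels the two counts diverge quickly, which is why the choice of full.df can matter for the reported F-statistic and P-value.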
Note that an intercept is always added by fHDwithin, so it is not necessary to include an intercept in data supplied to exc / X.
A 3 x 5 numeric matrix of statistics. The columns contain the following statistics:
the R-squared of the model
the numerator degrees of freedom, i.e. the number of variables (k) and used factor levels if full.df = TRUE
the denominator degrees of freedom: N - k - 1
the F-statistic
the corresponding P-value
The rows show these statistics for:
the Full (unrestricted) Model (y ~ exc + X)
the Restricted Model (y ~ X)
the Exclusion Restriction of exc. The R-squared shown is simply the difference between the full and restricted R-squared values, not the R-squared of the model y ~ exc.
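To make the exclusion-restriction row concrete, the following base-R sketch computes the textbook R-squared based F-statistic from two lm fits and compares it with anova(). The helper r2_ftest is hypothetical and not part of any package; it assumes the standard formula F = ((R2_full - R2_rest) / q) / ((1 - R2_full) / (N - k - 1)) and says nothing about how fFtest is implemented internally.

```r
# Hypothetical helper: R-squared based F-test for excluding regressors,
# computed from a full and a restricted lm fit on the same data.
r2_ftest <- function(full, rest) {
  n   <- nobs(full)
  r2f <- summary(full)$r.squared
  r2r <- summary(rest)$r.squared
  kf  <- full$rank - 1L            # estimated coefficients excluding the intercept
  kr  <- rest$rank - 1L
  df1 <- kf - kr                   # number of excluded coefficients (q)
  df2 <- n - kf - 1L               # residual degrees of freedom of the full model
  Fst <- ((r2f - r2r) / df1) / ((1 - r2f) / df2)
  c(R2.diff = r2f - r2r, DF1 = df1, DF2 = df2, F = Fst,
    P = pf(Fst, df1, df2, lower.tail = FALSE))
}

# Test the exclusion of cyl and vs, controlling for hp and carb
full <- lm(mpg ~ cyl + vs + hp + carb, data = mtcars)
rest <- lm(mpg ~ hp + carb, data = mtcars)

res <- r2_ftest(full, rest)
print(res)
print(anova(rest, full))  # the F and Pr(>F) columns should agree with res
```

Because both models share the same total sum of squares, the R-squared form is algebraically identical to the residual-sum-of-squares form used by anova().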
If X = NULL, only a vector of the same 5 statistics testing the model (y ~ exc) is shown.
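Without covariates, the test reduces to the overall F-test of a regression of y on exc. A rough base-R equivalent for a level-seasonality test on AirPassengers is sketched below; the names passengers, month and overall are illustrative, and the labels of the result vector are assumptions rather than the names fFtest returns.

```r
# Overall F-test of y ~ exc from the model R-squared (base-R sketch)
passengers <- as.numeric(AirPassengers)
month      <- factor(cycle(AirPassengers))   # 12 seasonal levels

fit <- lm(passengers ~ month)
n   <- nobs(fit)
k   <- fit$rank - 1L                         # 11 seasonal dummies
r2  <- summary(fit)$r.squared

Fst <- (r2 / k) / ((1 - r2) / (n - k - 1))
overall <- c(R2 = r2, DF1 = k, DF2 = n - k - 1, F = Fst,
             P = pf(Fst, k, n - k - 1, lower.tail = FALSE))
print(overall)

# Same statistic as reported by summary.lm
print(summary(fit)$fstatistic)
```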
## We could use fFtest as a seasonality test:
fFtest(AirPassengers, qF(cycle(AirPassengers)))         # Testing for level-seasonality
fFtest(AirPassengers, qF(cycle(AirPassengers)),         # Seasonality test around a cubic trend
       poly(seq_along(AirPassengers), 3))
fFtest(fdiff(AirPassengers), qF(cycle(AirPassengers)))  # Seasonality in first-difference

## A more classical example with only continuous variables
fFtest(mtcars$mpg, mtcars[c("cyl","vs")], mtcars[c("hp","carb")])

## Now encoding cyl and vs as factors
fFtest(mtcars$mpg, dapply(mtcars[c("cyl","vs")], qF), mtcars[c("hp","carb")])

## Using iris data: A factor and a continuous variable excluded
fFtest(iris$Sepal.Length, iris[4:5], iris[2:3])

## Testing the significance of country-FE in regression of GDP on life expectancy
fFtest(wlddev$PCGDP, wlddev$iso3c, wlddev$LIFEEX)

## Ok, country-FE are significant, what about adding time-FE
fFtest(wlddev$PCGDP, qF(wlddev$year), wlddev[c("iso3c","LIFEEX")])

# Same test done using lm:
data <- na_omit(get_vars(wlddev, c("iso3c","year","PCGDP","LIFEEX")))
full <- lm(PCGDP ~ LIFEEX + iso3c + qF(year), data)
rest <- lm(PCGDP ~ LIFEEX + iso3c, data)
anova(rest, full)