Detailed Statistical Description of Data Frame
descr offers a concise description of each variable in a data frame. It is built as a wrapper around qsu, but by default also computes frequency tables with percentages for categorical variables, and quantiles and the number of distinct values for numeric variables (next to the mean, sd, min, max, skewness and kurtosis computed by qsu).
descr(X, Ndistinct = TRUE, higher = TRUE, table = TRUE,
      Qprobs = c(0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99),
      cols = NULL, label.attr = "label", ...)

## S3 method for class 'descr'
print(x, n = 7, perc = TRUE, digits = 2, t.table = TRUE, summary = TRUE, ...)

## S3 method for class 'descr'
as.data.frame(x, ...)
X | a data frame or list of atomic vectors. Atomic vectors, matrices or arrays can be passed but will first be coerced to data frame using
Ndistinct | logical.
higher | logical. Argument is passed down to
table | logical.
Qprobs | double. Probabilities for quantiles to compute on numeric variables, passed down to
cols | select columns to describe using column names, indices, a logical vector or a function (i.e.
label.attr | character. The name of a label attribute to display for each variable (if variables are labeled).
... | other arguments passed to
x | an object of class 'descr'.
n | integer. The number of first and last entries to display of the table computed for categorical variables. If the number of distinct elements is
perc | logical.
digits | integer. The number of decimals to print in statistics and percentage tables.
t.table | logical.
summary | logical.
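As a brief illustration of the arguments above, the following sketch restricts the description to numeric columns and requests only the quartiles. It assumes that descr is provided by the collapse package (the package is not named on this page) and uses the built-in iris data; the column selection and probabilities are arbitrary choices for illustration.

```r
library(collapse)  # assumption: descr() is provided by the collapse package

# Describe only the numeric columns of iris and compute the
# quartiles instead of the default set of quantiles
d <- descr(iris, cols = is.numeric, Qprobs = c(0.25, 0.5, 0.75))

# Print the result with one decimal instead of the default two
print(d, digits = 1)
```

Passing a function to cols applies it to each column and keeps those for which it returns TRUE, so the factor column Species is dropped here.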
qsu itself is still about 10x faster than descr, and is optimized for grouped, panel-data and weighted statistics. Grouped, panel-data and/or weighted statistics can also be computed with descr by passing group-ids to g, panel-ids to pid or a weight vector to w. These arguments are handed down to qsu.default and only affect the statistics natively computed by qsu; for example, passing a weight vector produces a weighted mean, sd, skewness and kurtosis but not weighted quantiles.
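A minimal sketch of passing a weight vector, assuming descr comes from the collapse package and behaves as described on this page. The weights are random numbers generated purely for illustration, and the name w_demo is hypothetical.

```r
library(collapse)  # assumption: descr() is provided by the collapse package

set.seed(1)
w_demo <- runif(nrow(iris))  # hypothetical weights, for illustration only

# Unweighted and weighted descriptions of the numeric columns
d_unweighted <- descr(iris, cols = is.numeric)
d_weighted   <- descr(iris, cols = is.numeric, w = w_demo)

# Compare the printed statistics of the two objects
print(d_unweighted)
print(d_weighted)
```

Comparing the two printouts side by side shows which statistics respond to the weights.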
The list-object returned from descr can be converted to a tidy data frame using as.data.frame. This representation will not include frequency tables computed for categorical variables, and the method cannot handle arrays of statistics (applicable when g or pid arguments are passed to descr; in that case as.data.frame.descr will throw an appropriate error).
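The conversion can be sketched as follows, again assuming descr is provided by the collapse package. The exact columns of the resulting data frame are not specified on this page, so the example only inspects them.

```r
library(collapse)  # assumption: descr() is provided by the collapse package

# Describe all columns, then convert the result to a data frame;
# frequency tables for categorical variables are not carried over
d  <- descr(iris)
df <- as.data.frame(d)

# Inspect the structure of the converted object
str(df)
```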
A 2-level nested list: the top level contains one element per variable, each of which is itself a list containing the class, the label, the basic statistics and the quantiles or tables computed for that variable. The object is given a class 'descr' and also has the number of observations in the dataset attached as an 'N' attribute, as well as an attribute 'arstat' indicating whether the object contains arrays of statistics, and an attribute 'table' indicating whether table = TRUE (i.e. the object could contain tables for categorical variables).
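The structure of the returned object can be explored with base R tools. This is a sketch assuming descr is provided by the collapse package; the attribute names 'N', 'arstat' and 'table' are taken from the description above.

```r
library(collapse)  # assumption: descr() is provided by the collapse package

d <- descr(iris)

class(d)           # the 'descr' class
names(d)           # one element per described variable
attr(d, "N")       # number of observations in the dataset
attr(d, "arstat")  # whether the object contains arrays of statistics
attr(d, "table")   # whether table = TRUE was used

# Look at the components stored for a single variable
str(d[["Sepal.Length"]], max.level = 1)
```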
## Standard Use
descr(iris)
descr(wlddev)
descr(GGDC10S)
as.data.frame(descr(wlddev))

## Passing Arguments down to qsu: For Panel Data Statistics
descr(iris, pid = iris$Species)
descr(wlddev, pid = wlddev$iso3c)

## Grouped Statistics
descr(iris, g = iris$Species)
descr(GGDC10S, g = GGDC10S$Region)