graphical comparison of the estimated distributions for the same continuous variable.
Compares graphically the estimated distributions for the same continuous variable using data coming from two different data sources.
plotCont(data.A, data.B, xlab.A, xlab.B=NULL, w.A=NULL, w.B=NULL,
         type="density")
data.A |
A dataframe or matrix containing the variable of interest |
data.B |
A dataframe or matrix containing the variable of interest |
xlab.A |
Character string providing the name of the variable in |
xlab.B |
Character string providing the name of the variable in |
w.A |
Character string providing the name of the optional weighting variable in |
w.B |
Character string providing the name of the optional weighting variable in |
type |
A character string indicating the type of graphical output that should be used to compare the estimated distributions of |
This function graphically compares the distribution of the same variable as estimated from data coming from two different data sources. The graphical comparison can be done in four different ways. When type="density" the density plots are drawn; when available, the weights are used in the estimation of the density.
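The weighted density comparison can be illustrated with a minimal base R sketch. It is not the code used by plotCont, which draws with ggplot2. The simulated vectors x.A and x.B and the weights w.A are hypothetical, and the bandwidth is taken from the unweighted data as a simplifying assumption.

```r
# Hypothetical data: the same variable observed in two data sources.
set.seed(1)
x.A <- rnorm(200, mean = 40, sd = 10)
x.B <- rnorm(150, mean = 42, sd = 12)
w.A <- runif(200, min = 1, max = 5)   # hypothetical survey weights for source A

# stats::density() expects weights that sum to 1, so rescale them first.
# Assumption: the bandwidth is chosen from the unweighted data.
d.A <- density(x.A, bw = bw.nrd0(x.A), weights = w.A / sum(w.A))
d.B <- density(x.B)                   # no weights available for source B

# Overlay the two estimated densities on common axes.
plot(d.A, col = "red", main = "Estimated densities",
     xlim = range(d.A$x, d.B$x), ylim = range(0, d.A$y, d.B$y))
lines(d.B, col = "blue", lty = 2)
legend("topright", legend = c("A (weighted)", "B"),
       col = c("red", "blue"), lty = c(1, 2))
```

Curves that nearly overlap suggest similar estimated distributions in the two sources.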
The comparison is based on percentiles when type="qqplot" or type="qqshift". In the first case, the function draws a scatterplot (red dots) of the estimated percentiles of xlab.A vs. those of xlab.B; the dashed line indicates the ideal situation of equal percentiles (points lying on the line). When type="qqshift", the scatterplot shows (percentiles.A - percentiles.B) vs. percentiles.A; in this case, points lying on the horizontal line through 0 indicate equality (difference equal to 0). Note that the number of estimated percentiles depends on the smaller of the two sample sizes: only quartiles are calculated when min(n.A, n.B)<20; deciles are estimated when min(n.A, n.B)>=20 and min(n.A, n.B)<=30; finally, quantiles for probs=seq(from = 0.05,to = 0.95,by = 0.05) are estimated when min(n.A, n.B)>30. When survey weights are available (indicated through w.A and/or w.B), they are used in estimating the quantiles by calling the function wtd.quantile in the package Hmisc.
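The rule for choosing the percentiles, and the quantities plotted by the two percentile-based types, can be sketched as follows. The helper choose_probs is hypothetical and not part of the package. "Quartiles" is assumed to mean the probabilities 0.25, 0.50 and 0.75. The sketch covers only the unweighted case, using stats::quantile.

```r
# Hypothetical helper: select the probabilities from the smaller sample size.
choose_probs <- function(n.A, n.B) {
  n.min <- min(n.A, n.B)
  if (n.min < 20) {
    c(0.25, 0.50, 0.75)                      # quartiles (assumed probabilities)
  } else if (n.min <= 30) {
    seq(from = 0.1, to = 0.9, by = 0.1)      # deciles
  } else {
    seq(from = 0.05, to = 0.95, by = 0.05)   # steps of 5%
  }
}

# Hypothetical data: source B has only 25 observations, so deciles are used.
set.seed(1)
x.A <- rnorm(200, mean = 40, sd = 10)
x.B <- rnorm(25, mean = 42, sd = 12)

probs <- choose_probs(length(x.A), length(x.B))
q.A <- quantile(x.A, probs = probs)
q.B <- quantile(x.B, probs = probs)

# Quantities behind type="qqplot": percentiles of A paired with those of B.
qq <- data.frame(percentiles.A = q.A, percentiles.B = q.B)

# Quantities behind type="qqshift": the difference against the percentiles of A.
qqshift <- data.frame(percentiles.A = q.A, shift = q.A - q.B)
print(qqshift)
```

Values of shift close to 0 across all rows correspond to points lying near the horizontal reference line.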
Finally, when type="hist" the continuous variable is categorized and the corresponding histograms, estimated from data.A and data.B, are compared. Also in this case, when present, the weights are used in estimating the relative frequencies.
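A rough base R sketch of the histogram comparison is given below. The text does not say how the classes are formed, so the ten equal-width classes over the pooled range are purely an assumption. The data and weights are hypothetical.

```r
# Hypothetical data and weights.
set.seed(1)
x.A <- rnorm(200, mean = 40, sd = 10)
x.B <- rnorm(150, mean = 42, sd = 12)
w.A <- runif(200, min = 1, max = 5)   # hypothetical survey weights for source A
w.B <- rep(1, length(x.B))            # equal weights when none are supplied

# Assumption: 10 equal-width classes shared by both sources.
breaks <- seq(min(x.A, x.B), max(x.A, x.B), length.out = 11)
c.A <- cut(x.A, breaks = breaks, include.lowest = TRUE)
c.B <- cut(x.B, breaks = breaks, include.lowest = TRUE)

# Weighted relative frequencies per class (empty classes count as 0).
f.A <- as.numeric(xtabs(w.A ~ c.A)) / sum(w.A)
f.B <- as.numeric(xtabs(w.B ~ c.B)) / sum(w.B)

freq <- rbind(A = f.A, B = f.B)
colnames(freq) <- levels(c.A)

# Side-by-side bars, one pair per class.
barplot(freq, beside = TRUE, las = 2, col = c("red", "blue"),
        legend.text = TRUE, ylab = "Relative frequency")
```

Using the same breaks for both sources keeps the classes comparable bar by bar.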
The required graphical representation is drawn using the ggplot2 facilities.
Marcello D'Orazio mdo.statmatch@gmail.com
# plotCont(data.A = samp.A, data.B = samp.B, xlab.A="age")
# plotCont(data.A = samp.A, data.B = samp.B, xlab.A="age", w.A = "ww")