Response probability using logistic regression
Computes the response probabilities using logistic regression for non-response adjustment. For stratified sampling, the same logistic model is used for all strata.
rmodel(formula,weights,X)
formula    the regression model formula (y~x).
weights    vector of the weights; its length is equal to n, the sample size.
X          the sample data frame.
The function returns the sample data frame with a new column 'prob_resp', which contains the response probabilities.
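As a rough illustration of the idea, a weighted logistic regression of the response indicator on the covariates can be fitted with base R's glm(), and the fitted values stored in a 'prob_resp' column. This is only a sketch under assumptions, not the source of rmodel(): the helper name response_prob_sketch, the temporary column name and the toy data are hypothetical, and the quasibinomial family is used here merely to avoid warnings about non-integer weights (it gives the same fitted probabilities as the binomial family).

```r
# Hypothetical helper: a sketch of the idea, NOT the implementation of rmodel().
response_prob_sketch <- function(formula, weights, X) {
  # Store the weights as a column so that glm() can find them in 'data'.
  X$.design_w <- weights
  fit <- glm(formula,
             family  = quasibinomial(link = "logit"),
             data    = X,
             weights = .design_w)
  # Fitted values of a logistic model are the estimated probabilities.
  X$prob_resp <- as.numeric(fitted(fit))
  X$.design_w <- NULL
  X
}

# Toy data (made up for illustration only).
set.seed(1)
toy <- data.frame(income = 1000 * runif(40),
                  status = rbinom(40, 1, 0.6))
out <- response_prob_sketch(status ~ income, weights = rep(5, 40), X = toy)
head(out)
```

The weights are attached as a column rather than passed directly because glm() evaluates its weights argument in the data frame first; this avoids scoping surprises when the call is wrapped in a function.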
# Example from An and Watts (New SAS procedures for Analysis of Sample Survey Data)
# strata(), getdata() and rmodel() come from the 'sampling' package
library(sampling)
# generates artificial data (a 235X3 matrix with 3 columns: state, region, income).
# the variable "state" has 2 categories ('nc' and 'sc').
# the variable "region" has 3 categories (1, 2 and 3).
# the sampling frame is stratified by region within state.
# the income variable is randomly generated
data=rbind(matrix(rep("nc",165),165,1,byrow=TRUE),matrix(rep("sc",70),70,1,byrow=TRUE))
data=cbind.data.frame(data,c(rep(1,100), rep(2,50), rep(3,15), rep(1,30),rep(2,40)),
1000*runif(235))
names(data)=c("state","region","income")
# computes the population stratum sizes
table(data$region,data$state)
# not run
#     nc  sc
#  1 100  30
#  2  50  40
#  3  15   0
# there are 5 cells with non-zero values; one draws 5 samples (1 sample in each stratum)
# the sample stratum sizes are 10,5,10,4,6, respectively
# the method is 'srswor' (equal probability, without replacement)
s=strata(data,c("region","state"),size=c(10,5,10,4,6), method="srswor")
# extracts the observed data
x=getdata(data,s)
# generates randomly the 'status' column (1 - respondent, 0 - nonrespondent)
status=round(runif(nrow(x)))
x=cbind(x,status)
# computes the response probabilities
rmodel(x$status~x$income+x$Stratum,weights=1/x$Prob,x)
# the same example without stratification
rmodel(x$status~x$income,weights=1/x$Prob,x)
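The description says the probabilities are intended for non-response adjustment. One common way to use them, shown here only as a hedged sketch, is to divide each respondent's design weight by the estimated response probability. The data frame x_out below is a made-up stand-in for the output of rmodel(), and the column name w_adj is hypothetical.

```r
# Toy stand-in for the data frame returned by rmodel(); all values are made up.
x_out <- data.frame(
  status    = c(1, 0, 1, 1, 0),            # 1 - respondent, 0 - nonrespondent
  Prob      = c(0.1, 0.1, 0.2, 0.2, 0.25), # inclusion probabilities
  prob_resp = c(0.5, 0.4, 0.8, 0.6, 0.5)   # estimated response probabilities
)

# Keep the respondents only and inflate their design weights (1/Prob)
# by the inverse of the estimated response probability.
resp <- x_out[x_out$status == 1, ]
resp$w_adj <- (1 / resp$Prob) / resp$prob_resp
resp
```

Respondents with a low estimated response probability receive a larger adjusted weight, so they also stand in for similar units that did not respond.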