Estimation of the Population Total under Two Stage Simple Random Sampling Without Replacement
Computes the Horvitz-Thompson estimator of the population total according to a 2SI sampling design
E.2SI(NI, nI, Ni, ni, y, PSU)
NI |
Population size of Primary Sampling Units |
nI |
Sample size of Primary Sampling Units |
Ni |
Vector of population sizes of Secundary Sampling Units selected in the first draw |
ni |
Vector of sample sizes of Secundary Sampling Units |
y |
Vector, matrix or data frame containig the recollected information of the variables of interest for every unit in the selected sample |
PSU |
Vector identifying the membership to the strata of each unit in the population |
Returns the estimation of the population total of every single variable of interest, its estimated standard error and its estimated coefficient of variation
The function returns a data matrix whose columns correspond to the estimated parameters of the variables of interest
Hugo Andres Gutierrez Rojas hagutierrezro@gmail.com
Sarndal, C-E. and Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling. Springer.
Gutierrez, H. A. (2009), Estrategias de muestreo: Dise?o de encuestas y estimacion de parametros.
Editorial Universidad Santo Tomas.
############ ## Example 1 ############ # Uses Lucy data to draw a twostage simple random sample # accordind to a 2SI design. Zone is the clustering variable data(Lucy) attach(Lucy) summary(Zone) # The population of clusters or Primary Sampling Units UI<-c("A","B","C","D","E") NI <- length(UI) # The sample size is nI=3 nI <- 3 # Selects the sample of PSUs samI<-S.SI(NI,nI) dataI<-UI[samI] dataI # The sampling frame of Secondary Sampling Unit is saved in Lucy1 ... Lucy3 Lucy1<-Lucy[which(Zone==dataI[1]),] Lucy2<-Lucy[which(Zone==dataI[2]),] Lucy3<-Lucy[which(Zone==dataI[3]),] # The size of every single PSU N1<-dim(Lucy1)[1] N2<-dim(Lucy2)[1] N3<-dim(Lucy3)[1] Ni<-c(N1,N2,N3) # The sample size in every PSI is 135 Secondary Sampling Units n1<-135 n2<-135 n3<-135 ni<-c(n1,n2,n3) # Selects a sample of Secondary Sampling Units inside the PSUs sam1<-S.SI(N1,n1) sam2<-S.SI(N2,n2) sam3<-S.SI(N3,n3) # The information about each Secondary Sampling Unit in the PSUs # is saved in data1 ... data3 data1<-Lucy1[sam1,] data2<-Lucy2[sam2,] data3<-Lucy3[sam3,] # The information about each unit in the final selected sample is saved in data data<-rbind(data1, data2, data3) attach(data) # The clustering variable is Zone Cluster <- as.factor(as.integer(Zone)) # The variables of interest are: Income, Employees and Taxes # This information is stored in a data frame called estima estima <- data.frame(Income, Employees, Taxes) # Estimation of the Population total E.2SI(NI,nI,Ni,ni,estima,Cluster) ######################################################## ## Example 2 Total Census to the entire population ######################################################## # Uses Lucy data to draw a cluster random sample # accordind to a SI design ... # Zone is the clustering variable data(Lucy) attach(Lucy) summary(Zone) # The population of clusters UI<-c("A","B","C","D","E") NI <- length(UI) # The sample size equals to the population size of PSU nI <- NI # Selects every single PSU samI<-S.SI(NI,nI) dataI<-UI[samI] dataI # The sampling frame of Secondary Sampling Unit is saved in Lucy1 ... Lucy5 Lucy1<-Lucy[which(Zone==dataI[1]),] Lucy2<-Lucy[which(Zone==dataI[2]),] Lucy3<-Lucy[which(Zone==dataI[3]),] Lucy4<-Lucy[which(Zone==dataI[4]),] Lucy5<-Lucy[which(Zone==dataI[5]),] # The size of every single PSU N1<-dim(Lucy1)[1] N2<-dim(Lucy2)[1] N3<-dim(Lucy3)[1] N4<-dim(Lucy4)[1] N5<-dim(Lucy5)[1] Ni<-c(N1,N2,N3,N4,N5) # The sample size of Secondary Sampling Units equals to the size of each PSU n1<-N1 n2<-N2 n3<-N3 n4<-N4 n5<-N5 ni<-c(n1,n2,n3,n4,n5) # Selects every single Secondary Sampling Unit inside the PSU sam1<-S.SI(N1,n1) sam2<-S.SI(N2,n2) sam3<-S.SI(N3,n3) sam4<-S.SI(N4,n4) sam5<-S.SI(N5,n5) # The information about each unit in the cluster is saved in Lucy1 ... Lucy5 data1<-Lucy1[sam1,] data2<-Lucy2[sam2,] data3<-Lucy3[sam3,] data4<-Lucy4[sam4,] data5<-Lucy5[sam5,] # The information about each Secondary Sampling Unit # in the sample (census) is saved in data data<-rbind(data1, data2, data3, data4, data5) attach(data) # The clustering variable is Zone Cluster <- as.factor(as.integer(Zone)) # The variables of interest are: Income, Employees and Taxes # This information is stored in a data frame called estima estima <- data.frame(Income, Employees, Taxes) # Estimation of the Population total E.2SI(NI,nI,Ni,ni,estima,Cluster) # Sampling error is null
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.