Simulation of SNP data
Simulates SNP data, where a specified proportion of cases and controls is explained by a specified set of SNP interactions. Can also be used to simulate a data set with a multi-categorical response, i.e., a data set in which the cases are divided into several classes (e.g., different diseases or subtypes of a disease).
simulateSNPs(n.obs, n.snp, vec.ia, prop.explain = 1, list.ia.val = NULL, vec.ia.num = NULL, vec.cat = NULL, maf = c(0.1, 0.4), prob.val = rep(1/3, 3), list.equal = NULL, prob.equal = 0.8, rm.redundancy = TRUE, shuffle = FALSE, shuffle.obs = FALSE, rand = NA)
n.obs |
either an integer specifying the total number of
observations, or a vector of length 2 specifying the number
of cases and the number of controls. If |
n.snp |
integer specifying the number of SNPs. |
vec.ia |
a vector of integers specifying the orders of the interactions
that explain the cases. |
prop.explain |
either an integer or a vector of |
list.ia.val |
a list of |
vec.ia.num |
a vector of |
vec.cat |
a vector of the same length as |
maf |
either an integer, or a vector of length 2 or |
prob.val |
a vector consisting of the probabilities for drawing a 0, 1, or 2,
if |
list.equal |
list of same structure as |
prob.equal |
a numeric value specifying the probability that a 1 is drawn when generating
|
rm.redundancy |
should redundant SNPs be removed from the explaining interactions?
It is possible to specify an explanatory i-way interaction for which an interaction
between (i-1) of its variables already explains all the cases (and controls)
that the i-way interaction should explain. In this case, the redundant SNP is removed if |
shuffle |
logical. By default, the first |
shuffle.obs |
should the observations be shuffled? |
rand |
integer. Sets the random number generator to a reproducible state. |
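The redundancy described for rm.redundancy can be checked by hand. The following is a minimal base-R sketch, not the package's internal code: the toy genotype matrix `geno` and the variables `three.way`, `two.way`, and `snp3.redundant` are hypothetical names introduced only for illustration.

```r
# Hypothetical toy genotype matrix: 6 observations, 3 SNPs coded 0/1/2.
geno <- cbind(SNP1 = c(2, 2, 2, 0, 1, 0),
              SNP2 = c(1, 2, 1, 1, 0, 2),
              SNP3 = c(1, 1, 1, 1, 1, 1))

# Observations showing the three-way interaction
# (SNP1 == 2) & (SNP2 != 0) & (SNP3 == 1).
three.way <- geno[, "SNP1"] == 2 & geno[, "SNP2"] != 0 & geno[, "SNP3"] == 1

# Observations showing the two-way interaction that omits SNP3.
two.way <- geno[, "SNP1"] == 2 & geno[, "SNP2"] != 0

# In this toy data, SNP3 is redundant if both interactions
# select exactly the same observations.
snp3.redundant <- identical(three.way, two.way)
snp3.redundant
```

Here SNP3 takes the value 1 for every observation, so adding it to the two-way interaction cannot change which observations are selected.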
An object of class simulatedSNPs
composed of
data |
a matrix with |
cl |
a vector of length |
tab.explain |
a table naming the explanatory interactions and the numbers of cases and controls explained by them. |
ia |
character vector naming the interactions. |
maf |
vector of length |
Currently, the genotypes of all SNPs are simulated independently from each other (except for the SNPs that belong to the same explanatory interaction).
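Independent simulation of genotypes can be pictured with the following base-R sketch. It assumes Hardy-Weinberg equilibrium, so that a genotype is the number of minor alleles in two independent draws. This is an illustrative assumption and not a description of what simulateSNPs does internally; `n.obs`, `maf`, and `geno` are local variables here, not calls into the package.

```r
set.seed(123)                      # reproducible draws
n.obs <- 1000                      # number of observations
maf <- c(0.1, 0.25, 0.4)           # one minor allele frequency per SNP

# Each column is drawn independently of the others: a genotype in
# {0, 1, 2} is modelled as Binomial(2, maf) (assumption: Hardy-Weinberg).
geno <- sapply(maf, function(p) rbinom(n.obs, size = 2, prob = p))
colnames(geno) <- paste0("SNP", seq_along(maf))

# Empirical minor allele frequencies should be close to maf.
colMeans(geno) / 2
```

Because the columns are generated separately, any correlation between SNPs in such data arises by chance only.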
Holger Schwender holger.schwender@udo.edu
## Not run: 
# Simulate a data set containing 2000 observations (1000 cases
# and 1000 controls) and 50 SNPs, where one three-way and two
# two-way interactions are chosen randomly to be explanatory
# for the case-control status.

sim1 <- simulateSNPs(2000, 50, c(3, 2, 2))
sim1

# Simulate data of 1200 cases and 800 controls for 50 SNPs,
# where 90% of the observations showing a randomly chosen
# three-way interaction are cases, and 95% of the observations
# showing a randomly chosen two-way interaction are cases.

sim2 <- simulateSNPs(c(1200, 800), 50, c(3, 2),
  prop.explain = c(0.9, 0.95))
sim2

# Simulate a data set consisting of 1000 observations and 50 SNPs,
# where the minor allele frequency of each SNP is 0.25, and
# the interactions
# ((SNP1 == 2) & (SNP2 != 0) & (SNP3 == 1)) and
# ((SNP4 == 0) & (SNP5 != 2))
# are explanatory for 200 and 250 of the 500 cases, respectively,
# and for none of the 500 controls.

list1 <- list(c(2, 0, 1), c(0, 2))
list2 <- list(c(1, 0, 1), c(1, 0))
sim3 <- simulateSNPs(1000, 50, c(3, 2), list.ia.val = list1,
  list.equal = list2, vec.ia.num = c(200, 250), maf = 0.25)

## End(Not run)
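To make the interaction notation in the comments above concrete, the following base-R sketch counts how many cases and controls show the interaction ((SNP4 == 0) & (SNP5 != 2)) in a small hand-made data set. The matrix `geno`, the label vector `cl`, and the coding 1 = case, 0 = control are assumptions made for illustration; they are not taken from the output of simulateSNPs.

```r
# Hypothetical toy data: 8 observations, 2 SNPs coded 0/1/2.
geno <- cbind(SNP4 = c(0, 0, 0, 1, 2, 0, 1, 0),
              SNP5 = c(0, 1, 2, 0, 1, 1, 0, 0))

# Assumed class coding for this sketch: 1 = case, 0 = control.
cl <- c(1, 1, 1, 1, 0, 0, 0, 0)

# Logical vector marking observations that show the interaction.
shows.ia <- geno[, "SNP4"] == 0 & geno[, "SNP5"] != 2

# Numbers of cases and controls showing the interaction.
n.cases <- sum(shows.ia & cl == 1)
n.controls <- sum(shows.ia & cl == 0)

# Cross-tabulation of interaction status against class label.
table(interaction = shows.ia, class = cl)
```

An interaction that is explanatory for the cases only would show a zero count among the controls in such a table.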