Build a Mixture Hidden Markov Model
Function build_mhmm constructs a mixture hidden Markov model object of class mhmm.
build_mhmm(observations, n_states, transition_probs, emission_probs, initial_probs, formula, data, coefficients, cluster_names = NULL, state_names = NULL, channel_names = NULL, ...)
observations |
An |
n_states |
A numerical vector giving the number of hidden states in each submodel
(not used if starting values for model parameters are given with
|
transition_probs |
A list of matrices of transition probabilities for the submodel of each cluster. |
emission_probs |
A list which contains matrices of emission probabilities or
a list of such objects (one for each channel) for the submodel of each cluster.
Note that the matrices must have dimensions m x s where m is the number of
hidden states and s is the number of unique symbols (observed states) in the
data. Emission probabilities should follow the ordering of the alphabet of
observations ( |
initial_probs |
A list which contains vectors of initial state probabilities for the submodel of each cluster. |
formula |
Covariates as an object of class |
data |
An optional data frame, list or environment containing the variables
in the model. If not found in data, the variables are taken from
|
coefficients |
An optional k x l matrix of regression coefficients for time-constant covariates for mixture probabilities, where l is the number of clusters and k is the number of covariates. A logit-link is used for mixture probabilities. The first column is set to zero. |
cluster_names |
A vector of optional names for the clusters. |
state_names |
A list of optional labels for the hidden states. If |
channel_names |
A vector of optional names for the channels. |
... |
Additional arguments to |
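As a rough illustration of how the coefficients argument relates to mixture probabilities, the sketch below reads the logit link as a multinomial logit with the first cluster as the reference (its column fixed to zero). The design matrix X, the covariate name female, and the coefficient values are hypothetical and chosen only for illustration. This is base R, not a call into seqHMM.

```r
# Hypothetical design matrix: intercept plus one binary covariate (k = 2),
# for three subjects.
X <- cbind(intercept = 1, female = c(0, 1, 1))

# Hypothetical k x l coefficient matrix for l = 3 clusters.
# The first column is fixed to zero (reference cluster).
coefs <- cbind(c(0, 0), c(0.5, -1.0), c(-0.3, 0.8))

# Multinomial logit: exponentiate the linear predictors and
# normalise each row so that the probabilities sum to one.
eta <- X %*% coefs
mixture_probs <- exp(eta) / rowSums(exp(eta))
mixture_probs
```

Each row of mixture_probs gives one subject's prior probabilities of belonging to each of the three clusters.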
The returned model contains some attributes such as nobs and df, which define the number of observations in the model and the number of estimable model parameters, used in computing BIC.
When computing nobs for a multichannel model with C channels, each observed value in a single channel amounts to 1/C observation, i.e. a fully observed time point for a single sequence amounts to one observation.
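The 1/C rule can be sketched in base R on toy data. The matrices channel_a and channel_b are hypothetical, NA is assumed to mark a missing value, and the code only illustrates the counting rule described above. It is not the package's internal implementation.

```r
# Hypothetical toy data: 2 sequences, 3 time points, C = 2 channels.
# NA marks a missing observation.
channel_a <- matrix(c("x", "y", NA,
                      "x", "x", "y"), nrow = 2, byrow = TRUE)
channel_b <- matrix(c("u", NA, NA,
                      "u", "v", "v"), nrow = 2, byrow = TRUE)
channels <- list(channel_a, channel_b)
n_channels <- length(channels)

# Count observed (non-missing) values over all channels.
observed_values <- sum(vapply(channels, function(ch) sum(!is.na(ch)), integer(1)))

# Each observed value contributes 1 / C observation.
nobs_sketch <- observed_values / n_channels
nobs_sketch
```

Here 9 values are observed across the two channels, so the sketch yields 4.5 observations. A time point observed in both channels counts as one full observation.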
For the degrees of freedom df, zero probabilities of the initial model are defined as structural zeroes.
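To see what structural zeroes imply for the parameter count, the sketch below counts free parameters in a single transition matrix. It assumes that each row contributes its number of nonzero entries minus one, because the row must sum to one. The helper name free_per_row is hypothetical, and the package's own df computation may differ in detail.

```r
# Transition matrix with structural zeroes below the diagonal
# (same values as A1 in the examples).
A1 <- matrix(
  c(0.80, 0.16, 0.03, 0.01,
    0,    0.90, 0.07, 0.03,
    0,    0,    0.90, 0.10,
    0,    0,    0,    1),
  nrow = 4, ncol = 4, byrow = TRUE)

# Assumption: zero entries stay fixed at zero, and each row loses one
# degree of freedom to the sum-to-one constraint.
free_per_row <- rowSums(A1 > 0) - 1
n_free_transition <- sum(free_per_row)
n_free_transition
```

Under this assumption, A1 contributes 3 + 2 + 1 + 0 = 6 estimable transition parameters rather than the 12 of an unrestricted 4 x 4 transition matrix.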
Object of class mhmm with the following elements:
observations
State sequence object or a list of such containing the data.
transition_probs
A matrix of transition probabilities.
emission_probs
A matrix or a list of matrices of emission probabilities.
initial_probs
A vector of initial probabilities.
coefficients
A matrix of parameter coefficients for covariates (covariates in rows, clusters in columns).
X
Covariate values for each subject.
cluster_names
Names for clusters.
state_names
Names for hidden states.
symbol_names
Names for observed states.
channel_names
Names for channels of sequence data.
length_of_sequences
(Maximum) length of sequences.
n_sequences
Number of sequences.
n_symbols
Number of observed states (in each channel).
n_states
Number of hidden states.
n_channels
Number of channels.
n_covariates
Number of covariates.
n_clusters
Number of clusters.
Helske S. and Helske J. (2019). Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R, Journal of Statistical Software, 88(3), 1-32. doi:10.18637/jss.v088.i03
fit_model for fitting mixture hidden Markov models; summary.mhmm for a summary of a MHMM; separate_mhmm for reorganizing a MHMM into a list of separate hidden Markov models; and plot.mhmm for plotting mhmm objects.
library(seqHMM) # provides build_mhmm and the biofam3c data

data("biofam3c")

## Building sequence objects
marr_seq <- seqdef(biofam3c$married, start = 15,
  alphabet = c("single", "married", "divorced"))
child_seq <- seqdef(biofam3c$children, start = 15,
  alphabet = c("childless", "children"))
left_seq <- seqdef(biofam3c$left, start = 15,
  alphabet = c("with parents", "left home"))

## Choosing colors
attr(marr_seq, "cpal") <- c("#AB82FF", "#E6AB02", "#E7298A")
attr(child_seq, "cpal") <- c("#66C2A5", "#FC8D62")
attr(left_seq, "cpal") <- c("#A6CEE3", "#E31A1C")

## MHMM with random starting values, no covariates
set.seed(468)
init_mhmm_bf1 <- build_mhmm(
  observations = list(marr_seq, child_seq, left_seq),
  n_states = c(4, 4, 6),
  channel_names = c("Marriage", "Parenthood", "Residence"))

## Starting values for emission probabilities

# Cluster 1
B1_marr <- matrix(
  c(0.8, 0.1, 0.1, # High probability for single
    0.8, 0.1, 0.1,
    0.3, 0.6, 0.1, # High probability for married
    0.3, 0.3, 0.4), # High probability for divorced
  nrow = 4, ncol = 3, byrow = TRUE)

B1_child <- matrix(
  c(0.9, 0.1, # High probability for childless
    0.9, 0.1,
    0.9, 0.1,
    0.9, 0.1),
  nrow = 4, ncol = 2, byrow = TRUE)

B1_left <- matrix(
  c(0.9, 0.1, # High probability for living with parents
    0.1, 0.9, # High probability for having left home
    0.1, 0.9,
    0.1, 0.9),
  nrow = 4, ncol = 2, byrow = TRUE)

# Cluster 2
B2_marr <- matrix(
  c(0.8, 0.1, 0.1, # High probability for single
    0.8, 0.1, 0.1,
    0.1, 0.8, 0.1, # High probability for married
    0.7, 0.2, 0.1),
  nrow = 4, ncol = 3, byrow = TRUE)

B2_child <- matrix(
  c(0.9, 0.1, # High probability for childless
    0.9, 0.1,
    0.9, 0.1,
    0.1, 0.9),
  nrow = 4, ncol = 2, byrow = TRUE)

B2_left <- matrix(
  c(0.9, 0.1, # High probability for living with parents
    0.1, 0.9,
    0.1, 0.9,
    0.1, 0.9),
  nrow = 4, ncol = 2, byrow = TRUE)

# Cluster 3
B3_marr <- matrix(
  c(0.8, 0.1, 0.1, # High probability for single
    0.8, 0.1, 0.1,
    0.8, 0.1, 0.1,
    0.1, 0.8, 0.1, # High probability for married
    0.3, 0.4, 0.3,
    0.1, 0.1, 0.8), # High probability for divorced
  nrow = 6, ncol = 3, byrow = TRUE)

B3_child <- matrix(
  c(0.9, 0.1, # High probability for childless
    0.9, 0.1,
    0.5, 0.5,
    0.5, 0.5,
    0.5, 0.5,
    0.1, 0.9),
  nrow = 6, ncol = 2, byrow = TRUE)

B3_left <- matrix(
  c(0.9, 0.1, # High probability for living with parents
    0.1, 0.9,
    0.5, 0.5,
    0.5, 0.5,
    0.1, 0.9,
    0.1, 0.9),
  nrow = 6, ncol = 2, byrow = TRUE)

# Starting values for transition matrices
A1 <- matrix(
  c(0.80, 0.16, 0.03, 0.01,
    0,    0.90, 0.07, 0.03,
    0,    0,    0.90, 0.10,
    0,    0,    0,    1),
  nrow = 4, ncol = 4, byrow = TRUE)

A2 <- matrix(
  c(0.80, 0.10, 0.05, 0.03, 0.01, 0.01,
    0,    0.70, 0.10, 0.10, 0.05, 0.05,
    0,    0,    0.85, 0.01, 0.10, 0.04,
    0,    0,    0,    0.90, 0.05, 0.05,
    0,    0,    0,    0,    0.90, 0.10,
    0,    0,    0,    0,    0,    1),
  nrow = 6, ncol = 6, byrow = TRUE)

# Starting values for initial state probabilities
initial_probs1 <- c(0.9, 0.07, 0.02, 0.01)
initial_probs2 <- c(0.9, 0.04, 0.03, 0.01, 0.01, 0.01)

# Birth cohort
biofam3c$covariates$cohort <- cut(biofam3c$covariates$birthyr,
  c(1908, 1935, 1945, 1957))
biofam3c$covariates$cohort <- factor(
  biofam3c$covariates$cohort,
  labels = c("1909-1935", "1936-1945", "1946-1957"))

## MHMM with own starting values and covariates
init_mhmm_bf2 <- build_mhmm(
  observations = list(marr_seq, child_seq, left_seq),
  initial_probs = list(initial_probs1, initial_probs1, initial_probs2),
  transition_probs = list(A1, A1, A2),
  emission_probs = list(
    list(B1_marr, B1_child, B1_left),
    list(B2_marr, B2_child, B2_left),
    list(B3_marr, B3_child, B3_left)),
  formula = ~sex + cohort, data = biofam3c$covariates,
  cluster_names = c("Cluster 1", "Cluster 2", "Cluster 3"),
  channel_names = c("Marriage", "Parenthood", "Residence"),
  state_names = list(paste("State", 1:4), paste("State", 1:4),
    paste("State", 1:6)))