Parallel Processing in the ergm Package
For estimation that requires MCMC, ergm
can take advantage of multiple CPUs or CPU cores on the system on
which it runs, as well as computing clusters. It uses the packages
parallel and snow to facilitate this, and supports
all cluster types that they do. The number of nodes used and the
parallel API are controlled using the parallel and
parallel.type arguments passed to the control functions,
such as control.ergm.
The ergm.getCluster function is usually called
internally by the ergm process (in
ergm_MCMC_sample) and will attempt to start the
appropriate type of cluster indicated by the
control.ergm settings. It will also check that the
same version of ergm is installed on each node.
The ergm.stopCluster function shuts down a
cluster, but only if ergm.getCluster was responsible for
starting it.
The ergm.restartCluster function restarts and returns a cluster,
but only if ergm.getCluster was responsible for starting it.
nthreads is a simple generic to obtain the number of
parallel processes represented by its argument, keeping in mind
that having no cluster (e.g., NULL) represents one thread.
ergm.getCluster(control = NULL, verbose = FALSE, stop_on_exit = parent.frame())

ergm.stopCluster(..., verbose = FALSE)

ergm.restartCluster(control = NULL, verbose = FALSE)

nthreads(clinfo = NULL, ...)

## S3 method for class 'cluster'
nthreads(clinfo = NULL, ...)

## S3 method for class 'NULL'
nthreads(clinfo = NULL, ...)

## S3 method for class 'control.list'
nthreads(clinfo = NULL, ...)
control |
a |
verbose |
logical, should detailed status info be printed to console? |
stop_on_exit |
An |
... |
not currently used |
clinfo |
a |
Further details on the various cluster types are included below.
The parallel package is used with
PSOCK clusters by default, to utilize multiple cores on a
system. The number of cores on a system can be determined with
the detectCores function.
This method works with the base installation of R on all platforms, and does not require additional software.
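As a minimal sketch of choosing a worker count from the detected cores, the snippet below uses only the parallel package. The variable names n_cores and n_workers are illustrative, and leaving one core free is a common convention rather than anything ergm requires.

```r
library(parallel)

# detectCores() can return NA when the core count cannot be determined,
# so fall back to a single worker in that case.
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Leave one core free for the rest of the system, but never go below 1.
n_workers <- max(1L, n_cores - 1L)
n_workers
```

The resulting value could then be supplied as the parallel argument of control.ergm.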
For more advanced applications, such as clusters that span
multiple machines on a network, the clusters can be initialized
manually, and passed into ergm using the parallel
control argument. See the second example below.
To use MPI to accelerate ERGM sampling, pass
the control parameter parallel.type="MPI".
ergm requires the snow and
Rmpi packages to communicate with an MPI cluster.
Using MPI clusters requires the system to have an existing MPI installation. See the MPI documentation for your particular platform for instructions.
To use ergm across multiple machines in a high performance
computing environment, see the section "User initiated clusters"
below.
A cluster can be passed into
ergm with the parallel control parameter.
ergm will detect the number of nodes in the cluster, and
use all of them for MCMC sampling. This method is flexible: it
will accept any cluster type that is compatible with the snow
or parallel packages.
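The following is a hedged sketch of a user-initiated cluster, not a definitive recipe. It starts a two-worker PSOCK cluster with parallel::makeCluster, hands it to ergm through the parallel control argument, and stops it afterwards. The model formula is borrowed from the example on this page; the object names cl, n_workers and fit are illustrative, and the requireNamespace guard is only there so the sketch degrades gracefully where ergm is not installed.

```r
library(parallel)

# Start a 2-worker PSOCK cluster by hand. For a cluster spanning several
# machines, makeCluster() also accepts a character vector of host names.
cl <- makeCluster(2, type = "PSOCK")
n_workers <- length(cl)

if (requireNamespace("ergm", quietly = TRUE)) {
  library(ergm)
  data(faux.mesa.high)

  # Number of parallel processes ergm sees in this cluster.
  print(nthreads(cl))

  # Pass the cluster object itself, rather than a node count.
  fit <- ergm(faux.mesa.high ~ edges + isolates + gwesp(0.2, fixed = TRUE),
              control = control.ergm(parallel = cl))
  print(summary(fit))
}

# ergm did not start this cluster, so shutting it down is left to the user.
stopCluster(cl)
```

Because ergm.stopCluster only shuts down clusters that ergm.getCluster started, a cluster created this way stays alive until stopCluster is called explicitly.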
# Uses a 2-node PSOCK cluster for MCMLE estimation
library(ergm)
data(faux.mesa.high)
nw <- faux.mesa.high
fauxmodel.01 <- ergm(nw ~ edges + isolates + gwesp(0.2, fixed=TRUE),
                     control=control.ergm(parallel=2, parallel.type="PSOCK"))
summary(fauxmodel.01)