Distribute load
Utility function for distributing computations among a pool of workers for parallel processing.
distribute_load(x, n = 1)
x |
|
n |
|
This function returns a list
containing an element for
each worker. Each element contains a integer
vector
specifying the indices that the worker should process.
list
object.
## Not run: # imagine that we have 10 jobs that need processing. For simplicity, # our jobs will involve adding 1 to each element in 1:10. values <- 1:10 # we could complete this processing using the following vectorized code result <- 1 + 1:10 print(result) # however, if our jobs were complex then we would be better off using # functionals result <- lapply(1:10, function(x) x + 1) print(result) # we could do one better, and use the "plyr" package to handle the # processing result <- plyr::llply(1:10, function(x) x + 1) print(result) # we could also use the parallel processing options available through "plyr" # to use more computation resources to complete the jobs (note that since # these jobs are very quick to process this is actually slower). cl <- parallel::makeCluster(2, "PSOCK") doParallel::registerDoParallel(cl) result <- plyr::llply(1:10, function(x) x + 1, .parallel = TRUE) cl <- parallel::stopCluster(cl) print(result) # however this approach iterates over each element individually, we could # use the distribute_load function to split the N jobs up into K super # jobs, and evaluate each super job using vectorized code. x <- 1:10 cl <- parallel::makeCluster(2, "PSOCK") parallel::clusterExport(cl, 'x', envir = environment()) doParallel::registerDoParallel(cl) l <- distribute_load(length(x), n = 2) result <- plyr::llply(l, function(i) x[i] + 1, .parallel = TRUE) cl <- parallel::stopCluster(cl) print(result) ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.