panMatrix

Computing the pan-matrix for a set of gene clusters


Description

A pan-matrix has one row for each genome and one column for each gene cluster. Cell [i,j] indicates how many members genome i has in gene cluster j.

Usage

panMatrix(clustering)

Arguments

clustering

A named vector of integers.

Details

The pan-matrix is a central data structure for pan-genomic analysis. It is a matrix with one row for each genome in the study, and one column for each gene cluster. Cell [i,j] contains an integer indicating how many members genome i has in cluster j.
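To make the cell semantics concrete, here is a minimal sketch in base R. The matrix pm is a made-up 3-genome, 4-cluster example, not output from the package, and the variable names are illustrative only. It shows how the counts translate into presence/absence, and how clusters found in all genomes or in a single genome can be read off.

```r
# Hypothetical pan-matrix: rows are genomes, columns are gene clusters,
# and each cell counts the members a genome has in that cluster.
pm <- matrix(c(2L, 1L, 0L, 1L,
               1L, 0L, 1L, 1L,
               0L, 1L, 0L, 3L),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("GID1", "GID2", "GID3"),
                             paste0("Cluster_", 1:4)))

# A cluster is present in a genome when the count is above zero
present <- pm > 0

# Number of genomes in which each cluster occurs
genomes_per_cluster <- colSums(present)

# Clusters found in every genome, and clusters found in exactly one genome
core_clusters   <- names(genomes_per_cluster)[genomes_per_cluster == nrow(pm)]
unique_clusters <- names(genomes_per_cluster)[genomes_per_cluster == 1]
```

The same colSums(panmat > 0) idiom is used in the plotting example in the Examples section.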

The input clustering must be a named integer vector with one element for each sequence in the study, typically produced by either bClust or dClust. The name of each element is a text that identifies the sequence. The value of each element indicates the cluster, i.e. sequences with identical values are in the same cluster. IMPORTANT: The name of each sequence must contain the genome_id of the genome it belongs to, i.e. the names must be of the form GID111_seq1, GID111_seq2,... where the GIDxxx part indicates which genome the sequence belongs to. See panPrep for details.
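The input format can be illustrated with a small base R sketch. This is not the package's implementation: the toy vector is invented, and the pattern "GID[0-9]+" is an assumption based on the GID111_seq1 naming shown above. It builds a named integer vector, pulls the genome_id out of each name, and cross-tabulates genomes against clusters.

```r
# Hypothetical toy input: names carry the genome_id, values are cluster labels
clustering <- c(GID1_seq1 = 1L, GID1_seq2 = 1L, GID1_seq3 = 2L,
                GID2_seq1 = 1L, GID2_seq2 = 3L,
                GID3_seq1 = 2L)

# Assumption: the genome_id is the "GID<digits>" part of each sequence name
genome_id <- regmatches(names(clustering),
                        regexpr("GID[0-9]+", names(clustering)))

# Cross-tabulate genomes (rows) against cluster labels (columns)
toy <- table(genome_id, clustering)

# Convert to a plain integer matrix with Cluster_x column names
toy_mat <- matrix(as.integer(toy), nrow = nrow(toy),
                  dimnames = list(rownames(toy),
                                  paste0("Cluster_", colnames(toy))))
```

With real data, panMatrix(clustering) should be used instead; the sketch only shows what the names and values of the input vector are expected to encode.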

The rows of the pan-matrix are named by the genome_id of each genome. The columns are named Cluster_x, where x is the integer cluster label copied from clustering.

Value

An integer matrix with a row for each genome and a column for each sequence cluster. The input vector clustering is attached as the attribute clustering.
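The properties of the returned object can be checked with a few lines of base R. The helper check_pan_matrix below is hypothetical and not part of micropan, and the object it is applied to is a hand-made stand-in for a real result. The last two checks are consequences of the description above (one column per distinct cluster, every sequence counted once), assuming every sequence name carries a valid genome_id.

```r
# Hypothetical helper (not part of micropan): checks the documented properties
check_pan_matrix <- function(pm) {
  # An integer matrix ...
  stopifnot(is.matrix(pm), is.integer(pm))
  # ... with the input vector attached as the attribute "clustering"
  cl <- attr(pm, "clustering")
  stopifnot(!is.null(cl), !is.null(names(cl)))
  # One column per distinct cluster value
  stopifnot(ncol(pm) == length(unique(cl)))
  # Every sequence is counted once (assumes all names hold a genome_id)
  stopifnot(sum(pm) == length(cl))
  invisible(TRUE)
}

# Hand-made stand-in for a panMatrix() result
clustering <- c(GID1_seq1 = 1L, GID1_seq2 = 2L, GID2_seq1 = 1L)
pm <- matrix(c(1L, 1L,
               1L, 0L),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("GID1", "GID2"),
                             c("Cluster_1", "Cluster_2")))
attr(pm, "clustering") <- clustering

ok <- check_pan_matrix(pm)
```

On a real result, attr(panmat, "clustering") retrieves the original clustering vector in the same way.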

Author(s)

Lars Snipen and Kristian Hovde Liland.

See Also

Examples

# Loading clustering data in this package
data(xmpl.bclst)

# Pan-matrix based on the clustering
panmat <- panMatrix(xmpl.bclst)

## Not run: 
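# Plotting cluster distribution
library(dplyr)    # provides tibble() and the %>% pipe
library(ggplot2)
# Number of clusters found in exactly 1, 2, ..., nrow(panmat) genomes
tibble(Clusters = as.integer(table(factor(colSums(panmat > 0), levels = 1:nrow(panmat)))),
       Genomes = 1:nrow(panmat)) %>%
  ggplot(aes(x = Genomes, y = Clusters)) +
  geom_col()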

## End(Not run)

micropan

Microbial Pan-Genome Analysis

v2.1
GPL-2
Authors
Lars Snipen and Kristian Hovde Liland
Initial release
2020-07-15
