micropan: chao – R documentation

micropan

chao

The Chao lower bound estimate of pan-genome size

Description

Computes the Chao lower bound estimated number of gene clusters in a pan-genome.

Usage

chao(pan.matrix)

Arguments

pan.matrix

A pan-matrix, see panMatrix for details.

Details

The size of a pan-genome is the number of gene clusters in it, both those observed and those not yet observed.

The input pan.matrix is a matrix with one row for each genome and one column for each observed gene cluster in the pan-genome. See panMatrix for how to construct this.

The number of observed gene clusters is simply the number of columns in pan.matrix. The number of gene clusters not yet observed is estimated by the Chao lower bound estimator (Chao, 1987). This is based solely on the number of clusters observed in exactly one genome and in exactly two genomes. It is a very simple and conservative estimator, i.e. it is more likely to be too small than too large.
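As an illustration of the estimator described above, here is a minimal sketch in base R. It assumes the usual form of the Chao lower bound: the observed count plus y1^2 / (2 * y2), where y1 and y2 are the numbers of clusters seen in exactly one and exactly two genomes. The function name chao_sketch and the toy matrix are hypothetical and are not part of micropan; the package's own implementation may differ in details such as rounding and error handling.

```r
# Hypothetical helper for illustration only; not the micropan source code.
chao_sketch <- function(pan.matrix) {
  # Reduce counts to presence/absence: TRUE if the cluster occurs in the genome
  presence <- pan.matrix > 0
  # Number of genomes in which each gene cluster is observed
  genomes.per.cluster <- colSums(presence)
  y1 <- sum(genomes.per.cluster == 1)  # clusters seen in exactly 1 genome
  y2 <- sum(genomes.per.cluster == 2)  # clusters seen in exactly 2 genomes
  if (y2 == 0) {
    # Assumption: treat the estimate as undefined when y2 is zero
    stop("No gene clusters observed in exactly 2 genomes; estimate undefined")
  }
  # Observed clusters plus the estimated number not yet seen
  n.observed <- sum(genomes.per.cluster > 0)
  round(n.observed + y1^2 / (2 * y2))
}

# Toy pan-matrix: 3 genomes (rows), 6 gene clusters (columns)
toy <- matrix(c(1, 1, 1, 0, 0, 1,
                1, 0, 0, 1, 0, 1,
                1, 0, 0, 0, 2, 0),
              nrow = 3, byrow = TRUE)
chao_sketch(toy)
```

In the toy matrix, four clusters occur in exactly one genome and one cluster occurs in exactly two, so the sketch adds 4^2 / (2 * 1) = 8 unseen clusters to the 6 observed ones.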

Value

The function returns an integer, the estimated pan-genome size. This includes both the gene clusters observed so far and the estimated number not yet seen.

Author(s)

Lars Snipen and Kristian Hovde Liland.

References

Chao, A. (1987). Estimating the population size for capture-recapture data with unequal catchability. Biometrics, 43:783-791.

Examples

# Loading an example pan-matrix included in this package
data(xmpl.panmat)

# Estimating the pan-genome size using the Chao estimator
chao.pansize <- chao(xmpl.panmat)

micropan

Microbial Pan-Genome Analysis

v2.1

GPL-2

Authors

Lars Snipen and Kristian Hovde Liland

Initial release

2020-07-15
