Fast and accurate average genome size and 16S rRNA gene average copy number computation in metagenomic data


Abstract

BACKGROUND: Metagenomics caused a quantum leap in microbial ecology. However, the inherent size and complexity of metagenomic data limit its interpretation. The quantification of metagenomic traits in metagenomic analysis workflows has the potential to improve the exploitation of metagenomic data. Metagenomic traits are organisms' characteristics linked to their performance. They are measured at the genomic level, taking a random sample of individuals in a community. As such, these traits provide valuable information to uncover microorganisms' ecological patterns. The Average Genome Size (AGS) and the 16S rRNA gene Average Copy Number (ACN) are two highly informative metagenomic traits that reflect microorganisms' ecological strategies as well as the environmental conditions they inhabit.

RESULTS: Here, we present the ags.sh and acn.sh tools, which analytically derive the AGS and ACN metagenomic traits. These tools represent an advance on previous approaches to compute the AGS and ACN traits. Benchmarking shows that ags.sh is up to 11 times faster than state-of-the-art tools dedicated to the estimation of AGS. Both ags.sh and acn.sh show comparable or higher accuracy than existing tools used to estimate these traits. To exemplify the applicability of both tools, we analyzed the 139 prokaryotic metagenomes of TARA Oceans and revealed the ecological strategies associated with different water layers.

CONCLUSION: We took advantage of recent advances in gene annotation to develop the ags.sh and acn.sh tools, which combine easy usage with fast and accurate performance. Our tools compute the AGS and ACN metagenomic traits on unassembled metagenomes and allow researchers to improve their metagenomic data analysis and gain deeper insights into microorganisms' ecology. The ags.sh and acn.sh tools are publicly available using Docker container technology at https://github.com/pereiramemo/AGS-and-ACN-tools



Item Type
Article
Authors
Pereira-Flores, E., Glöckner, F. O. and Fernandez-Guerra, A.
Publication Status
Published
Eprint ID
57493
DOI 10.1186/s12859-019-3031-y

Cite as
Pereira-Flores, E., Glöckner, F. O. and Fernandez-Guerra, A. (2019): Fast and accurate average genome size and 16S rRNA gene average copy number computation in metagenomic data, BMC Bioinformatics, 20 (1), 453. doi: 10.1186/s12859-019-3031-y

