The SILVA Database Project: An ELIXIR core data resource for high-quality ribosomal RNA sequences
<jats:p>Ribosomal DNA (rDNA) has become the primary target molecule for phylogenetic reconstruction and for the cultivation-independent detection and quantification of microorganisms (barcoding). With the advent of high-throughput sequencing technologies (Next Generation Sequencing, NGS), PCR-based amplicon sequencing of rDNA fragments for diversity screening is now a routine technology, at least in environmental sciences. The resulting exponential increase in publicly available rDNA sequences demands specialized reference databases (Fig. 1).</jats:p> <jats:p>SILVA (from Latin <jats:italic>silva</jats:italic>, meaning forest) is designed to provide a comprehensive web resource for up-to-date, quality-controlled databases of aligned rDNA sequences from the Bacteria, Archaea and Eukaryota domains, together with the corresponding online services (Glöckner et al. 2017, Quast et al. 2012).</jats:p> <jats:p>The current SILVA database (release 132) contains 6,073,181 small subunit and 907,382 large subunit rRNA gene sequences. All sequences are checked for anomalies and carry a rich set of sequence-associated contextual information, multiple taxonomic classifications (EMBL-EBI/ENA, RDP and GTDB) and the latest validly described nomenclature. SILVA maintains manually curated reference alignments of 75,000 ribosomal RNA genes, both 16S/18S (small subunit, SSU) and 23S/28S (large subunit, LSU). With every full release, a manually curated guide tree is provided that contains the latest taxonomy and nomenclature based on multiple references.</jats:p> <jats:p>SILVA is the only rDNA database project worldwide that gives special emphasis to the consistent naming of clades of uncultivated (environmental) sequences for which no validly described cultivated representative is available (Yilmaz et al. 2014). 
SILVA incorporates other unique features, including a comprehensive 23S/28S database of aligned rDNA sequences and alignments that contain Eukaryota sequences.</jats:p> <jats:p>SILVA is an active partner of RNAcentral, a public resource that offers integrated access to a comprehensive and up-to-date set of non-coding RNA sequences provided by a collaborating group of Expert Databases. The SILVA team is a member of the Bergey’s Board of Trustees, which provides the authoritative taxonomy for <jats:italic>Bacteria</jats:italic> and <jats:italic>Archaea</jats:italic>, and of the Protist Reference Taxonomy Project UniEuk, which is funded by the Gordon and Betty Moore Foundation to create a unified taxonomic framework.</jats:p> <jats:p>In 2018 SILVA became an ELIXIR Core Data Resource. ELIXIR Core Data Resources are a set of European data resources of fundamental importance to the wider life-science community and the long-term preservation of biological data.</jats:p> <jats:p>To facilitate classification tasks for high-throughput rDNA data, the SILVAngs service was implemented and released in 2013. SILVAngs is a data analysis service for rDNA reads from high-throughput sequencing (NGS) approaches, based on an automatic software pipeline. It uses the SILVA rDNA databases, taxonomies and alignments as a reference to classify rDNA reads, and it provides a wealth of results (tables, graphs and sequence files) for download. SILVAngs serves several thousand registered users and processes thousands of projects per year.</jats:p> <jats:p>The application spectrum of the SILVA databases ranges from environmental sciences, microbiology, agriculture, biochemistry and biotechnology to medicine, in both academia and industry.</jats:p>