Placement of environmental sequences on a reference phylogeny has been the "gold standard" method for taxonomic assignment of Sanger sequences. More recently, pyrosequencing technology has largely replaced Sanger methods in environmental DNA sequencing studies. Phylogenetic placement methods, as practiced earlier, became impractical at the data set sizes produced by pyrosequencing. Accordingly, variations of a workflow consisting of sequence clustering followed by taxonomic assignment based on k-mer statistics or pairwise alignment found widespread application. Nevertheless, there are still reasons to expect read-by-read phylogenetic placement to be superior to these methods. To make this practicable for a large 16S rDNA V1 pyrosequencing data set of about 2 million reads, we developed a software pipeline for the alignment and phylogenetic placement of large numbers of marker sequences. I will introduce this tool and illustrate it with results from our analyses of temporal patterns of SAR11 ecotype distributions at the Bermuda Atlantic Time Series site.