A first glimpse at the genome of the Baikalian amphipod Eulimnogammarus verrucosus
Eulimnogammarus verrucosus is an amphipod endemic to the unique ecosystem of Lake Baikal and serves as an emerging model in ecotoxicological studies. We report here a survey sequencing of its genome as a first step toward establishing sequence resources for this species. From a single lane of paired-end sequencing data, we estimated the genome size at nearly 10 Gb and obtained an overview of the repeat content. At least two-thirds of the genome is non-unique DNA, and a third of the genomic DNA is composed of just five families of repetitive elements, including low-complexity sequences. Attempts to use off-the-shelf assembly tools failed on the available low-coverage data, both before and after removal of highly repetitive components. Using a seed-based approach, we nevertheless assembled short contigs covering 33 pre-microRNAs and the homeodomain-containing exon of nine Hox genes. The absence of clear evidence for paralogs implies that a genome duplication did not contribute to the large genome size. We furthermore report the assembly of the mitochondrial genome using a new, guided “crystallization” procedure. The initial results presented here set the stage for a more complete sequencing and analysis of this large genome.