Abstract
The remineralization of organic material via heterotrophy in the marine environment is performed by a diverse group of microorganisms that can specialize in the type of organic material degraded and the niche they occupy. The marine Dadabacteria are cosmopolitan in the marine environment and belong to a candidate phylum for which there has not been a comprehensive assessment of the available genomic data to date. Herein, we assess the functional potential of the marine pelagic Dadabacteria in comparison to members of the phylum that originate from terrestrial, hydrothermal, and subsurface environments. Our analysis reveals that the marine pelagic Dadabacteria have streamlined genomes, corresponding to smaller genome sizes and lower nitrogen content of their DNA and predicted proteome, relative to their phylogenetic counterparts. Collectively, the Dadabacteria have the potential to degrade microbial dissolved organic matter, specifically peptidoglycan and phospholipids. The marine Dadabacteria belong to two clades with apparently distinct ecological niches in global metagenomic data: a clade with the potential for photoheterotrophy through the use of proteorhodopsin, present predominantly in surface waters up to 100 m depth; and a clade lacking the potential for photoheterotrophy that is more abundant in the deep photic zone.
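As an illustration of what "nitrogen content of DNA" can mean in practice, the sketch below counts nitrogen atoms per base pair from the molecular formulas of the four nucleobases (an A·T pair carries 7 nitrogen atoms, a G·C pair carries 8). This is only one possible way to compute such a metric; the function name and the handling of ambiguous bases are assumptions for illustration, not the procedure used in the study.

```python
# Nitrogen atoms per nucleobase, taken from the molecular formulas:
# adenine C5H5N5, guanine C5H5N5O, cytosine C4H5N3O, thymine C5H6N2O2.
N_ATOMS = {"A": 5, "G": 5, "C": 3, "T": 2}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nitrogen_per_base_pair(seq):
    """Mean nitrogen atoms per base pair of double-stranded DNA.

    Hypothetical helper: ambiguous characters (e.g. N) are skipped,
    which is an assumption made for this sketch.
    """
    bases = [b for b in seq.upper() if b in N_ATOMS]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    # Each base on the given strand is paired with its complement.
    total = sum(N_ATOMS[b] + N_ATOMS[COMPLEMENT[b]] for b in bases)
    return total / len(bases)
```

Under this counting, a genome with lower GC content has fewer nitrogen atoms per base pair, which is one reason genome streamlining and DNA nitrogen content are often discussed together.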
Abstract
As microbiome research becomes increasingly essential to understanding a wide variety of ecosystems (e.g., marine, built, and host-associated), researchers need to be able to perform highly reproducible, high-quality analyses of microbial genomes. MetaSanity incorporates analyses from eleven existing and widely used genome evaluation and annotation suites into a single, distributable workflow, thereby decreasing the workload of microbiologists by allowing for a flexible, expansive data analysis pipeline. MetaSanity has been designed to provide separate, reproducible workflows that (1) can determine the overall quality of a microbial genome, while providing a putative phylogenetic assignment, and (2) can assign structural and functional gene annotations with varying degrees of specificity to suit the needs of the researcher. The software suite combines the results from several tools to provide broad insights into overall metabolic function. Importantly, this software provides built-in optimization for “big data” analysis by storing all relevant outputs in an SQL database, allowing users to query all the results for the elements that will most impact their research.
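To make the idea of querying stored results concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and rows are hypothetical placeholders invented for this example; the abstract does not describe the actual MetaSanity schema or query interface, so this should not be read as its API.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE genome_quality ("
        "genome_id TEXT PRIMARY KEY, completion REAL, contamination REAL)"
    )
    conn.execute(
        "CREATE TABLE annotations ("
        "genome_id TEXT, gene_id TEXT, source TEXT, label TEXT)"
    )
    # Placeholder rows for illustration only, not real results.
    conn.executemany(
        "INSERT INTO genome_quality VALUES (?, ?, ?)",
        [("genome_A", 95.0, 1.0), ("genome_B", 60.0, 4.0)],
    )
    conn.executemany(
        "INSERT INTO annotations VALUES (?, ?, ?, ?)",
        [
            ("genome_A", "gene_1", "tool_1", "label_x"),
            ("genome_A", "gene_2", "tool_2", "label_y"),
            ("genome_B", "gene_3", "tool_1", "label_x"),
        ],
    )
    return conn


def genomes_with_label(conn, label, min_completion):
    """Return genome IDs that carry `label` and meet a completion cutoff."""
    rows = conn.execute(
        "SELECT DISTINCT q.genome_id "
        "FROM genome_quality AS q "
        "JOIN annotations AS a ON a.genome_id = q.genome_id "
        "WHERE a.label = ? AND q.completion >= ? "
        "ORDER BY q.genome_id",
        (label, min_completion),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping quality estimates and annotations in one database lets a single query combine both, for example restricting a functional search to high-completion genomes.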
Abstract
Aerobic anoxygenic phototrophs (AAnPs) are common in marine environments and are associated with photoheterotrophic activity. To date, AAnPs that possess the potential for carbon fixation have not been identified in the surface ocean. Using the Tara Oceans metagenomic dataset, we have identified draft genomes of nine bacteria that possess the genomic potential for anoxygenic phototrophy, carbon fixation via the Calvin-Benson-Bassham cycle, and the oxidation of sulfite and thiosulfate. Forming a monophyletic clade within the Alphaproteobacteria and lacking cultured representatives, the organisms are minor constituents of local microbial communities (0.1–1.0%), but are globally distributed, present in multiple samples from the North Pacific, the Mediterranean Sea, the East Africa Coastal Province, and the Atlantic. This discovery may require re-examination of the microbial communities in the oceans to understand and constrain the role this group of organisms may play in the global carbon cycle.
Abstract
Microorganisms play a crucial role in mediating global biogeochemical cycles in the marine environment. By reconstructing the genomes of environmental organisms through metagenomics, researchers are able to study the metabolic potential of Bacteria and Archaea that are resistant to isolation in the laboratory. Utilizing the large metagenomic dataset generated from 234 samples collected during the Tara Oceans circumnavigation expedition, we were able to assemble 102 billion paired-end reads into 562 million contigs, which in turn were co-assembled and consolidated into 7.2 million contigs ≥2 kb in length. Approximately 1 million of these contigs were binned to reconstruct draft genomes. In total, 2,631 draft genomes with an estimated completion of ≥50% were generated (1,491 draft genomes >70% complete; 603 genomes >90% complete). A majority of the draft genomes were manually assigned a phylogeny based on sets of concatenated phylogenetic marker genes and/or 16S rRNA gene sequences. The draft genomes are now publicly available to the research community at large.
Abstract
The Tara Oceans Expedition has provided large, publicly accessible microbial metagenomic datasets from a circumnavigation of the globe. Utilizing several size fractions from the samples originating in the Mediterranean Sea, we have used current assembly and binning techniques to reconstruct 290 putative draft metagenome-assembled bacterial and archaeal genomes, with an estimated completion of ≥50%, and an additional 2,786 bins, with an estimated completion of 0–50%. We have submitted our results, including initial taxonomic and phylogenetic assignments, for the putative draft genomes to open-access repositories for the scientific community to use in ongoing research.
Abstract
Metagenomics has become an integral part of defining microbial diversity in various environments. Many ecosystems have characteristically low biomass and few cultured representatives. Linking potential metabolisms to phylogeny in environmental microorganisms is important for interpreting microbial community functions and the impacts these communities have on geochemical cycles. However, metagenomic studies face the computational hurdle of ‘binning’ contigs into phylogenetically related units or putative genomes. Binning methods have been implemented with varying approaches such as k-means clustering, Gaussian mixture models, hierarchical clustering, neural networks, and two-way clustering; however, many of these suffer from biases against low coverage/abundance organisms and closely related taxa/strains. We introduce a new binning method, BinSanity, that utilizes the clustering algorithm affinity propagation (AP) to cluster assemblies using coverage, with composition-based refinement (tetranucleotide frequency and percent GC content) to optimize bins containing multiple source organisms. This separation of composition- and coverage-based clustering reduces bias for closely related taxa. BinSanity was developed and tested on artificial metagenomes varying in size and complexity. Results indicate that BinSanity has higher precision, recall, and Adjusted Rand Index compared to five commonly implemented methods. When tested on a previously published environmental metagenome, BinSanity generated high-completion, low-redundancy bins corresponding to the published metagenome-assembled genomes.
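The two composition signals named above, tetranucleotide frequency and percent GC content, can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the function names are hypothetical, reverse-complement k-mers are not merged, and the affinity propagation clustering step itself is not shown, so it should not be taken as BinSanity's implementation.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed (lexicographic) order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of unambiguous bases that are G or C (0.0 if none)."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequencies(seq):
    """Relative frequency of each tetranucleotide in a contig.

    Windows containing ambiguous characters (e.g. N) are ignored.
    Returns a list of 256 floats aligned with KMERS.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]
```

A contig's feature vector for composition-based refinement could then be its 256 tetranucleotide frequencies plus its GC fraction, kept separate from the coverage profile used in the initial clustering pass.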