Download URL https://www.bco-dmo.org/dataset/637878/data/download
Media Type text/tab-separated-values
Created February 4, 2016
Modified September 19, 2016
State Final no updates expected
Brief Description

Sub-seafloor single amplified genomes (SAGs) from anaerobic Peru Margin sediment.

Acquisition Description

A frozen deep-sea sediment sample from Peru Margin drill site 1230 (ODP Leg 201), collected 7.3 meters below seafloor (mbsf) and stored at −80 °C without glycerol preservation for 8 years, was used for single-cell genome analysis. Single cells were physically isolated by fluorescence-activated cell sorting into two 384-well plates (630 single cells, 6 positive controls and 132 negative controls). Sample processing was performed at the Bigelow Laboratory Single Cell Genomics Center. Single cells were lysed, and the DNA was amplified by multiple displacement amplification (MDA). In all, 250 wells showed good amplification with a Cp value of <10 h (∼40%). DNA was screened with broad eubacterial (27F-M13: 5′-AGRGTTYGATYMTGGCTCAG-3′/907R_degen-M13: 5′-CCGTCAATTCMTTTRAGTTT-3′) and archaeal (Arc_344F-M13: 5′-ACGGGGYGCAGCAGGCGCGA-3′/Arch_915R-M13R: 5′-GTGCTCCCCCGCCAATTCCT-3′) 16S rRNA primers and Sanger sequenced. Analysis with the RDB (Ribosomal Database) yielded 33 hits (5.2% of all single cells sorted, 13.2% of successful MDA reactions), including three Chloroflexi single cells whose Sanger-derived 16S rRNA sequences were most similar to Dehalogenimonas. The first MDA products yielded 500–900 ng of DNA after clean-up with the QIAamp DNA kit (Qiagen). The first MDA products of these three single cells were re-amplified in a second MDA. To avoid additional bias, the second MDA was performed in four separate reactions that were combined at the end.
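The counts and percentages quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers stated in this description; the variable names and the `percent` helper are illustrative and not part of the dataset.

```python
# Counts taken from the acquisition description above.
WELLS_PER_PLATE = 384
PLATES = 2
single_cells = 630
positive_controls = 6
negative_controls = 132
good_mda = 250   # wells with good amplification (Cp < 10 h)
hits_16s = 33    # 16S rRNA hits against the ribosomal database


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# All sorted wells together should fill exactly two 384-well plates.
total_wells = single_cells + positive_controls + negative_controls
plates_filled = total_wells == WELLS_PER_PLATE * PLATES

mda_success = percent(good_mda, single_cells)    # quoted as ~40%
hits_of_sorted = percent(hits_16s, single_cells)  # quoted as 5.2%
hits_of_mda = percent(hits_16s, good_mda)         # quoted as 13.2%

print(total_wells, plates_filled)
print(mda_success, hits_of_sorted, hits_of_mda)
```

The well counts add up to 768, which matches two full 384-well plates, and the three percentages agree with the values quoted in the text when the 630 sorted single cells are used as the denominator for the MDA success rate.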

The first MDA products of the single cells were sequenced separately on an Illumina HiSeq platform (San Diego, CA, USA) using Nextera library preparation, with an average yield of 15 000 Mb and 150 000 000 reads at 2 × 100 bp read length. The second MDA products were sequenced using the PacBio RS MagBead CLR sequencing technique (Menlo Park, CA, USA), resulting in a mean read length of over 2.5 kb and ∼100 Mb of raw sequence data. Sequencing was carried out according to the manufacturer’s instructions and resulted in 12 Mb of raw sequence data for single cell 1 and 190 Mb for single cells 2 and 3.
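The relationship between read count, read length and yield can be made explicit with a short calculation. This sketch assumes that the 150 000 000 reads count individual 100 bp reads (both mates of each pair) and that 1 Mb means 10^6 bases; neither assumption is stated in the source, and the variable names are illustrative.

```python
# Illumina: yield = number of reads x read length.
illumina_reads = 150_000_000
illumina_read_length_bp = 100
illumina_yield_mb = illumina_reads * illumina_read_length_bp / 1_000_000

# PacBio: with ~100 Mb of raw data and a mean read length above 2.5 kb,
# the number of reads is at most about yield / mean length.
pacbio_yield_bp = 100_000_000
pacbio_min_mean_read_bp = 2_500
pacbio_max_reads = pacbio_yield_bp / pacbio_min_mean_read_bp

print(illumina_yield_mb)   # Illumina yield in Mb
print(pacbio_max_reads)    # upper bound on the number of PacBio reads
```

Under these assumptions the Illumina figure reproduces the quoted 15 000 Mb, and the PacBio figures imply on the order of 40 000 reads or fewer.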

Processing Description

Different strategies were applied to assemble the reads of the individual cells, and later to combine single cells Dsc #2 and #3, in order to get the most out of the sequencing data. Statistics were checked with assemblathon. Because good assembly statistics do not guarantee that an assembly is optimal, assemblies were always run through the RAST pipeline to check for misassemblies. SAGs were assembled by:

A. CLC bio
B. spades 2.3
C. spades-n
D. velvet-sc, kmer=37
E. velvet-sc n
F. Celera (CA)
G. Hybrid error correction method using CA assembled Illumina® data to correct long PacBio® reads
H. velvet assembly using Euler correction, kmer=55
I. spades assembly of Illumina®-only combined via PCAP with CA assembly of PacBio corrected by PacBio only
J. velvet-sc assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by PacBio® only
K. velvet-sc assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by Illumina®-only
L. spades assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by Illumina®-only

n = Normalization of the Illumina® reads

Single cells 2 and 3 were assembled together since they showed almost 100% identity at the nucleotide level after individual assembly. At this stage, a 0.32-Mb assembly was contained in 126 contigs for single cell 1 (Dsc1) and a 1.38-Mb assembly in 327 contigs for the co-assembly of single cells 2 and 3 (DscP2).
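Assembly statistics of the kind mentioned above (contig count, total length, N50) are simple to compute from a list of contig lengths. The sketch below is a generic illustration of the N50 definition, not the assemblathon script itself; the function name `n50` and the toy contig lengths are hypothetical. The mean contig lengths at the end are derived only from the assembly sizes and contig counts quoted in the text.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together contain at least half of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


# Hypothetical toy assembly, for illustration only.
toy_contigs = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_contigs)

# Mean contig length from the figures quoted in the text.
dsc1_mean_bp = round(320_000 / 126)      # 0.32 Mb in 126 contigs
dscp2_mean_bp = round(1_380_000 / 327)   # 1.38 Mb in 327 contigs

print(toy_n50, dsc1_mean_bp, dscp2_mean_bp)
```

The mean contig lengths of roughly 2.5 kb (Dsc1) and 4.2 kb (DscP2) show how fragmented the assemblies are, which is why statistics alone were not treated as proof of assembly quality.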

Assembled contigs were submitted to the Integrated Microbial Genomes database annotation pipeline (IMG, version 4.1) and to the Rapid Annotations using Subsystems Technology pipeline (RAST, version 4.0) in 2013. Some computationally assigned annotations were manually changed based on the inspection of evidence for the assigned annotations, orthologs in related genomes and gene neighborhoods. Pathways were predicted using RAST, IMG and KEGG (Kyoto Encyclopedia of Genes and Genomes). Nucleotide and amino-acid sequences of genes were blasted as query sequences against the NCBI databases.



organism [taxon]
Organism studied.
Taxonomic group or entity. This may be a family, class, genus, species, etc.; usually this parameter will contain a mixture of taxonomic entities.

cruise_id [cruise_id]
Cruise identifier.
Cruise designation; name.

BioProject_ID [accession number]
NCBI BioProject identification number.
Database identifier assigned by repository and linked to GenBank or other repository.

accession_num [accession number]
NCBI accession number.
Database identifier assigned by repository and linked to GenBank or other repository.

BioProject_link [accession number]
Hyperlink to NCBI BioProject.
Database identifier assigned by repository and linked to GenBank or other repository.

accession_link [accession number]
Hyperlink to NCBI accession number.
Database identifier assigned by repository and linked to GenBank or other repository.
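Since the dataset is served as tab-separated values with the parameters listed above, it can be read with a standard TSV parser. The following is a minimal sketch using Python's `csv` module. It assumes the file starts with a header row containing exactly these parameter names; the sample row, its identifiers and its `example.org` links are made-up placeholders, not real records from the dataset.

```python
import csv
import io

# Parameter names as listed in this description.
EXPECTED_COLUMNS = [
    "organism", "cruise_id", "BioProject_ID",
    "accession_num", "BioProject_link", "accession_link",
]

# Hypothetical sample with placeholder values (not real data).
SAMPLE_TSV = (
    "\t".join(EXPECTED_COLUMNS) + "\n"
    + "\t".join([
        "Example organism", "EXAMPLE_CRUISE", "PRJNA000000",
        "XXXX00000000",
        "https://example.org/bioproject/PRJNA000000",
        "https://example.org/nuccore/XXXX00000000",
    ]) + "\n"
)


def read_records(text):
    """Parse TSV text into a list of dicts, checking the header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


records = read_records(SAMPLE_TSV)
for row in records:
    print(row["organism"], row["BioProject_ID"], row["accession_num"])
```

To work with the real file, the text passed to `read_records` would come from the download URL given at the top of this page; any comment or metadata lines before the header would need to be skipped first, which this sketch does not handle.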

Dataset Maintainers

Alfred M. Spormann, Stanford University
Shannon Rauch, Stanford University

BCO-DMO Project Info

Project Title Studying genomic and population biology of dehalogenating Chloroflexi in deep sea sediments by single cell sorting and single cell genome amplification
Acronym Chloroflexi in deep sea sediments
Created January 25, 2016
Modified January 25, 2016
Project Description

Project description from C-DEBI:
Dehalogenating Chloroflexi, such as Dehalococcoidetes (Dhc), were originally discovered as the key microorganisms mediating reductive dehalogenation of the prevalent groundwater contaminants tetrachloroethene and trichloroethene. Molecular and genomic studies on their key enzymes for energy conservation, reductive dehalogenases (rdh), have provided evidence for ubiquitous horizontal gene transfer. A pioneering study by Futagami et al. discovered novel putative rdh phylotypes in sediments from the Pacific, revealing an unknown and surprising abundance of rdh genes in pristine habitats. The frequent detection of Dhc-related 16S rRNA genes from these environments implied the occurrence of dissimilatory dehalorespiration in marine subsurface sediments. Although Dhc are ubiquitous in those environments, their metabolic lifestyle and ecological function in the absence of anthropogenic contaminants are still completely unknown. We therefore analyzed a non-contaminated deep-sea sediment sample from the Peru Margin 1230 site using a single-cell genomics (SCG) approach. We were able to obtain, for the first time, data on three single Dhc cells, helping to elucidate their role in the poorly understood oligotrophic marine subsurface environment. Although all three single cells show the majority of their best BLAST hits to Dhc species, only one putative reductive dehalogenase was discovered, with very weak similarity to other known sequences. One reason might be that the genomes are incomplete and rdh genes were missed. Another possibility is that deep-sea Dhc are not halorespirers like their terrestrial relatives. Interestingly, when the DNA of other single cells was screened, PCR showed a positive match for an rdh sequence in Firmicutes. This was quite an unexpected twist in the project.

This project was funded by a C-DEBI Postdoctoral fellowship awarded to Anne-Kristin Kaster.

Data Project Maintainers
Alfred M. Spormann, Stanford University, Principal Investigator