URL https://www.bco-dmo.org/dataset/632784
Download URL https://www.bco-dmo.org/dataset/632784/data/download
Media Type text/tab-separated-values
Created January 18, 2016
Modified August 19, 2016
State Final no updates expected
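
Because the download is served as tab-separated values, it can be read with any standard TSV parser. The following is a minimal Python sketch and not part of the BCO-DMO documentation. The column names come from the Parameters section of this page; the helper name `parse_tsv`, the column order, and the depth values in the sample are hypothetical placeholders, and the real file may also contain header or comment lines that need separate handling.

```python
import csv
import io


def parse_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Hypothetical two-row excerpt: the real file has many more columns,
# and the depth values here are placeholders, not actual dataset values.
sample = (
    "strain\tgenus\tdepth\n"
    "JS085\tLebetimonas\t100\n"
    "JH369\tLebetimonas\t200\n"
)

rows = parse_tsv(sample)
for row in rows:
    print(row["strain"], row["genus"], row["depth"])
```

All values are returned as strings, so numeric columns such as depth still need explicit conversion.
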
Brief Description

Whole genome sequence data from bacterial isolates from venting fluids at NW Rota Seamount, collected in 2009 and 2010.

Acquisition Description

Diffuse hydrothermal vent fluids were collected at several vent sites on NW Rota-1 seamount in 2009 and 2010 using the ROV Jason 2 and the hydrothermal fluid and particle sampler. Anaerobic enrichment medium previously used for the isolation of Caminibacter profundus was inoculated with 1 ml of unfiltered diffuse flow fluids and incubated at 55 degrees C. Cultures were isolated from enrichments with positive microbial growth by three sets of dilution-to-extinction. The growth of Lebetimonas under varying conditions, including alternative electron donor/acceptor pairs and N2 gas as the sole nitrogen source, was evaluated as described in the Supplementary Material of Meyer and Huber (2014). Growth of Lebetimonas strain JH369 with N2 gas as the sole nitrogen source was evaluated using anaerobic seawater medium without yeast extract or ammonia and containing formate and elemental sulfur, with an 80% N2 and 20% CO2 headspace.

Genomic DNA was extracted from pure cultures at log phase using a CTAB extraction. Libraries were prepared using Nextera DNA sample prep kits (Illumina, San Diego, CA, USA) and sequenced by Roche 454 GS FLX Titanium (454 Life Sciences, Branford, CT, USA) and/or using Illumina HiSeq 2000 paired reads (Illumina). In the case of strains sequenced with multiple platforms, the same genomic DNA extraction was used for all library preparations, with the exception of strain JS085. Genomes were assembled using several tools as described in the Supplementary Material of Meyer and Huber (2014).

Related references:
Meyer, J.L. and J.A. Huber. 2014. Strain-level genomic variation in natural populations of Lebetimonas from an erupting deep-sea volcano. ISME Journal. 8:867–880. doi:10.1038/ismej.2013.206

Processing Description

Prior to assembly, Illumina sequences were quality filtered with the script Trim.pl (http://wiki.bioinformatics.ucdavis.edu/index.php/Trim.pl), using adaptive window trimming and a quality threshold of 30. All reads were screened for adaptor, barcode, primer, and transposon sequences and trimmed as needed using FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/index.html). De novo genome assembly was performed with several assembly programs. Sequences generated through the 454 platform were first assembled with Roche’s GS De Novo Assembler v 2.6 (“Newbler”) [2] using default parameters. De novo assemblies of 454 reads were also performed using mira [3] with the default settings for normal quality de novo genome assembly. De novo assembly of subsets of Illumina reads was performed with velvet [4], using an estimated coverage of 1000x, a kmer size of 21, and a coverage cutoff of 5. Large contigs from Newbler, mira, and velvet were consolidated using Geneious Pro v 5.6.6 (Biomatters, Ltd, http://www.geneious.com) and aligned with progressiveMauve [5] to visualize the relationship of large contigs from different assemblies and to identify gaps to close. Primers were designed at the ends of contigs using either Geneious Pro or CLC Genomics Workbench v 5.1 (CLCbio, http://www.clcbio.com) to amplify gaps between contigs. Positive PCR amplification products linking contigs were cleaned using a Min-Elute PCR Purification kit (Qiagen) and Sanger sequenced. A nearly complete draft genome from strain JS085 served as a reference genome for the remaining five strains. Both Illumina and 454 reads were mapped to the reference genome with CLC Genomics Workbench. Unmapped reads were then assembled de novo to ensure that novel genomic content in the mapped strains was not overlooked. De novo assembly of 454 and/or Illumina reads for each strain was also performed in CLC Genomics Workbench and compared to the mapped assemblies using progressiveMauve.
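
The quality-filtering step can be illustrated with a generic sliding-window trim. This is a simplified Python sketch of the general technique, not the algorithm implemented in Trim.pl. The function name, the window size of 5, and the sample quality scores are assumptions made for illustration; only the threshold of 30 comes from the description above.

```python
def sliding_window_trim(seq, quals, threshold=30, window=5):
    """Trim the 3' end of a read at the first low-quality window.

    Scans windows from the 5' end. At the first window whose mean quality
    is below `threshold`, the read is cut at the first base inside that
    window that is itself below `threshold`. Reads shorter than `window`
    are returned unchanged.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    for start in range(len(quals) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < threshold:
            cut = start
            # A failing window always contains at least one base below the
            # threshold, so this loop stops inside the window.
            while quals[cut] >= threshold:
                cut += 1
            return seq[:cut], quals[:cut]
    return seq, quals


# Illustrative read with quality dropping toward the 3' end.
trimmed_seq, trimmed_quals = sliding_window_trim(
    "ACGTACGTAC", [40, 40, 40, 40, 40, 38, 20, 10, 5, 2]
)
print(trimmed_seq)
```

Production trimmers typically also trim the 5' end and discard reads that fall below a minimum length after trimming; those steps are omitted here.
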

Four of the strains were sequenced using both 454 and Illumina and two strains were sequenced only with Illumina. The sequencing coverage depth of quality-filtered reads ranged from 22X to 50X for 454 and up to 3618X for Illumina. Lebetimonas strain JS085 had the highest coverage of 454 reads and was assembled into 33 large contigs with Newbler and 1747 contigs with mira. The 20 largest contigs from each of these assemblies were consolidated using de novo assembly in Geneious to 10 contigs. An additional round of assembly in Geneious with the 10 consolidated contigs and velvet contigs greater than 10 Kbp further consolidated the draft genome to 6 contigs. Primers were designed for all possible combinations between the 6 contigs. One gap was closed using Sanger-sequenced positive PCR products. Finally, all 454 and Illumina reads for strain JS085 were mapped to the draft genome consisting of 5 contigs and the resulting consensus was used as the final draft genome. The five remaining genomes were assembled by mapping 454 and Illumina reads to the JS085 reference genome in CLC Genomics Workbench. Hybrid de novo assemblies in CLC Genomics Workbench of each strain did not extend contigs or close gaps between the 5 contigs of the draft genomes. Assemblies of unmapped reads produced only short contigs with no significant similarities using nucleotide BLAST [6].
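
To give a sense of the scale of the gap-closing step, the candidate junctions among 6 contigs can be enumerated. This Python sketch assumes that "all possible combinations" means pairing either end of one contig with either end of a different contig; the source does not spell this out, and the contig labels and function name are hypothetical.

```python
from itertools import combinations, product


def candidate_junctions(contigs):
    """List every pairing of one end of a contig with one end of another."""
    ends = ("left", "right")
    junctions = []
    for a, b in combinations(contigs, 2):
        for end_a, end_b in product(ends, repeat=2):
            junctions.append(((a, end_a), (b, end_b)))
    return junctions


# Hypothetical labels for the 6 contigs of the JS085 draft assembly.
contigs = [f"contig_{i}" for i in range(1, 7)]
pairs = list(combinations(contigs, 2))
junctions = candidate_junctions(contigs)

print(len(pairs))      # unordered contig pairs
print(len(junctions))  # end-to-end junctions across those pairs
```

Under this assumption there are 15 contig pairs and 60 end-to-end junctions to test by PCR.
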

BCO-DMO Processing:
– modified parameter names to conform with BCO-DMO naming conventions;
– added hyperlinks;
– removed “m” (meters) in depth column.
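
The last processing step, stripping the unit from the depth column, can be sketched as follows. This is an illustrative Python example rather than the code BCO-DMO actually used. The raw formats it accepts (such as "520 m" or "520m") and the function name are assumptions, and the numbers are placeholders rather than dataset values.

```python
import re

# Optional sign, digits, optional decimal part, optional trailing "m".
_DEPTH_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*m?\s*$")


def depth_to_float(value):
    """Convert a depth string with an optional 'm' suffix to a float."""
    match = _DEPTH_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized depth value: {value!r}")
    return float(match.group(1))


for raw in ("520 m", "520m", "520"):
    print(depth_to_float(raw))
```
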

Instruments

Jason 2 [ROV Jason]
Details
The Remotely Operated Vehicle (ROV) Jason is operated by the Deep Submergence Laboratory (DSL) at Woods Hole Oceanographic Institution (WHOI). WHOI engineers and scientists designed and built the ROV Jason to give scientists access to the seafloor without having to leave the deck of the ship. Jason is a two-body ROV system. A 10-kilometer (6-mile) fiber-optic cable delivers electrical power and commands from the ship through Medea and down to Jason, which then returns data and live video imagery. Medea serves as a shock absorber, buffering Jason from the movements of the ship, while providing lighting and a bird’s eye view of the ROV during seafloor operations. During each dive (deployment of the ROV), Jason pilots and scientists work from a control room on the ship to monitor Jason’s instruments and video while maneuvering the vehicle and optionally performing a variety of sampling activities. Jason is equipped with sonar imagers, water samplers, video and still cameras, and lighting gear. Jason’s manipulator arms collect samples of rock, sediment, or marine life and place them in the vehicle’s basket or on "elevator" platforms that float heavier loads to the surface. More information is available from the operator site at URL.
Instance Description

Libraries were prepared using Nextera DNA sample prep kits (Illumina, San Diego, CA, USA) and sequenced by Roche 454 GS FLX Titanium (454 Life Sciences, Branford, CT, USA) and/or using Illumina HiSeq 2000 paired reads (Illumina).

General term for a laboratory instrument used for deciphering the order of bases in a strand of DNA. Sanger sequencers detect fluorescence from different dyes that are used to identify the A, C, G, and T extension reactions. Contemporary, or pyrosequencing, methods are based on detecting the activity of DNA polymerase (a DNA synthesizing enzyme) with another chemiluminescent enzyme. Essentially, the method allows sequencing of a single strand of DNA by synthesizing the complementary strand along it, one base pair at a time, and detecting which base was actually added at each step.

Parameters

sequencing_center [unknown]
Details
sequencing_center
Name of sequencing center.
association with a community-wide standard parameter is not yet defined
domain [unknown]
Details
domain
Domain of sample.
association with a community-wide standard parameter is not yet defined
phylum [phylum]
Details
phylum
Taxonomic phylum.

phylum, organism taxonomic level. May include super-phylum and sub-phylum

class [class]
Details
class
Taxonomic class.
Class. Taxonomic class of organism. May include super-class and sub-class.
order [order]
Details
order
Taxonomic order.

order, organism taxonomic level; may include super-order and sub-order

family [family]
Details
family
Taxonomic family.

Family. One of the levels in the taxonomic system of classification; typically ends in 'ae'. May include super-family and sub-family.

genus [genus]
Details
genus
Taxonomic genus.
taxonomic genus of organism
study_name [unknown]
Details
study_name
Name of study.
association with a community-wide standard parameter is not yet defined
sample_name [unknown]
Details
sample_name
Name/identifier of the sample.
association with a community-wide standard parameter is not yet defined
taxon_oid [unknown]
Details
taxon_oid
Taxon identifier (OID).
association with a community-wide standard parameter is not yet defined
species [species]
Details
species
Species identifier.

a taxonomic binomial that consists of a genus name followed by the species name of an organism

NCBI_accession_num [unknown]
Details
NCBI_accession_num
NCBI accession number.
association with a community-wide standard parameter is not yet defined
accession_url [unknown]
Details
accession_url
Hyperlink to NCBI for the accession number.
association with a community-wide standard parameter is not yet defined
IMG_genome_ID [unknown]
Details
IMG_genome_ID
IMG database (http://img.jgi.doe.gov/) genome identifier.
association with a community-wide standard parameter is not yet defined
NCBI_taxon_ID [unknown]
Details
NCBI_taxon_ID
NCBI taxon identifier.
association with a community-wide standard parameter is not yet defined
IMG_submission_ID [unknown]
Details
IMG_submission_ID
IMG database (http://img.jgi.doe.gov/) submission identifier.
association with a community-wide standard parameter is not yet defined
GOLD_study_ID [unknown]
Details
GOLD_study_ID
GOLD database (https://gold.jgi.doe.gov/) study identifier.
association with a community-wide standard parameter is not yet defined
GOLD_study_url [unknown]
Details
GOLD_study_url
Hyperlink to GOLD database (https://gold.jgi.doe.gov/) for the study.
association with a community-wide standard parameter is not yet defined
GOLD_project_ID [unknown]
Details
GOLD_project_ID
GOLD database (https://gold.jgi.doe.gov/) project identifier.
association with a community-wide standard parameter is not yet defined
GOLD_project_url [unknown]
Details
GOLD_project_url
Hyperlink to GOLD database (https://gold.jgi.doe.gov/) for the project.
association with a community-wide standard parameter is not yet defined
GOLD_analysis_project_ID [unknown]
Details
GOLD_analysis_project_ID
GOLD database (https://gold.jgi.doe.gov/) analysis project identifier.
association with a community-wide standard parameter is not yet defined
GOLD_analysis_project_url [unknown]
Details
GOLD_analysis_project_url
Hyperlink to GOLD database (https://gold.jgi.doe.gov/) for the analysis project identifier.
association with a community-wide standard parameter is not yet defined
GOLD_analysis_project_type [unknown]
Details
GOLD_analysis_project_type
GOLD database (https://gold.jgi.doe.gov/) project type.
association with a community-wide standard parameter is not yet defined
gene_model_QC [unknown]
Details
gene_model_QC
Gene model QC? (yes/no)
association with a community-wide standard parameter is not yet defined
submission_type [unknown]
Details
submission_type
Submission type.
association with a community-wide standard parameter is not yet defined
strain [unknown]
Details
strain
Strain.
association with a community-wide standard parameter is not yet defined
is_public [unknown]
Details
is_public
Is the dataset public? (yes/no)
association with a community-wide standard parameter is not yet defined
high_quality [unknown]
Details
high_quality
Is it a high quality dataset? (yes/no)
association with a community-wide standard parameter is not yet defined
add_date [unknown]
Details
add_date
?
association with a community-wide standard parameter is not yet defined
biotic_relationships [unknown]
Details
biotic_relationships
Description of the biotic relationships.
association with a community-wide standard parameter is not yet defined
cell_shape [unknown]
Details
cell_shape
Description of the cell shape.
association with a community-wide standard parameter is not yet defined
contact_email [unknown]
Details
contact_email
Contact email address.
association with a community-wide standard parameter is not yet defined
contact_name [unknown]
Details
contact_name
Contact name.
association with a community-wide standard parameter is not yet defined
culture_type [unknown]
Details
culture_type
Culture type.
association with a community-wide standard parameter is not yet defined
cultured [unknown]
Details
cultured
Cultured? (yes/no)
association with a community-wide standard parameter is not yet defined
depth [depth]
Details
depth
Depth.

Observation/sample depth below the sea surface. Units often reported as: meters, feet.


When used in a JGOFS/GLOBEC dataset the depth is a best estimate; usually but not always calculated from pressure; calculated either from CTD pressure using Fofonoff and Millard (1982; UNESCO Tech Paper #44) algorithm adjusted for 1980 equation of state for seawater (EOS80) or simply equivalent to nominal depth as recorded during sampling if CTD pressure was unavailable.

ecosystem [unknown]
Details
ecosystem
Description of ecosystem.
association with a community-wide standard parameter is not yet defined
ecosystem_category [unknown]
Details
ecosystem_category
Description of ecosystem category.
association with a community-wide standard parameter is not yet defined
ecosystem_subtype [unknown]
Details
ecosystem_subtype
Description of ecosystem sub-type.
association with a community-wide standard parameter is not yet defined
ecosystem_type [unknown]
Details
ecosystem_type
Description of ecosystem type.
association with a community-wide standard parameter is not yet defined
energy_source [unknown]
Details
energy_source
Energy source.
association with a community-wide standard parameter is not yet defined
GOLD_sequencing_strategy [unknown]
Details
GOLD_sequencing_strategy
GOLD database (https://gold.jgi.doe.gov/) sequencing strategy.
association with a community-wide standard parameter is not yet defined
gram_staining [unknown]
Details
gram_staining
Type of gram staining.
association with a community-wide standard parameter is not yet defined
habitat [unknown]
Details
habitat
Description of habitat.
association with a community-wide standard parameter is not yet defined
isolation [unknown]
Details
isolation
Description of isolation.
association with a community-wide standard parameter is not yet defined
lat [latitude]
Details
lat
Latitude.

latitude, in decimal degrees, North is positive, negative denotes South; Reported in some datasets as degrees, minutes

longhurst_code [unknown]
Details
longhurst_code
Longhurst code.
association with a community-wide standard parameter is not yet defined
longhurst_descrip [unknown]
Details
longhurst_descrip
Longhurst description.
association with a community-wide standard parameter is not yet defined
lon [longitude]
Details
lon
Longitude.

longitude, in decimal degrees, East is positive, negative denotes West; Reported in some datasets as degrees, minutes

motility [unknown]
Details
motility
Motility.
association with a community-wide standard parameter is not yet defined
O2_requirement [unknown]
Details
O2_requirement
O2 requirements.
association with a community-wide standard parameter is not yet defined
project_name [unknown]
Details
project_name
Project name.
association with a community-wide standard parameter is not yet defined
relevance [unknown]
Details
relevance
Relevance.
association with a community-wide standard parameter is not yet defined
sporulation [unknown]
Details
sporulation
Type of sporulation.
association with a community-wide standard parameter is not yet defined
temp_range [unknown]
Details
temp_range
Description of temperature range.
association with a community-wide standard parameter is not yet defined
gene_count [unknown]
Details
gene_count
Gene count.
association with a community-wide standard parameter is not yet defined
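
Since a TSV parser returns every field as text, the yes/no and numeric parameters listed above usually need type conversion before analysis. The Python sketch below shows one way to do this. The column groupings follow the parameter descriptions on this page, but the exact spellings of values in the file (for example "Yes" versus "yes") are an assumption, the function name is hypothetical, and the sample row contains placeholder values only.

```python
YES_NO_COLUMNS = ("gene_model_QC", "is_public", "high_quality", "cultured")
FLOAT_COLUMNS = ("depth", "lat", "lon")
INT_COLUMNS = ("gene_count",)


def coerce_row(row):
    """Return a copy of `row` with yes/no and numeric fields converted.

    Values that cannot be interpreted become None; other columns are
    left as strings.
    """
    typed = dict(row)
    for name in YES_NO_COLUMNS:
        if name in typed:
            text = typed[name].strip().lower()
            typed[name] = {"yes": True, "no": False}.get(text)
    for names, convert in ((FLOAT_COLUMNS, float), (INT_COLUMNS, int)):
        for name in names:
            if name in typed:
                try:
                    typed[name] = convert(typed[name])
                except ValueError:
                    typed[name] = None
    return typed


# Placeholder row for illustration; not actual dataset values.
example = coerce_row({
    "strain": "JS085",
    "is_public": "Yes",
    "cultured": "no",
    "high_quality": "nd",
    "depth": "520",
    "lat": "12.5",
    "gene_count": "1800",
})
print(example)
```
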

Dataset Maintainers

Name | Affiliation | Contact
Julie A. Huber | Marine Biological Laboratory (MBL) |
Shannon Rauch | Marine Biological Laboratory (MBL) |

BCO-DMO Project Info

Project Title Functional gene diversity and expression in ocean crust microbial communities
Acronym NP Functional Gene Div
URL https://www.bco-dmo.org/project/637566
Created February 1, 2016
Modified February 1, 2016
Project Description

Project description from C-DEBI:
The objective of this project is to determine the diversity, phylogeny, and expression of functional genes involved in carbon, hydrogen, and sulfur cycling in North Pond crustal fluids. These formation fluids are expected to be representative of the ubiquitous cold ocean crust habitat, where reactions between the water and mineral rock surfaces create substrates suitable for sustaining a potentially large reservoir of microbial life. Information regarding crustal microbial communities and the energy sources available for microbial metabolism has been limited by the inaccessibility of samples. IODP Expedition 336 will provide a unique opportunity to access deep subsurface formation fluids from North Pond, including sampling from multiple depth horizons within oceanic crust. My goal is to develop quantitative polymerase chain reaction assays to determine the expression of functional genes in order to increase our understanding of microbial metabolisms in deep subsurface environments.

This project was funded by a C-DEBI Postdoctoral Fellowship to Julie Meyer (formerly at the Marine Biological Laboratory).

Data Project Maintainers
Name | Affiliation | Role
Julie L. Meyer | University of Florida (UF) | Principal Investigator
Julie A. Huber | Marine Biological Laboratory (MBL) | Co-Principal Investigator