NMDC Multi-omics
nmdc_arkin
Tenant: NMDC · Snapshot 2026-04-29T21:35:53.838483+00:00
Schema status: discovered
Curation status: curated
Source: berdl-spark-connect://metrics.berdl.kbase.us:443
Philosophy
Enable integrated microbiome analysis across multiple omics layers. Combine metabolomics, proteomics, and metagenomics with standardized annotations and embeddings for comprehensive sample characterization.
Citation & Attribution
Provider: NMDC
Website: https://microbiomedata.org/
Schema Browser
Tables (63)
abiotic_embeddings abiotic_features annotation_crossrefs annotation_hierarchies_unified annotation_terms_unified biochemical_embeddings biochemical_features biochemical_features_metadata centrifuge_gold cog_categories cog_hierarchy_flat contig_taxonomy contig_taxonomy_backup covstats_taxonomy_rollup ec_hierarchy_flat ec_hierarchy_graph ec_terms embedding_metadata embeddings_v1 go_hierarchy_flat go_hierarchy_graph go_terms gottcha_gold kegg_ko_module kegg_ko_pathway kegg_ko_terms kegg_module_terms kegg_pathway_terms kraken_gold lipidomics_gold metabolomics_gold metacyc_hierarchy_flat metacyc_hierarchy_graph metacyc_pathway_reactions metacyc_pathways metacyc_reaction_ec metatranscriptomics_gold nom_feature_metadata nom_gold nom_matrix_optimized omics_files_table proteomics_gold rhea_crossrefs rhea_reactions sample_file_lookup sample_file_selections sample_tokens_v1 study_table taxonomy_dim taxonomy_embeddings taxonomy_family_embeddings taxonomy_features taxonomy_genus_embeddings taxonomy_order_embeddings taxonomy_phylum_embeddings taxstring_lookup trait_embeddings trait_features trait_sources trait_taxonomy_mapping trait_unified unified_embeddings vocab_registry_v1
go_hierarchy_flat
| Column | Type | Description |
|---|---|---|
Sample Queries
Get NMDC studies
SELECT *
FROM nmdc_arkin.study_table
Get metabolomics data
SELECT *
FROM nmdc_arkin.metabolomics_gold
LIMIT 20
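The two sample queries follow the same pattern: a schema-qualified table name with an optional row limit. As a minimal sketch of reusing that pattern for any other table in the list, the hypothetical helpers below (`qualified`, `preview_query`, `describe_query` are illustrative names, not part of BERDL or NMDC tooling) build the SQL text only. They assume the collection is queried with Spark SQL, as the Spark Connect source suggests, and that `DESCRIBE TABLE` is permitted for your account. How you open a session and submit the statement is left to your environment.

```python
import re

# Database name shown at the top of this page.
SCHEMA = "nmdc_arkin"

# Table names listed on this page are plain snake_case identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified(table: str) -> str:
    """Return 'schema.table', rejecting anything that is not a plain identifier."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"not a plain table identifier: {table!r}")
    return f"{SCHEMA}.{table}"


def preview_query(table: str, limit: int = 20) -> str:
    """Build the SELECT * ... LIMIT n pattern used in the sample queries."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT *\nFROM {qualified(table)}\nLIMIT {limit}"


def describe_query(table: str) -> str:
    """Build a Spark SQL DESCRIBE statement that lists a table's columns and types."""
    return f"DESCRIBE TABLE {qualified(table)}"


if __name__ == "__main__":
    # Print the statements; run them with whatever SQL client you already use.
    print(preview_query("proteomics_gold"))
    print(describe_query("go_hierarchy_flat"))
```

Checking the table name against a plain-identifier pattern keeps arbitrary text out of the generated SQL. The `DESCRIBE TABLE` statement is one way to inspect column names and types before writing a more selective query.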
Related Collections
Projects Using This Collection
ENIGMA Carbon Census 1
For 83 groundwater- and necromass-derived carbon compounds proposed for community enrichment and isolate phenotyping, wh...
BERDL Data Atlas — Inventory, Topic Map, and Cross-Reference Synergies
What data is available in BERDL (across tenants, agencies, and programs), what biological topics does it cover, and wher...
Harvard Forest Long-Term Warming — DNA vs RNA Functional Response
After ~25 years of +5°C experimental soil warming at the Harvard Forest Barre Woods plot, does the functional transcript...
Gene Function Ecological Agora
Across the prokaryotic tree (GTDB r214; 293,059 genomes / 27,690 species), build a multi-resolution innovation + acqui...
Plant Microbiome Ecotypes
What is the genomic basis for plant-microbe associations across different plant compartments (rhizosphere, root, phyllos...
Environmental Resistome at Pangenome Scale
Do antimicrobial resistance gene profiles differ between ecological niches across 27,000 bacterial species? Using 83K AM...
Functional Dark Matter — Experimentally Prioritized Novel Genetic Systems
Which genes of unknown function across 48 bacteria have strong fitness phenotypes, and can biogeographic patterns, pathw...
Community Metabolic Ecology via NMDC × Pangenome Integration
Do the GapMind-predicted pathway completeness profiles of community resident taxa predict or correlate with observed met...
Prophage Gene Modules and Terminase-Defined Lineages Across Bacterial Phylogeny and Environmental Gradients
How are prophage gene modules and terminase-defined prophage lineages distributed across bacterial phylogeny and environ...
Polyhydroxybutyrate Granule Formation Pathways: Distribution Across Clades and Environmental Selection
How are polyhydroxybutyrate (PHB) granule-forming pathways distributed across bacterial clades and environments, and doe...
Atlas Pages
Ecotype labels versus translational leakage
Ecotype labels are reusable stratification products, but translational target lists can collapse when labels and outcomes share leaked or confounded features.
meta: BERDL Data Atlas
Entry point for BERDL tenants, collections, data types, derived products, join recipes, reuse patterns, and missing complementary data.
data type: Multi-Omics, Embeddings, and Molecular Profiles
Metabolomics, proteomics, trait profiles, embeddings, and other matrix-style summaries that create reusable sample or organism representations.
derived product: Ecotype Assignments
Reusable within-species or community ecotype labels that support environmental validation, microbiome stratification, and downstream hypothesis tests.
derived product: Environment Harmonization Labels
Reusable environment category and coordinate-quality labels that make cross-collection ecology joins safer.
data collection: NMDC Multi-omics
Multi-omics analysis data (annotations, embeddings, metabolomics, proteomics, traits)
opportunity: Plant Microbiome Function Validation
Validate whether plant microbiome functional signals persist across ecotype labels, pangenome context, and environmental metadata.
topic: Microbial Ecotypes, Environment, and Field Validation
Synthesis of species-level ecotypes, environmental embeddings, lab-field validation, ENIGMA ecology, and metadata limitations.
topic: Plant Microbiome Function and Agriculture
Synthesis of plant-associated microbial function, beneficial/pathogenic duality, compartment structure, PGP markers, and pangenome ecology.
topic: Metabolic Capability, Dependency, and Community Design
Synthesis of GapMind capability, fitness dependency, metabolic models, community ecology, and design-ready derived data.
Atlas Reuse
Ecotype Assignments
promoted · 2 downstream projects
label_set: Environment Harmonization Labels
promoted · 2 downstream projects
partially resolved: Ecotype labels versus translational leakage
Ecotype labels are reusable stratification products, but translational target lists can collapse when labels and outcomes share leaked or confounded features.