48 Collections
166 Connections
47 Cross-Collection Projects
4 Explorer Projects

Collection Network

Click a collection to see its connections. Edges come from explicit links (schema relationships) and projects that use multiple collections (project co-usage).

📂 Arkinlab Dbcan 0 connections
📂 Arkinlab Microbeatlas 0 connections
📂 Arkinlab Mobilome 0 connections
📂 Bervodata Chess 0 connections
📂 Bervodata Fao Soils 0 connections
📂 Bervodata Hwsd2 0 connections
🪰 ENIGMA CORAL 15 connections
📂 Enigma Genome Depot Enigma 0 connections
📂 Kbase All The Bacteria 0 connections
💬 KBase Genomes 16 connections
🧬 Pangenome Collection 21 connections
ModelSEED Biochemistry 18 connections
📄 Ontologies 0 connections
🌱 Phenotype Collection 11 connections
📄 UniProt Annotations 14 connections
📄 Kbase Uniref100 11 connections
📄 Kbase Uniref50 13 connections
📄 Kbase Uniref90 11 connections
📂 Kescience Alphafold 19 connections
📂 Kescience Bacdive 20 connections
📊 Fitness Browser 21 connections
📂 Kescience Interpro 0 connections
📂 Kescience Mgnify 0 connections
📂 Kescience Paperblast 16 connections
📂 Kescience Pdb 19 connections
📂 Kescience Pubmed 0 connections
📂 Kescience Webofmicrobes 19 connections
📂 Msyscolo Grow 0 connections
📂 Netl Pw Dna 0 connections
🔬 NMDC Multi-omics 19 connections
📂 Nmdc Metadata 0 connections
🔬 NMDC BioSamples 5 connections
📂 Nmdc Results 0 connections
📂 Pangenome Bakta 0 connections
🦠 Phagefoundry Acinetobacter Genome Browser 14 connections
📂 Phagefoundry Ecoliphages Genomedepot 0 connections
📂 Phagefoundry Ecoliphagesgenomedepot 0 connections
🦠 Phagefoundry Klebsiella Genome Browser Genomedepot 0 connections
🦠 Phagefoundry Paeruginosa Genome Browser 5 connections
🦠 Phagefoundry Pviridiflava Genome Browser 0 connections
📂 Phagefoundry Strain Modelling 15 connections
🌊 Planetmicrobe Planetmicrobe 14 connections
🌊 Planetmicrobe Planetmicrobe Raw 0 connections
📂 Plantmicrobeinterfaces Gtdb Mapping 0 connections
🦠 PROTECT Pathogen Browser 16 connections
📂 Protect Integration 0 connections
📂 Protect Mind 0 connections
📂 Usgs Produced Waters 0 connections
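
The edge rule described at the top of this section (explicit links plus project co-usage) can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the function names, the project name `paperblast_explorer`, and the sample inputs are hypothetical, and counting a collection's connections as its number of distinct neighbors is an assumption about how the totals above are computed.

```python
from collections import defaultdict
from itertools import combinations


def build_edges(explicit_links, project_collections):
    """Build undirected edges between collections.

    explicit_links: iterable of (collection_a, collection_b) pairs.
    project_collections: mapping of project name -> list of collections used.
    Returns {frozenset({a, b}): set of edge origins}.
    """
    edges = defaultdict(set)
    # Edges from explicit links (schema relationships).
    for a, b in explicit_links:
        if a != b:
            edges[frozenset((a, b))].add("explicit")
    # Edges from project co-usage: every pair of collections in one project.
    for collections in project_collections.values():
        for a, b in combinations(sorted(set(collections)), 2):
            edges[frozenset((a, b))].add("co-usage")
    return dict(edges)


def connection_counts(edges):
    """Count distinct neighbors per collection (assumed counting rule)."""
    counts = defaultdict(int)
    for pair in edges:
        for name in pair:
            counts[name] += 1
    return dict(counts)


# Illustrative inputs only; real link and project data are not shown on this page.
explicit_links = [("kbase_genomes", "kbase_ke_pangenome")]
project_collections = {
    "paperblast_explorer": ["kescience_paperblast", "kescience_fitnessbrowser"],
}

edges = build_edges(explicit_links, project_collections)
counts = connection_counts(edges)
```

Keying each edge by a `frozenset` keeps the graph undirected, so a pair that is both explicitly linked and co-used in a project is stored once with both origins recorded.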

Cross-Collection Join Paths

Documented ways to connect data across collections. Each path shows the relationship and linking strategy between two collections.

Source: enigma_coral
Target: kescience_fitnessbrowser
Bridging projects:

Source: kbase_genomes
Target: kbase_ke_pangenome
Bridging projects:

Source: kbase_phenotype
Target: kescience_fitnessbrowser
Bridging projects:

Source: nmdc_ncbi_biosamples
Target: nmdc_arkin
Bridging projects:
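
One way to hold the join paths listed above as records is sketched below. The `JoinPath` class and the `paths_touching` helper are hypothetical names introduced for illustration. The source and target identifiers are taken from the list above, and the bridging projects are left empty because none are shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinPath:
    """A documented connection between two collections (hypothetical record type)."""
    source: str
    target: str
    bridging_projects: tuple = ()  # empty: no bridging projects are listed above


JOIN_PATHS = [
    JoinPath("enigma_coral", "kescience_fitnessbrowser"),
    JoinPath("kbase_genomes", "kbase_ke_pangenome"),
    JoinPath("kbase_phenotype", "kescience_fitnessbrowser"),
    JoinPath("nmdc_ncbi_biosamples", "nmdc_arkin"),
]


def paths_touching(collection, paths=JOIN_PATHS):
    """Return the join paths in which a collection is the source or the target."""
    return [p for p in paths if collection in (p.source, p.target)]


fitness_paths = paths_touching("kescience_fitnessbrowser")
```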

Explorer Project Highlights

Deep-dive explorations of BERDL collections, characterizing their content, cross-collection links, and research potential.

AlphaEarth Embeddings, Geography & Environment Explorer

Completed

What do AlphaEarth environmental embeddings capture, and how do they relate to geographic coordinates and NCBI environment labels?

Pangenome Collection
  • 1. Environmental samples show 3.4x stronger geographic signal than human-associated samples
  • 2. AlphaEarth embeddings encode real geographic signal — not noise
  • 3. Strong clinical/human sampling bias in the AlphaEarth subset
  • 4. 36% of coordinates flagged as potential institutional addresses
  • 5. UMAP reveals fine-grained embedding structure with environment-correlated clusters
  • 6. Embedding space also shows taxonomic structure

PaperBLAST Data Explorer

Completed

What does the `kescience_paperblast` collection contain, how current is it, and what are its coverage patterns across organisms, domains of life, and functional databases?

Kescience Paperblast, Fitness Browser
  • Finding 1: One organism dominates nearly half of all literature
  • Finding 2: 65.6% of genes have exactly one paper
  • Finding 3: Literature inequality is extreme — Lorenz curves
  • Finding 4: Bacterial research is concentrated on pathogens
  • Finding 5: 345K protein families from 816K sequences
  • Finding 6: 55% of protein families are dark or dim

Web of Microbes Data Explorer

Completed

What does the `kescience_webofmicrobes` exometabolomics collection contain, which organisms overlap with the Fitness Browser, and how well do metabolite uptake/release profiles connect to pangenome-pr...

Fitness Browser, Pangenome Collection, ModelSEED Biochemistry, Kescience Webofmicrobes
  • 1. WoM Action Encoding Uses Four Distinct Semantics, Not Three
  • 2. Two Direct Fitness Browser Strain Matches Plus Two Genus-Level Matches
  • 3. 19 WoM-Produced Metabolites Are Tested as FB Carbon/Nitrogen Sources
  • 4. 26.8% of WoM Metabolites Have Definitive ModelSEED Links (68.5% with Ambiguous Formula Matches)
  • 5. ENIGMA Isolates Show Distinct "Metabolic Novelty Rates"
  • 6. All WoM Genera Have Pangenome Species Clades

Acinetobacter baylyi ADP1 Data Explorer

Completed

What is the scope and structure of a comprehensive ADP1 database, and how do its annotations, metabolic models, and phenotype data intersect with BERDL collections (pangenome, biochemistry, fitness, P...

Pangenome Collection, Kbase Uniref50, Fitness Browser, Phagefoundry Acinetobacter Genome Browser, ModelSEED Biochemistry
  • 1. Rich Multi-Omics Database with 6 Data Modalities
  • 2. Strong BERDL Connectivity: 4 of 5 Connection Types at >90% Match
  • 3. Pangenome Cluster ID Bridge: 100% Mapping via Gene Junction Table
  • 4. FBA and TnSeq Essentiality Agree 74% of the Time
  • 5. Condition-Specific Fitness: Urea and Quinate Stand Apart
  • 6. Essential Genes Are 6x More Likely to Have COG Annotations
  • 7. Highly Conserved Core Metabolism Across 14 Genomes
  • 8. 87% of Growth Predictions Depend on Gapfilled Reactions