BERIL Atlas
A navigable synthesis layer over BERIL projects, collections, claims, derived products, and research opportunities.
Start Here
Open overview page
Science and application maps
Narrative synthesis across projects, with drill-down into claims, directions, hypotheses, and data.
Data: Collections and reusable products
BERDL tenants, databases, data types, join recipes, derived products, reuse signals, and gaps.
Reuse: Project and derived-product graph
Producers, consumers, artifacts, review routes, and products that need downstream use.
Claims: Evidence-backed primitives
Reusable statements that preserve provenance and caveats for future humans and agents.
Tensions: Conflicts and resolving work
Places where results differ, caveats change interpretation, or new experiments are needed.
Opportunities: Next analyses to run
Actionable analysis and experiment seeds grounded in tensions, products, data gaps, and claims.
Hypotheses: Testable next steps
Concrete project and experiment seeds grounded in what the observatory already knows.
Generated Maps
Built from Atlas frontmatter and inventory signals
Topic Landscape
Science topics are the first interpretive map: each one connects projects to reusable claims, directions, hypotheses, and data.
9 topics integrate 69 project references.
Data Landscape
Data pages separate physical BERDL collections from cross-collection analytical roles and reusable outputs.
Pangenome data for 293,059 genomes across 27,690 microbial species derived from GTDB r214. Includes core/accessory gene classification, functional annotations, and ANI relationships.
Gene fitness data from transposon mutant experiments across 40+ bacterial organisms, used to identify essential genes and condition-specific fitness effects.
Subsurface microbial ecology and geochemistry data from the ENIGMA SFA project at the Oak Ridge Reservation (ORR), Tennessee. Covers environmental sampling (groundwater from boreholes), 16S amplicon community profiling, isolated bacterial strains with genomes, and geochemical measurements. Includes the SSO (subsurface observatory) site with spatial and temporal community profiles.
Discovered BERDL database. Add curated description as this collection is used.
48/48 collections have Atlas pages.
Claim to Experiment Flow
The Atlas should let a reader move from synthesis to reusable claims, then into directions and concrete hypotheses.
Claims: Evidence-backed statements reused across topics.
Directions: High-value research opportunities grounded in existing work.
Hypotheses: Testable units that can become projects or experiments.
Opportunities: Concrete next analyses prioritized from Atlas evidence and reusable products.
8 claims, 5 directions, 8 hypotheses, and 12 opportunities are currently indexed.
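The flow from claims to directions to hypotheses can be pictured as a small directed graph of pages. The sketch below is only illustrative: the `AtlasPage` class, its `supports` field, and the placeholder page ids and titles are assumptions for this example, not the Atlas's actual schema or content.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AtlasPage:
    """Hypothetical page record; field names are assumptions for illustration."""
    page_id: str
    kind: str  # e.g. "claim", "direction", "hypothesis", "opportunity"
    title: str
    supports: list = field(default_factory=list)  # ids of downstream pages


def trace_flow(pages, start_id):
    """Walk downstream links breadth-first and return (kind, title) pairs."""
    by_id = {p.page_id: p for p in pages}
    seen = {start_id}
    queue = deque([start_id])
    order = []
    while queue:
        current = by_id[queue.popleft()]
        order.append((current.kind, current.title))
        for next_id in current.supports:
            # Skip dangling links and pages already visited.
            if next_id in by_id and next_id not in seen:
                seen.add(next_id)
                queue.append(next_id)
    return order


# Placeholder pages; the links between them are made up for the example.
pages = [
    AtlasPage("claim-1", "claim", "Example claim", ["direction-1"]),
    AtlasPage("direction-1", "direction", "Example direction", ["hypothesis-1"]),
    AtlasPage("hypothesis-1", "hypothesis", "Example hypothesis"),
]
flow = trace_flow(pages, "claim-1")
```

Starting from a claim, `flow` lists each page a reader would pass through on the way to a testable hypothesis.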
Derived Product Reuse
Reusable scores, labels, mappings, and joins are the outputs most likely to compound across projects.
8 derived products are currently tracked from 74 data pages.
Tension and Resolution
Apparent conflicts are preserved as reviewable objects with evidence on multiple sides and resolving work.
4 tension pages track where synthesis needs reconciliation.
Metrics To Watch
Open metrics method
Collection Coverage
Canonical BERDL databases with data_collection Atlas pages.
Cross-Collection Reuse
Projects that combine two or more BERDL collections.
Under-Explored Collections
Collections with no parsed project references yet.
Dark-Matter Metadata
Collections needing stronger curation or complete schema discovery.
Caveat Load
Low-confidence Atlas pages that need review before heavy reuse.
Evidence Coverage
Claims, directions, hypotheses, derived products, and opportunities with evidence metadata.
Derived Product Reuse
Promoted derived products with at least one declared downstream project.
Unresolved Tensions
Conflict pages still needing resolving analysis or experiments.
Opportunity Coverage
Concrete next analyses connected to Atlas evidence, products, and tensions.
Blocked Opportunities
Opportunity pages that cannot proceed until prerequisite data or review is available.
Topic Drill-Down Depth
Topic pages with enough sections and related links to support progressive disclosure.
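As a rough illustration of how three of the metrics above (Collection Coverage, Cross-Collection Reuse, and Under-Explored Collections) could be computed, here is a minimal sketch. It assumes page frontmatter has been parsed into dictionaries. The `type` and `database` keys, the `project_refs` mapping, and all sample names are hypothetical. Only the `data_collection` page type comes from the metric description itself.

```python
def collection_coverage(databases, pages):
    """Fraction of canonical databases that have a data_collection page."""
    if not databases:
        return 0.0
    covered = {p["database"] for p in pages if p.get("type") == "data_collection"}
    return len(covered & set(databases)) / len(databases)


def cross_collection_projects(project_refs):
    """Projects that combine two or more distinct collections."""
    return sorted(
        name for name, cols in project_refs.items() if len(set(cols)) >= 2
    )


def under_explored(databases, project_refs):
    """Collections with no parsed project references yet."""
    used = {c for cols in project_refs.values() for c in cols}
    return sorted(set(databases) - used)


# Hypothetical inputs; real values would come from Atlas frontmatter.
databases = ["db_a", "db_b", "db_c"]
pages = [
    {"type": "data_collection", "database": "db_a"},
    {"type": "data_collection", "database": "db_b"},
    {"type": "topic", "database": "db_c"},  # not a collection page
]
project_refs = {"proj_1": ["db_a", "db_b"], "proj_2": ["db_a"]}

coverage = collection_coverage(databases, pages)
reuse = cross_collection_projects(project_refs)
unexplored = under_explored(databases, project_refs)
```

Keeping each metric as a pure function over parsed records makes it easy to recompute the numbers whenever pages are added or curated.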
Critical Minerals and Metal Biology
Progressive synthesis of metal fitness, tolerance, validation, and critical-mineral research opportunities.
AMR, Resistance Ecology, and Co-selection
Synthesis of AMR gene distribution, fitness cost, cofitness support networks, environment structure, and metal co-selection opportunities.
Pangenome Architecture and Gene-Content Evolution
Cross-project synthesis of pangenome openness, core/accessory structure, functional composition, conservation, and gene-content tradeoffs.
Fitness-Validated Gene Function
Synthesis of essential genes, metabolic dependency, ICA modules, dark genes, and functional annotation repair through fitness evidence.
Microbial Ecotypes, Environment, and Field Validation
Synthesis of species-level ecotypes, environmental embeddings, lab-field validation, ENIGMA ecology, and metadata limitations.
Host Microbiome Translation
Synthesis of IBD phage targeting, formulation design, metabolomics caveats, patient stratification, and intervention cost accounting.
Plant Microbiome Function and Agriculture
Synthesis of plant-associated microbial function, beneficial/pathogenic duality, compartment structure, PGP markers, and pangenome ecology.
Mobile Elements, Phage, and Genome Plasticity
Synthesis of phage ecology, prophage signals, defense systems, mobile-element gene flow, and intervention relevance.
Metabolic Capability, Dependency, and Community Design
Synthesis of GapMind capability, fitness dependency, metabolic models, community ecology, and design-ready derived data.
Collections
48 pages
Pangenome Collection
Genome and pangenome tables for comparative genomics, conservation, openness, annotation, and cross-project derived products.
Fitness Browser Collection
RB-TnSeq fitness evidence used to validate gene function, stress response, essentiality, cofitness, and pathway dependency.
Genomes
Structural genomics data (contigs, features, protein sequences) in CDM format
Biochemistry
ModelSEED biochemical reactions, compounds, and stoichiometry for metabolic modeling
Phenotype
Experimental phenotype data (growth conditions and measurements)
UniProt
UniProt protein identifier cross-references
Data Types
10 pages
Genomes and Pangenomes
Data type lens for genome metadata, pangenome structure, annotations, and species-level comparative genomics.
Fitness Phenotypes
Data type lens for RB-TnSeq phenotypes, condition-specific gene effects, cofitness, essentiality, and pathway dependency.
Genes, Proteins, and Annotations
Gene and protein identifiers, functional annotation, literature coverage, orthology, and controlled vocabulary layers used to connect raw genomes to interpretable biology.
Experimental Phenotypes, Traits, and Assays
Measured organism and gene behavior across laboratory conditions, including mutant fitness, growth phenotypes, utilization traits, and exometabolomic observations.
Metabolism, Biochemistry, and Pathways
Biochemical reference data and predicted or measured metabolic capabilities that support pathway-level interpretation and community design.
Environment, Geochemistry, and Ecology
Environmental samples, coordinates, geochemistry, sample metadata, and ecology-facing observations used to validate laboratory predictions in the field.
Derived Products
8 pages
Metal Tolerance Scores
Reusable species and gene-level metal tolerance signals derived from metal fitness projects and environmental validation work.
Ecotype Assignments
Reusable within-species or community ecotype labels that support environmental validation, microbiome stratification, and downstream hypothesis tests.
Environment Harmonization Labels
Reusable environment category and coordinate-quality labels that make cross-collection ecology joins safer.
Dark Gene Prioritization Tables
Reusable ranked dark-gene candidates, covering sets, and experiment plans derived from fitness, pangenome, annotation, and ecology evidence.
Functional Innovation KO Atlas
Reusable clade-level functional innovation and acquisition-depth outputs from the ecological agora project.
AMR Fitness Profiles
Reusable AMR mechanism, conservation, environment, and fitness-cost signals for resistance ecology questions.
Joins
1 page
Gaps
2 pages
Missing SSO Geochemistry
ENIGMA SSO registered sample tubes exist, but linked geochemistry measurements are missing from the lakehouse.
Rare Earth Fitness Data Gap
Rare-earth-element fitness experiments appear absent, making cross-metal inference a prediction task rather than validated biology.
Lab fitness signals versus field ecology
Lab-derived fitness or tolerance scores can predict ecology in some settings, but metadata quality and field complexity limit broad generalization.
Ecotype labels versus translational leakage
Ecotype labels are reusable stratification products, but translational target lists can collapse when labels and outcomes share leaked or confounded features.
Metal specificity versus general stress
Metal fitness hits include both specific metal biology and broad stress response, so engineering targets need specificity filters and counter-ion controls.
Metal-AMR co-selection readiness
BERIL has the pieces to test metal-AMR co-selection, but the current evidence is a strong opportunity rather than a resolved result.
Ecotype Label Validation Benchmark
Build a benchmark that tests whether ecotype labels survive stricter metadata, batch, and holdout validation.
Metal-AMR Site Co-Selection Analysis
Test whether metal contamination at BER-relevant sites co-selects antibiotic resistance mechanisms using metal fitness, AMR profiles, and environmental metadata.
Dark Gene Structure Prioritization
Prioritize dark gene families for mechanistic review by joining fitness, cofitness, annotation novelty, and AlphaFold structure signals.
Lab-to-Field Fitness Transfer Audit
Audit where laboratory fitness effects predict field ecology and where geochemistry, taxonomy, or metadata completeness blocks transfer.
Rare-Earth RB-TnSeq Design
Design the first rare-earth fitness experiment by ranking candidate genes from cross-metal specificity, conservation, annotation, and structure evidence.
CF Formulation Score Reuse Test
Find a first downstream consumer for CF formulation scores by testing whether ranked carbon contexts improve strain or community design decisions.
Derived Product Readiness Burn-Down
Review candidate and promoted derived products to close missing consumers, artifacts, caveats, and review routes before they become default inputs.
Functional Innovation KO Atlas Reuse Test
Test whether the Functional Innovation KO Atlas helps explain pangenome, pathway, or plant-microbiome signals beyond generic annotation summaries.
Low-Confidence Collection Curation
Reduce Atlas caveat load by upgrading high-value low-confidence collection pages with schemas, reuse examples, and missing-data labels.
Research Primitives
Start with claims
Metal-specific genes remain core-enriched
Metal-specific genes are functionally distinct from general stress genes but remain enriched in the core genome.
Lab fitness can predict field ecology
Several projects suggest that lab-measured fitness signals can align with environmental abundance or isolation context when validation data are available.
AMR mechanism composition is environment-structured
AMR mechanisms differ by environment, making resistance ecology a field and context problem rather than only a clinical annotation problem.
Ecotype analyses need rigor gates before translation
Ecotype-derived target lists can collapse under leakage, confound, and independent-evidence checks, so translation requires explicit gates.
Lanthanide-dependent methylotrophy is widespread and soil-linked
XoxF markers are far more common than canonical MxaF markers across the BERDL pangenome, with strong soil/sediment enrichment and important marker-calibration caveats.
Pangenome openness shapes functional opportunity
Pangenome openness and gene-content class affect which functions are stable, variable, mobile, or available for niche adaptation.
Prophage density predicts AMR repertoire breadth
Pangenome-scale prophage marker density is a strong species-level predictor of AMR breadth, while gene-level AMR-prophage proximity is weaker and threshold-sensitive.
Metal type diversity predicts ecological niche breadth
Genus-level metal resistance type diversity predicts broader ecological niche breadth after phylogenetic control, while total AMR burden is less informative.
Gene targets for critical-mineral bioprocessing
Use metal-specific fitness signals, annotations, modules, and structures to prioritize genes for bioleaching and biorecovery.
Metal-AMR co-selection at contaminated sites
Test whether metal contamination selects for AMR genes, mechanisms, or support networks across DOE-relevant environments.
Rare-earth gene discovery via cross-metal inference
Use cross-metal response structure to rank candidates for first rare-earth-element fitness experiments.
Fitness-validated community design
Design microbial communities using tolerance, metabolic capability, measured dependency, and risk annotations.