Research Ideas
Track proposed, in-progress, and completed research directions. 53 ideas in total across all three statuses.
Proposed (15)
Ecotype Functional Differentiation
Do ecotypes within a species differ in their COG functional profiles?
Lifestyle-Based COG Stratification
How does lifestyle (free-living vs host-associated) affect pangenome functional composition?
Openness vs Functional Composition
Do "open" vs "closed" pangenomes show different COG enrichment patterns?
Scale to 100-200 Species
Do COG enrichment patterns hold at larger scale? Are there phylum-specific deviations?
Gene Copy Number Variation
Beyond presence/absence, do adaptive vs housekeeping genes show different copy number patterns?
Phylum-Specific Patterns Deep Dive
Which phyla deviate from universal COG enrichment patterns and why?
ANI Distance vs Ecotype Divergence
How does genomic distance (ANI) relate to functional divergence between ecotypes?
Temporal Evolution via ANI
Does COG enrichment pattern change with evolutionary distance?
Composite COG Function Networks
Do multi-functional genes (composite COGs like "LV", "EGP") represent functional modules?
Environmental Context of Core Gene Trade-offs
Can we connect the lab-measured trade-offs to natural environment data? Do organisms from more varia...
The 48 Accessory Modules
What are the 48 co-regulated gene modules that are <50% core? Are they mobile elements, niche-specif...
Plasmid vs Chromosomal Gene Functional Profiles
Do plasmid-borne genes show different COG profiles than chromosomal genes?
What Metabolic Functions Does the Cultured Collection Miss at Oak Ridge?
At Oak Ridge ENIGMA SFA, what metabolic functions does BERDL's cultured-isolate genome collection sy...
What Accessory Gene Content Distinguishes Deep-Clay Bacillota_B from Soil Congeners?
Within Bacillota_B (Desulfosporosinus, BRH-c8a Peptococcaceae, BRH-c4a Desulfotomaculales, etc.), wh...
(COMPLETED 2026-05-01)
In Progress (4)
Truly Dark Genes — What Remains Unknown After Modern Annotation?
Among the ~6,400 FB genes that remain hypothetical even after bakta v1.12.0 reannotation, what disti...
Pangenome Openness, Metabolic Pathways, and Biogeography
Do pangenome characteristics (open vs. closed) correlate with metabolic pathway diversity and biogeo...
Pan-bacterial Fitness Modules via ICA
Can robust ICA decomposition of RB-TnSeq fitness compendia reveal conserved functional modules acros...
Metabolic Capability vs Metabolic Dependency
Just because a bacterium's genome encodes a complete amino acid biosynthesis or carbon utilization p...
Completed (34)
Cross-Organism Essential Gene Families
Using FB's ortholog table, identify essential gene families conserved across multiple species. Are...
The 5,526 "Costly + Dispensable" Genes
AlphaFold MSA Depth as a Lens on the Bacterial Annotation Gap
Lignin Enrichment and Ecological Memory in Microbial Communities
ENIGMA Carbon Census — A Tiered Knowledge Census of 83 Enrichment Compounds
Regulatory and Proteomic Architecture of Δ*fur*-Permitted Lipid A Loss in *Caulobacter crescentus*
BERDL Data Atlas — Inventory, Topic Map, and Cross-Reference Synergies
Harvard Forest Long-Term Warming — DNA vs RNA Functional Response
Lanthanide Methylotrophy Atlas — Distribution and Environmental Context of REE-Dependent Methanol Oxidation Across 293K Genomes
Self-Sufficiency, Anaerobic Toolkit, and Cultivation Bias in Clay-Confined Cultured Bacterial Genomes
Metagenome-Prioritized Phage Cocktails for Crohn's Disease and IBD
Plant Microbiome Ecotype Functional Classification
PGP Gene Distribution Across Environments & Pangenomes
BacDive Phenotype Signatures of Metal Tolerance
Community Metabolic Ecology via NMDC × Pangenome Integration
Lab Fitness Predicts Field Ecology at Oak Ridge
Field vs Lab Gene Importance in DvH
Essential Gene Conservation Analysis
Quantitative Fitness Effects vs Conservation
Fitness Modules × Pangenome Conservation
Core Gene Paradox — Why Are Core Genes More Burdensome?
Co-fitness Predicts Co-inheritance
Cross-Project Synthesis
AlphaEarth Embeddings, Geography & Environment
Ecotype Reanalysis — Environmental-Only Samples
ADP1 Triple Essentiality Concordance
ADP1 Deletion Collection Phenotype Analysis
Aromatic Catabolism Support Network in ADP1
Condition-Specific Respiratory Chain Wiring in ADP1
Counter Ion Effects on Metal Fitness Measurements
Metabolic Consistency of Pseudomonas FW300-N2E3
Genotype × Condition → Phenotype Prediction from ENIGMA Growth Curves
SSO Subsurface Community Ecology — Contamination Plume Model
Prophage-AMR Co-mobilization Atlas
At pangenome scale (293K genomes, 27K species), are antibiotic resistance genes preferentially locat...