02B H2 Hypergeometric Verdict
Jupyter notebook from the Caulobacter Fur–Lipid A Loss project.
02b: H2 hypergeometric verdict (post-review threshold recalibration)
Project: caulobacter_fur_lipida_loss — Phase C, NB02b. Addresses adversarial review I1.
Purpose
NB02 used a pre-registered "≥10% phenotype-bearing" threshold for H2. The adversarial review (ADVERSARIAL_REVIEW_1.md, I1) correctly noted that this threshold sat below the genome background rate of 33.25% — passing the threshold therefore did not demonstrate enrichment. This notebook formalizes the hypergeometric-enrichment test against the genome background as the H2 verdict.
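The reviewer's point can be checked numerically. The sketch below is a stdlib-only illustration, not part of the NB02 pipeline: it asks how often a randomly drawn gene set would clear the old 10% threshold purely by chance. The set size of 32 is borrowed from the Path A row of the verdict table further down, and `hypergeom_tail` is a hypothetical helper name introduced here.

```python
from math import ceil, comb

N, K = 3943, 1311        # genome size and phenotype-bearing count used in this notebook
n = 32                   # Path A set size (from the verdict table); illustrative choice
old_threshold = 0.10     # NB02's fixed ">=10% phenotype-bearing" threshold

def hypergeom_tail(k, N, K, n):
    """Exact P(X >= k) for X ~ Hypergeometric(N total, K successes, n draws)."""
    upper = min(n, K)
    ways = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return ways / comb(N, n)

# Smallest phenotype-bearing count that satisfies the old threshold
k_min = ceil(old_threshold * n)

# Probability that a random set of n genes reaches that count
p_pass_by_chance = hypergeom_tail(k_min, N, K, n)
print(f"Need >= {k_min} of {n} genes; a random set passes with probability {p_pass_by_chance:.4f}")
```

Because the background rate is more than three times the threshold, nearly every random set of this size passes, which is why clearing the threshold carried no information about enrichment.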
Pre-registered NB02b verdict (added in response to adversarial review)
- SUPPORTED if hypergeometric p-value < 0.05 AND fold-enrichment ≥ 1.5× background.
- MARGINAL if p < 0.05 AND fold-enrichment ≥ 1.0× but < 1.5× background.
- NOT SUPPORTED if p ≥ 0.05 (regardless of fold).
This recalibration applies to BOTH Path A (concordant_strong) and Path B (SspB-buffered).
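For reference, the two quantities in the verdict rules can be written out as follows. The symbols N (genome size), K (phenotype-bearing genes in the genome) and n (set size) match the notebook's variables; k is a label introduced here for the observed phenotype-bearing count in the set (`pb` in the code).

```latex
p \;=\; P(X \ge k) \;=\; \sum_{i=k}^{\min(n,K)} \frac{\binom{K}{i}\,\binom{N-K}{n-i}}{\binom{N}{n}},
\qquad
\text{fold} \;=\; \frac{k}{\,nK/N\,}
```

As a worked instance using the Path A values reported in this notebook (n = 32, k = 17): the expected count is 32 · 1311 / 3943 ≈ 10.64, so the fold is 17 / 10.64 ≈ 1.60.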
import pandas as pd
import numpy as np
from scipy.stats import hypergeom
from pathlib import Path
PROJ = Path('/home/aparkin/BERIL-research-observatory/projects/caulobacter_fur_lipida_loss')
DATA = PROJ / 'data'
# Reload scoring outputs from NB02
sA = pd.read_csv(DATA / 'NB02_pathA_concordant_strong_scoring.csv')
sB = pd.read_csv(DATA / 'NB02_pathB_buffered_scoring.csv')
# Background rate from NB02 enrichment cell — recompute here for transparency
N = 3943 # total Caulobacter genes in FB
K = 1311 # phenotype-bearing in genome (|t|>4 in ≥2 envelope/iron experiments)
background_rate = K / N
print(f'Background phenotype-bearing rate: {K}/{N} = {background_rate:.4f} ({background_rate:.2%})')
def verdict_table(sub, label):
    """Score one gene set against the genome background (upper-tail hypergeometric test)."""
    n = len(sub)
    pb = int(sub['phenotype_bearing'].sum())
    obs_rate = pb / n if n else float('nan')  # guard against an empty set
    expected = (K * n) / N
    fold = (pb / expected) if expected else float('nan')
    # P(X >= pb): scipy's argument order is (k, M=population, n=successes, N=draws),
    # and sf(pb - 1) is used so that pb itself is included in the tail
    p = hypergeom.sf(pb - 1, N, K, n) if pb > 0 else 1.0
    if p < 0.05 and fold >= 1.5:
        verdict = 'SUPPORTED'
    elif p < 0.05 and fold >= 1.0:
        verdict = 'MARGINAL'
    else:
        verdict = 'NOT SUPPORTED'
    return dict(set=label, set_size=n, observed_pb=pb, expected_pb=round(expected, 2),
                observed_rate=round(obs_rate, 4), background_rate=round(background_rate, 4),
                fold_enrichment=round(fold, 2), hypergeom_p=p, verdict=verdict)
rowA = verdict_table(sA, 'Path A (concordant_strong)')
rowB = verdict_table(sB, 'Path B (SspB-buffered)')
verdict_df = pd.DataFrame([rowA, rowB])
print('\n=== H2 PRE-REGISTERED HYPERGEOMETRIC VERDICT (NB02b recalibration) ===\n')
display(verdict_df)
verdict_df.to_csv(DATA / 'NB02b_h2_hypergeometric_verdict.csv', index=False)
Background phenotype-bearing rate: 1311/3943 = 0.3325 (33.25%)

=== H2 PRE-REGISTERED HYPERGEOMETRIC VERDICT (NB02b recalibration) ===
| | set | set_size | observed_pb | expected_pb | observed_rate | background_rate | fold_enrichment | hypergeom_p | verdict |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Path A (concordant_strong) | 32 | 17 | 10.64 | 0.5312 | 0.3325 | 1.60 | 0.015609 | SUPPORTED |
| 1 | Path B (SspB-buffered) | 26 | 9 | 8.64 | 0.3462 | 0.3325 | 1.04 | 0.514686 | NOT SUPPORTED |
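The table values can be cross-checked without SciPy. The following is a minimal sketch that recomputes the expected count, fold and upper-tail p-value from the set sizes and counts shown in the table, using exact integer arithmetic; `hypergeom_tail` and `enrichment` are hypothetical helper names, not functions from the notebook.

```python
from math import comb

N, K = 3943, 1311   # genome size and phenotype-bearing count used in this notebook

def hypergeom_tail(k, N, K, n):
    """Exact P(X >= k) for X ~ Hypergeometric(N total, K successes, n draws)."""
    upper = min(n, K)
    ways = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return ways / comb(N, n)

def enrichment(n, pb):
    """Return (expected count, fold enrichment, upper-tail p) for a set of size n."""
    expected = K * n / N
    return expected, pb / expected, hypergeom_tail(pb, N, K, n)

# Set sizes and observed counts copied from the verdict table above
exp_a, fold_a, p_a = enrichment(32, 17)   # Path A
exp_b, fold_b, p_b = enrichment(26, 9)    # Path B

print(f"Path A: expected={exp_a:.2f}, fold={fold_a:.2f}, p={p_a:.6f}")
print(f"Path B: expected={exp_b:.2f}, fold={fold_b:.2f}, p={p_b:.6f}")
```

If the scoring files are unchanged, these values should agree with the `expected_pb`, `fold_enrichment` and `hypergeom_p` columns to within rounding.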
Interpretation
The recalibrated H2 verdict makes the asymmetry between Path A and Path B explicit:
- Path A (concordant_strong, clean Fur signature): modest enrichment over background (53% observed vs 33% expected; fold 1.60×, p=0.016). Supported, but not overwhelmingly: the fold sits only just above the 1.5× cutoff, which is consistent with a real but modest Fur-released subset that participates in envelope-stress fitness.
- Path B (SspB-buffered, cbb3/fix-rich): no enrichment (34.6% observed vs 33% expected; fold 1.04×, p=0.515). The fitness data do not selectively support the SspB-buffered respiratory chain as more critical than randomly drawn genes under envelope stress.
The "respiratory ATP required" arm of the dual-release switch model (REPORT Mechanistic Synthesis §2) is therefore a working hypothesis grounded in transcript-level data, not an empirically established finding from the fitness ranking.
This recalibration also surfaces a generalizable methodological lesson, recorded in the adversarial reviewer's learned-patterns: pre-registered thresholds for fitness-phenotype tests in BERIL projects should be calibrated against the genome background rate rather than fixed as absolute percentages.
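To gauge how far each path sits from the SUPPORTED boundary, one can ask for the smallest phenotype-bearing count that would satisfy both pre-registered criteria at a given set size. The sketch below is an illustrative sensitivity check, not part of the pre-registered analysis; `min_count_for_supported` is a hypothetical helper, and the set sizes and observed counts are taken from the verdict table.

```python
from math import comb

N, K = 3943, 1311   # genome size and phenotype-bearing count used in this notebook

def hypergeom_tail(k, N, K, n):
    """Exact P(X >= k) for X ~ Hypergeometric(N total, K successes, n draws)."""
    upper = min(n, K)
    ways = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return ways / comb(N, n)

def min_count_for_supported(n, alpha=0.05, min_fold=1.5):
    """Smallest count k with fold >= min_fold and upper-tail p < alpha, or None."""
    expected = K * n / N
    # Both criteria only get easier to meet as k grows, so the first hit is the minimum
    for k in range(n + 1):
        if k / expected >= min_fold and hypergeom_tail(k, N, K, n) < alpha:
            return k
    return None

for label, n, observed in [("Path A", 32, 17), ("Path B", 26, 9)]:
    k_req = min_count_for_supported(n)
    print(f"{label}: observed {observed} of {n}; SUPPORTED requires >= {k_req}")
```

The gap between the observed and required counts gives a rough sense of how many gene-level calls would have to change for either verdict to flip.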
print('Summary:')
print(f' Path A: fold={rowA["fold_enrichment"]}x, p={rowA["hypergeom_p"]:.3e}, verdict={rowA["verdict"]}')
print(f' Path B: fold={rowB["fold_enrichment"]}x, p={rowB["hypergeom_p"]:.3e}, verdict={rowB["verdict"]}')
print()
print('Saved data/NB02b_h2_hypergeometric_verdict.csv')
Summary:
 Path A: fold=1.6x, p=1.561e-02, verdict=SUPPORTED
 Path B: fold=1.04x, p=5.147e-01, verdict=NOT SUPPORTED

Saved data/NB02b_h2_hypergeometric_verdict.csv