Can Flux Balance Analysis Predict Antibiotic Synergy? 107,296 Simulations Say No — and What to Do Instead
A companion to two ESKAPE preprints: standard LP-based flux balance analysis structurally cannot detect synergy between essential gene pairs (107,296 simulations across 3 pathogens, zero synergy found). The fix is to stop using FBA as a synergy calculator and use it as a feature generator — partial-inhibition simulations plus ML to curate drug targets, fully open-source.
This post accompanies two preprints (Research Square, v1, not yet peer-reviewed): the negative result, Why LP-Based FBA Cannot Detect Synergy for Essential Gene Pairs (rs-9398278), and the constructive method, Continuous-Valued Training Data from Genome-Scale Metabolic Models (rs-9374605). Code (open source, MIT): github.com/shoo99/ai-drug-target. This is computational, hypothesis-generating work, not a clinical or production tool.
TL;DR (Quick Answer)
A popular idea in computational drug discovery is to use flux balance analysis (FBA) on genome-scale metabolic models to predict which antibiotic combinations will act synergistically. We tested that idea exhaustively, and it doesn't hold:
- The negative result. Across 107,296 partial-inhibition simulations on three ESKAPE-pathogen metabolic models, standard LP-based FBA detected exactly zero synergistic essential-gene pairs. This isn't a tuning problem — all 39 essential genes show step-function dose-response with α* = 1.00, which is a mathematically predictable consequence of linear-programming sensitivity analysis. FBA-as-a-synergy-calculator is structurally incapable of the task for essential genes.
- The reframe that works. Use FBA as a feature generator, not a synergy calculator. Feeding FBA-derived metabolic features plus pathway identity into a classifier gave a preliminary signal (F1 = 0.847, AUROC = 0.804 under gene-level cross-validation) — above chance, but explicitly not clinical predictive power.
- A method to make that honest. Replace binary knockouts with partial gene inhibition (10–100% flux reduction) to generate 945 continuous-valued training targets, feed them to compact neural nets, and integrate a knowledge graph, local-LLM literature mining, and AlphaFold to curate 29 ESKAPE drug targets that lack approved therapeutics.
- Everything is open. Full pipeline, a Streamlit dashboard, a 40-test suite, and reproducible figures — MIT-licensed at shoo99/ai-drug-target.
The throughline is epistemic: a model that confirms what you already assume isn't evidence. Knowing exactly why FBA fails here is what tells you how to use it correctly.
Background: the AMR clock, and the appeal of FBA
Antimicrobial resistance is projected to cause ~10 million deaths annually by 2050, and antibiotic combinations are one of the few levers against it. Screening every pair experimentally scales as O(n²), so a computational pre-filter is attractive — and FBA is the obvious candidate, because genome-scale metabolic models (GEMs) already predict single-gene essentiality well.
The leap people make is from "FBA predicts essentiality" to "FBA can predict whether knocking out gene A and gene B is synergistic." That leap is where things break.
The experiment: 107,296 simulations, three pathogens
We took three curated GEMs — iML1515 (E. coli K-12 MG1655; 2,712 reactions, 1,516 genes), iYS1720 (S. aureus USA300; 3,357 reactions, 1,707 genes), and iYL1228 (K. pneumoniae MGH 78578; 1,228 genes) — and, for all 39 curated essential genes, swept pairwise inhibition from 0% to 100% in 10% steps, scoring each grid cell for Bliss-independence synergy. That's 107,296 partial-inhibition simulations.
The result was zero synergistic pairs, in every model. And not noisily zero — cleanly, structurally zero. Every essential gene produced a step-function response: growth is flat until inhibition crosses a threshold, then collapses, with a synergy parameter α* = 1.00 across the board.
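To make the scoring step concrete, here is a minimal sketch of a Bliss-independence sweep over an 11 x 11 inhibition grid. It is not the pipeline's code. The function names, the sign convention (positive excess means synergy), the tolerance, and the two growth surfaces are all assumptions for illustration; in the real experiment the growth values would come from FBA solves.

```python
from itertools import product


def bliss_excess(growth_a, growth_b, growth_ab, growth_wt=1.0):
    """Bliss excess on fractional growth: expected minus observed.

    Under Bliss independence the surviving growth fractions multiply.
    Positive values mean the pair grows less than expected (synergy);
    negative values mean it grows more than expected (antagonism).
    """
    fa = growth_a / growth_wt
    fb = growth_b / growth_wt
    fab = growth_ab / growth_wt
    return fa * fb - fab


def sweep_pair(growth_fn, levels=None, tol=1e-6):
    """Score every (a, b) grid cell and return the cells flagged as synergistic."""
    if levels is None:
        levels = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100%
    wt = growth_fn(0.0, 0.0)
    hits = []
    for a, b in product(levels, repeat=2):
        excess = bliss_excess(
            growth_fn(a, 0.0), growth_fn(0.0, b), growth_fn(a, b), wt
        )
        if excess > tol:
            hits.append((a, b, excess))
    return hits


# Hypothetical growth surfaces, for illustration only (not FBA output).
def independent(a, b):
    return (1 - a) * (1 - b)  # exactly Bliss-independent


def synergistic(a, b):
    return (1 - a) * (1 - b) * (1 - a * b)  # extra penalty when both are inhibited


print(len(sweep_pair(independent)), "synergistic cells on the independent surface")
print(len(sweep_pair(synergistic)), "synergistic cells on the synergistic surface")
```

The second surface shows what a positive hit would look like in this convention: every interior cell is flagged. The reported FBA result is the opposite case, with no flagged cells at all.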
Why FBA cannot find synergy here (it's the math, not the biology)
This is the part worth internalizing. FBA solves a linear program: maximize biomass flux subject to stoichiometric and capacity constraints. For an essential gene, partial inhibition does nothing to the optimum until the constraint actually binds, at which point growth drops — a single breakpoint. Combining two such genes under LP gives you the more binding of two breakpoints, not a multiplicative interaction. There is no mechanism in standard LP-FBA for two partial inhibitions to interact super-additively. So "FBA found synergy" in prior work is, for essential genes, largely re-confirming essentiality, not discovering an interaction.
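A toy example can make the breakpoint argument tangible. The sketch below assumes a three-step series pathway (uptake, enzyme 1, enzyme 2, then biomass), so steady state forces one shared flux and the LP optimum is simply the smallest of the three upper bounds. The uptake limit of 10, the capacity values, and the function names are illustrative assumptions, not values from the preprints, and a genome-scale model is far less simple than this.

```python
from itertools import product

UPTAKE = 10.0  # assumed nutrient uptake limit (arbitrary flux units)


def growth(a, b, vmax=1000.0):
    """LP optimum for the toy series pathway.

    maximize v subject to v <= UPTAKE, v <= (1 - a) * vmax, v <= (1 - b) * vmax.
    With a single shared flux, the optimum is the tightest upper bound.
    """
    return min(UPTAKE, (1.0 - a) * vmax, (1.0 - b) * vmax)


def max_bliss_excess(vmax, levels):
    """Largest (expected - observed) fractional growth over the grid."""
    wt = growth(0.0, 0.0, vmax)
    worst = float("-inf")
    for a, b in product(levels, repeat=2):
        fa = growth(a, 0.0, vmax) / wt
        fb = growth(0.0, b, vmax) / wt
        fab = growth(a, b, vmax) / wt
        worst = max(worst, fa * fb - fab)
    return worst


LEVELS = [i / 10 for i in range(11)]

# Loose capacity bound: growth is flat until the bound finally binds.
dose_response = [growth(a, 0.0) for a in LEVELS]
print(dose_response)

# Loose and tight capacities alike: the excess never goes positive.
print(max_bliss_excess(1000.0, LEVELS), max_bliss_excess(20.0, LEVELS))
```

In this toy, the combined fractional growth equals min(f_A, f_B), and for fractions in [0, 1] that minimum is never below the Bliss expectation f_A * f_B. The excess is therefore zero or negative in every cell, which is the "more binding of two breakpoints" behavior in miniature.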
The clean way to say it (and the papers' central distinction): FBA as a synergy calculator fails; FBA as a feature generator might help. Those are different jobs.
The honest ML proof-of-concept (and why the caveats are the point)
As a feature generator, FBA can supply metabolic context to a supervised model. On a small curated set of 45 antibiotic combinations (28 synergistic, 10 antagonistic, 7 additive; 17 target genes, 11 pathways; Bliss scores literature-derived, not measured here), a gradient-boosting classifier using pathway identity + FBA features reached F1 = 0.847, AUROC = 0.804 under gene-level GroupKFold cross-validation (so combinations sharing a gene can't leak between train and test). Permutation testing put it above chance (z = 2.58, p < 0.001).
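The leakage guard can be sketched in a few lines. This is a pure-Python stand-in for the semantics of scikit-learn's GroupKFold, not the authors' code: it assumes each combination carries a single group label (here a hypothetical primary target gene), and the greedy balancing rule and the example labels are illustrative.

```python
from collections import defaultdict


def group_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs so that no group spans train and test."""
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    if len(members) < n_splits:
        raise ValueError("need at least as many groups as folds")

    folds = [[] for _ in range(n_splits)]
    # Greedy balancing: place the largest remaining group in the smallest fold.
    for g in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_splits), key=lambda i: len(folds[i]))
        folds[smallest].extend(members[g])

    for i in range(n_splits):
        test = sorted(folds[i])
        held_out = set(test)
        train = [j for j in range(len(groups)) if j not in held_out]
        yield train, test


# Hypothetical group labels: one primary target gene per combination.
genes = ["rpsL", "rpsL", "folA", "gyrA", "gyrA", "gyrA", "murA", "folA"]
splits = list(group_k_fold(genes, n_splits=3))

for train, test in splits:
    train_genes = {genes[i] for i in train}
    test_genes = {genes[i] for i in test}
    print(sorted(test_genes), "held out; overlap with train:", train_genes & test_genes)
```

The design point is that the split is made over genes rather than over rows, so a model cannot score well simply by recognizing a gene it already saw in training.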
What makes this credible is that the paper then argues against its own result:
- Pathway identity carried more weight than the FBA features (pathway-only F1 = 0.828 vs. FBA-only F1 = 0.807) — so FBA isn't doing most of the work.
- Remove ribosome-targeting combinations and AUROC falls to 0.627 (near random) — the signal is largely one pathway subgroup, not a general synergy rule.
- 71% of combinations had at least one gene unmapped in iML1515, and a "gene-mapping-status" artifact ranked as the second most important feature — i.e., the model partly exploited a data-coverage artifact.
- And the explicit caveat: above-chance signal in a small, biased n = 45 set "does not imply clinical predictive power."
This is what a useful negative-leaning result looks like — reported with the knife pointed at itself.
The constructive half: continuous training data from partial inhibition
The companion method paper turns the "FBA as feature generator" idea into a reusable pipeline that addresses three gaps usually left open: knockout-only FBA gives binary phenotypes (useless for regression), there are no gene-level toxicity datasets, and pipelines rarely report negative validation.
- Partial-inhibition sampling. Applying 10–100% flux reduction to a mixed set of essential and non-essential genes (39 + 30 of iML1515's 1,516 genes) yields 945 continuous-valued FBA simulation results to use as regression targets. These are explicitly framed as training targets, not drug-response predictions.
- Compact networks. A subsystem-structured ANN that mirrors metabolic organization cuts parameters by 61.5% versus a fully-connected baseline, and a dual-head ANN jointly regresses potency and toxicity.
- An honest toxicity heuristic. With no labeled toxicity data, the pipeline falls back on a multi-evidence score (sequence homology 35%, pathway overlap 30%, conservation 20%, cross-reactivity 15%). The weights are flagged as an initial proposal pending experimental calibration, not a validated model.
- Four-way integration. The FBA-trained models are combined with a Neo4j knowledge graph, local-LLM literature mining (46% effective precision, reported rather than hidden), and AlphaFold structural analysis.
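As a rough illustration of the toxicity heuristic, the sketch below reads the four percentages as weights in a linear weighted sum. That reading is an assumption, as are the function name, the requirement that each evidence value is normalized to [0, 1] with higher meaning more human-toxicity risk, and the example values. Only the four weights come from the text above.

```python
# Weights as stated in the text; everything else here is an assumption.
WEIGHTS = {
    "sequence_homology": 0.35,
    "pathway_overlap": 0.30,
    "conservation": 0.20,
    "cross_reactivity": 0.15,
}


def toxicity_score(evidence, weights=WEIGHTS):
    """Weighted sum of evidence values, each assumed to lie in [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(evidence)
    if missing:
        raise KeyError(f"missing evidence: {sorted(missing)}")
    for name in weights:
        if not 0.0 <= evidence[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return sum(weights[name] * evidence[name] for name in weights)


# Hypothetical evidence for one candidate target (illustrative values only).
example = {
    "sequence_homology": 0.30,
    "pathway_overlap": 0.50,
    "conservation": 0.40,
    "cross_reactivity": 0.20,
}
print(round(toxicity_score(example), 3))
```

Keeping the weights in one dictionary makes the eventual calibration step a data change rather than a code change, which fits their status as an initial proposal.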
Applied across the three ESKAPE GEMs, the pipeline curates 29 targets that lack approved therapeutics. A sequence-homology audit found 11 of 21 assessed targets have no detectable human homolog (good for selectivity), while folA shows ~30% identity to human DHFR2 — consistent with the known trimethoprim cross-reactivity, a built-in sanity check that the audit behaves.
Why this matters
- For modelers: before you read synergy out of an FBA combination screen, ask whether LP can even represent the interaction. For essential genes, it can't — you're measuring essentiality twice.
- For ML-on-biology: the value here isn't a leaderboard number; it's the discipline — gene-level splits to stop leakage, ablations that expose where the signal really comes from, and naming the data artifact your model latched onto.
- For the field's hygiene: publishing the negative result and the negative validation is the contribution. Continuous-valued partial-inhibition data and the compact ANNs are reusable parts, not a finished drug-discovery oracle.
Honest Limitations
- Preprints, not peer-reviewed. Provisional.
- Synergy labels are literature-derived, from heterogeneous conditions — not measured in this work.
- The ML signal is small and biased (n = 45), pathway-driven, and partly artifact-driven; it is not evidence of clinical synergy prediction.
- The toxicity heuristic is unvalidated — its weights await experimental calibration.
- FBA-derived targets are hypotheses. Curation narrows a list; it does not confirm a drug.
FAQ
Q: Does this mean FBA is useless for drug discovery?
No. It means FBA-as-a-synergy-calculator fails for essential gene pairs (a structural LP limitation, shown across 107,296 simulations). FBA-as-a-feature-generator — supplying metabolic context to ML and to target curation — is still useful. The two roles are different.
Q: Why can't linear-programming FBA capture synergy?
For an essential gene, partial inhibition has no effect until the constraint binds, giving a step-function with a single breakpoint. Combining two such genes under LP yields the more-binding breakpoint, not a super-additive interaction — so there's no mechanism for synergy to emerge. Capturing it would need non-LP formulations (e.g., kinetic or regulatory models).
Q: Is the ML model ready to pick antibiotic combinations?
No. It shows above-chance signal on 45 literature-curated combinations, but ablations show the signal is largely one pathway subgroup plus a data artifact. Treat it as a proof-of-concept and a cautionary tale, not a predictor.
Q: Can I reproduce all of this?
Yes — the complete platform (pipelines, a 10-tab Streamlit dashboard, a 40-test suite, and figure-regeneration scripts) is MIT-licensed at github.com/shoo99/ai-drug-target. The metabolic models come from BiGG Models.
Resources
- Preprint 1 — the negative result (rs-9398278): https://doi.org/10.21203/rs.3.rs-9398278/v1
- Preprint 2 — the method (rs-9374605): https://doi.org/10.21203/rs.3.rs-9374605/v1
- Code (open source, MIT): github.com/shoo99/ai-drug-target
- Metabolic models: BiGG Models (iML1515, iYS1720, iYL1228)
- Related on this site: Building ARIA, an LLM-driven RNA-seq framework