Can Flux Balance Analysis Predict Antibiotic Synergy? 107,296 Simulations Say No — and What to Do Instead

Modeling metabolism to find antibiotic targets

Companion to two preprints (Research Square, v1, not yet peer-reviewed): the negative result, "Why LP-Based FBA Cannot Detect Synergy for Essential Gene Pairs" (rs-9398278), and the constructive method, "Continuous-Valued Training Data from Genome-Scale Metabolic Models" (rs-9374605). Code (open source, MIT): github.com/shoo99/ai-drug-target. This is computational, hypothesis-generating work, not a clinical or production tool.

TL;DR (Quick Answer)

A popular idea in computational drug discovery is to use flux balance analysis (FBA) on genome-scale metabolic models to predict which antibiotic combinations will act synergistically. We tested that idea exhaustively for essential gene pairs, and it doesn't hold:

The negative result. Across 107,296 partial-inhibition simulations on three ESKAPE-pathogen metabolic models, standard LP-based FBA detected exactly zero synergistic essential-gene pairs. This isn't a tuning problem — all 39 essential genes show step-function dose-response with α* = 1.00, which is a mathematically predictable consequence of linear-programming sensitivity analysis. FBA-as-a-synergy-calculator is structurally incapable of the task for essential genes.
The reframe that works. Use FBA as a feature generator, not a synergy calculator. Feeding FBA-derived metabolic features plus pathway identity into a classifier gave a preliminary signal (F1 = 0.847, AUROC = 0.804 under gene-level cross-validation) — above chance, but explicitly not clinical predictive power.
A method to make that honest. Replace binary knockouts with partial gene inhibition (10–100% flux reduction) to generate 945 continuous-valued training targets, feed them to compact neural nets, and integrate a knowledge graph, local-LLM literature mining, and AlphaFold to curate 29 ESKAPE drug targets that lack approved therapeutics.
Everything is open. Full pipeline, a Streamlit dashboard, a 40-test suite, and reproducible figures, all MIT-licensed at github.com/shoo99/ai-drug-target.

The throughline is epistemic: a model that confirms what you already assume isn't evidence. Knowing exactly why FBA fails here is what tells you how to use it correctly.

Background: the AMR clock, and the appeal of FBA

Antimicrobial resistance is projected to cause ~10 million deaths annually by 2050, and antibiotic combinations are one of the few levers against it. Screening every pair experimentally scales as O(n²), so a computational pre-filter is attractive — and FBA is the obvious candidate, because genome-scale metabolic models (GEMs) already predict single-gene essentiality well.

The leap people make is from "FBA predicts essentiality" to "FBA can predict whether inhibiting gene A and gene B together is synergistic." That leap is where things break.

The experiment: 107,296 simulations, three pathogens

We took three curated GEMs: iML1515 (E. coli K-12 MG1655; 2,712 reactions, 1,516 genes), iYS1720 (S. aureus USA300; 3,357 reactions, 1,707 genes), and iYL1228 (K. pneumoniae MGH 78578; 1,228 genes). For all 39 curated essential genes, we swept pairwise inhibition from 0% to 100% in 10% steps and scored each grid cell for synergy against the Bliss-independence expectation. That comes to 107,296 partial-inhibition simulations.

The result was zero synergistic pairs, in every model. And not noisily zero — cleanly, structurally zero. Every essential gene produced a step-function response: growth is flat until inhibition crosses a threshold, then collapses, with a synergy parameter α* = 1.00 across the board.
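
To make "scoring each grid cell for Bliss-independence synergy" concrete, here is a minimal sketch of how such a score could be computed. It is not the preprint's code: the function names, the `growth_fn` callback, and the 0.05 call threshold are all assumptions made for illustration. Under Bliss independence the expected relative growth of a combination is the product of the two single-agent relative growths, and synergy is called when the observed growth falls below that expectation.

```python
def bliss_excess(growth_a, growth_b, growth_ab, wild_type):
    """Bliss-expected minus observed relative growth; > 0 suggests synergy."""
    if wild_type <= 0:
        raise ValueError("wild-type growth must be positive")
    g_a = growth_a / wild_type    # relative growth with only A inhibited
    g_b = growth_b / wild_type    # relative growth with only B inhibited
    g_ab = growth_ab / wild_type  # relative growth with both inhibited
    return g_a * g_b - g_ab


def score_grid(growth_fn, levels, threshold=0.05):
    """Return (a, b, excess) for every grid cell whose excess beats the threshold.

    growth_fn(a, b) is a hypothetical callback returning the simulated growth
    rate when gene A is inhibited by fraction a and gene B by fraction b.
    The threshold is an arbitrary illustration value.
    """
    wild_type = growth_fn(0.0, 0.0)
    hits = []
    for a in levels:
        for b in levels:
            excess = bliss_excess(
                growth_fn(a, 0.0), growth_fn(0.0, b), growth_fn(a, b), wild_type
            )
            if excess > threshold:
                hits.append((a, b, excess))
    return hits


LEVELS = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100% inhibition

# A purely multiplicative response is exactly Bliss-independent: no hits.
independent = score_grid(lambda a, b: (1 - a) * (1 - b), LEVELS)
print(len(independent))
```

In a real screen, `growth_fn` would wrap an FBA solve with scaled reaction bounds. The scoring logic itself stays independent of the solver.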

Why FBA cannot find synergy here (it's the math, not the biology)

This is the part worth internalizing. FBA solves a linear program: maximize biomass flux subject to stoichiometric and capacity constraints. For an essential gene, partial inhibition does nothing to the optimum until the constraint actually binds, at which point growth drops at a single breakpoint. Combining two such genes under LP gives you the more binding of the two breakpoints, not a multiplicative interaction. Bliss independence, by contrast, expects the combined relative growth to be the product of the two single-gene relative growths, and that product can never exceed the smaller of the two. A combined response that simply tracks the more binding constraint never falls below the Bliss expectation, so there is no mechanism in standard LP-FBA for two partial inhibitions to interact super-additively. So "FBA found synergy" in prior work is, for essential genes, largely re-confirming essentiality, not discovering an interaction.
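
The breakpoint argument can be checked on a deliberately tiny example. The sketch below is a toy three-step chain, not one of the genome-scale models, and every capacity value in it is made up. Because a linear chain at steady state carries a single flux, the LP optimum is just the tightest upper bound, so no solver is needed. The toy's decline past the breakpoint is linear. That is a property of this simplified chain, not a claim about the exact curve shape reported for the real models.

```python
# Toy linear pathway: uptake -> enzyme A -> enzyme B -> biomass.
# At steady state every step carries the same flux v, so the LP is
#   maximize v  subject to  0 <= v <= uptake,
#                           v <= cap_a * (1 - inhib_a),
#                           v <= cap_b * (1 - inhib_b)
# and its optimum is the tightest upper bound.
# All capacities are made-up illustration values.

def toy_growth(inhib_a, inhib_b, uptake=10.0, cap_a=25.0, cap_b=20.0):
    """Optimal biomass flux of the toy LP for fractional inhibitions in [0, 1]."""
    return max(0.0, min(uptake, cap_a * (1.0 - inhib_a), cap_b * (1.0 - inhib_b)))


def binding_threshold(capacity, uptake=10.0):
    """Inhibition fraction at which an enzyme's capacity bound starts to bind."""
    return 1.0 - uptake / capacity


LEVELS = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100% inhibition
WILD_TYPE = toy_growth(0.0, 0.0)


def bliss_excess(inhib_a, inhib_b):
    """Bliss-expected minus observed relative growth; > 0 would mean synergy."""
    g_a = toy_growth(inhib_a, 0.0) / WILD_TYPE
    g_b = toy_growth(0.0, inhib_b) / WILD_TYPE
    g_ab = toy_growth(inhib_a, inhib_b) / WILD_TYPE
    return g_a * g_b - g_ab


# Single-gene dose-response for enzyme A: flat, then falling past the threshold.
curve_a = [(a, toy_growth(a, 0.0)) for a in LEVELS]

# Largest Bliss excess anywhere on the 11 x 11 grid.
max_excess = max(bliss_excess(a, b) for a in LEVELS for b in LEVELS)

print(curve_a)
print("largest Bliss excess:", max_excess)
```

The combined optimum equals the smaller of the two single-gene optima, and the Bliss expectation is their product, so the excess is never positive on this grid. Changing the capacities moves the breakpoints but not that conclusion.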

The clean way to say it (and the papers' central distinction): FBA as a synergy calculator fails; FBA as a feature generator might help. Those are different jobs.

The honest ML proof-of-concept (and why the caveats are the point)

As a feature generator, FBA can supply metabolic context to a supervised model. On a small curated set of 45 antibiotic combinations (28 synergistic, 10 antagonistic, 7 additive; 17 target genes, 11 pathways; Bliss scores literature-derived, not measured here), a gradient-boosting classifier using pathway identity + FBA features reached F1 = 0.847, AUROC = 0.804 under gene-level GroupKFold cross-validation (so combinations sharing a gene can't leak between train and test). Permutation testing put it above chance (z = 2.58, p < 0.001).
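
One practical wrinkle with gene-level splitting is that each combination touches more than one gene, while a grouped splitter such as scikit-learn's GroupKFold expects a single group label per sample. The sketch below shows one way to derive such labels, by merging genes that co-occur in any combination, and then builds folds by hand. This is an assumption about how the grouping could be done, not a description of the preprint's code, and the gene pairs are illustrative placeholders rather than the curated dataset.

```python
from collections import defaultdict

# Illustrative (gene, gene) target pairs only; not the curated 45-combination set.
COMBOS = [
    ("folA", "folP"),
    ("folP", "thyA"),
    ("rpoB", "gyrA"),
    ("murA", "ftsI"),
    ("rpsL", "rplC"),
    ("gyrA", "parC"),
]


def gene_groups(combos):
    """One group label per combination; combos sharing any gene share a label."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for gene_a, gene_b in combos:
        parent[find(gene_a)] = find(gene_b)  # union the two genes' components
    return [find(gene_a) for gene_a, _ in combos]


def group_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx); each group lands in exactly one test fold."""
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    # Largest groups first, each into the currently smallest fold.
    for label in sorted(members, key=lambda g: -len(members[g])):
        min(folds, key=len).extend(members[label])
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield train, sorted(test)


GROUPS = gene_groups(COMBOS)
for train_idx, test_idx in group_k_fold(GROUPS, n_splits=3):
    print("train:", train_idx, "test:", test_idx)
```

A caveat with this merging strategy: if a few genes appear in many combinations, the components can collapse into one large group and leave folds badly unbalanced, which is itself a useful diagnostic on a dataset this small.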

What makes this credible is that the paper then argues against its own result:

Pathway identity carried more weight than the FBA features (pathway-only F1 = 0.828 vs. FBA-only F1 = 0.807) — so FBA isn't doing most of the work.
Remove ribosome-targeting combinations and AUROC falls to 0.627 (not far above the 0.5 expected by chance): the signal is largely one pathway subgroup, not a general synergy rule.
71% of combinations had at least one gene unmapped in iML1515, and a "gene-mapping-status" feature ranked second in importance; in other words, the model partly exploited a data-coverage artifact.
And the explicit caveat: above-chance signal in a small, biased n = 45 set "does not imply clinical predictive power."
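
The subgroup ablation described above is easy to run on any classifier's held-out scores. Here is a minimal sketch, assuming binary labels (1 = synergistic) and one pathway tag per combination. The labels, scores, and tags below are synthetic and chosen only to show the mechanics. They are not the preprint's data and do not reproduce its numbers.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_without(labels, scores, pathways, dropped):
    """Recompute AUROC after removing every sample tagged with one pathway."""
    keep = [i for i, tag in enumerate(pathways) if tag != dropped]
    return auroc([labels[i] for i in keep], [scores[i] for i in keep])


# Synthetic held-out predictions: the "ribosome" rows are perfectly separated,
# the remaining rows are not.
LABELS = [1, 1, 1, 0, 1, 0, 1, 0]
SCORES = [0.90, 0.80, 0.85, 0.10, 0.40, 0.60, 0.50, 0.35]
PATHWAYS = ["ribosome"] * 4 + ["folate", "folate", "cell_wall", "cell_wall"]

full = auroc(LABELS, SCORES)
ablated = auroc_without(LABELS, SCORES, PATHWAYS, "ribosome")
print("all combinations:", round(full, 3))
print("without ribosome:", round(ablated, 3))
```

A large gap between the two numbers is the pattern to look for: it says the headline metric is carried by one subgroup instead of a rule that generalizes across pathways.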

This is what a useful negative-leaning result looks like — reported with the knife pointed at itself.

The constructive half: continuous training data from partial inhibition

The companion method paper turns the "FBA as feature generator" idea into a reusable pipeline that addresses three gaps people usually skip over: knockout-only FBA gives binary phenotypes (useless for regression), there are no gene-level toxicity datasets, and pipelines rarely report negative validation.

Partial-inhibition sampling. Applying 10–100% flux reduction to a mixed set of essential and non-essential genes (39 + 30 of iML1515's 1,516 genes) yields 945 continuous-valued FBA simulations to use as regression targets — explicitly framed as training targets, not drug-response predictions.
Compact networks. A subsystem-structured ANN that mirrors metabolic organization cuts parameters by 61.5% versus a fully-connected baseline, and a dual-head ANN jointly regresses potency and toxicity.
An honest toxicity heuristic. With no labeled toxicity data, the pipeline uses a multi-evidence score (sequence homology 35%, pathway overlap 30%, conservation 20%, cross-reactivity 15%), with the weights flagged as an initial proposal pending experimental calibration, not a validated model.
Four-way integration. The FBA-derived outputs are integrated with a Neo4j knowledge graph, local-LLM literature mining (46% effective precision, reported rather than hidden), and AlphaFold structural analysis.
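
The toxicity heuristic in the list above is simple enough to write down. The sketch below uses the four stated weights. Everything else is an assumption for illustration: that each evidence term is already scaled to [0, 1], that the terms combine as a plain weighted sum, and the key names themselves. The source flags the weights as uncalibrated, so the output is a ranking aid at most.

```python
# Weights as stated in the text; flagged there as an initial, uncalibrated proposal.
TOXICITY_WEIGHTS = {
    "sequence_homology": 0.35,
    "pathway_overlap": 0.30,
    "conservation": 0.20,
    "cross_reactivity": 0.15,
}


def toxicity_score(evidence):
    """Weighted sum of four evidence terms, each assumed to lie in [0, 1].

    Higher means more predicted risk of hitting the human host. The linear
    form and the [0, 1] scaling are assumptions of this sketch.
    """
    missing = set(TOXICITY_WEIGHTS) - set(evidence)
    if missing:
        raise KeyError(f"missing evidence terms: {sorted(missing)}")
    for name in TOXICITY_WEIGHTS:
        if not 0.0 <= evidence[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {evidence[name]}")
    return sum(weight * evidence[name] for name, weight in TOXICITY_WEIGHTS.items())


# Hypothetical target with moderate homology and no other evidence.
example = toxicity_score(
    {
        "sequence_homology": 0.30,
        "pathway_overlap": 0.0,
        "conservation": 0.0,
        "cross_reactivity": 0.0,
    }
)
print(round(example, 3))
```

Keeping the weights in one dictionary makes the later calibration step a data change, not a code change.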

Applied across the three ESKAPE GEMs, the pipeline curates 29 targets that lack approved therapeutics. A sequence-homology audit found 11 of 21 assessed targets have no detectable human homolog (good for selectivity), while folA shows ~30% identity to human DHFR2 — consistent with the known trimethoprim cross-reactivity, a built-in sanity check that the audit behaves.

Why this matters

For modelers: before you read synergy out of an FBA combination screen, ask whether LP can even represent the interaction. For essential genes, it can't — you're measuring essentiality twice.
For ML-on-biology: the value here isn't a leaderboard number; it's the discipline — gene-level splits to stop leakage, ablations that expose where the signal really comes from, and naming the data artifact your model latched onto.
For the field's hygiene: publishing the negative result and the negative validation is the contribution. Continuous-valued partial-inhibition data and the compact ANNs are reusable parts, not a finished drug-discovery oracle.

Honest Limitations

Preprints, not peer-reviewed. Provisional.
Synergy labels are literature-derived, from heterogeneous conditions — not measured in this work.
The ML signal is small and biased (n = 45), pathway-driven, and partly artifact-driven; it is not evidence of clinical synergy prediction.
The toxicity heuristic is unvalidated — its weights await experimental calibration.
FBA-derived targets are hypotheses. Curation narrows a list; it does not confirm a drug.

FAQ

Q: Does this mean FBA is useless for drug discovery?

No. It means FBA-as-a-synergy-calculator fails for essential gene pairs (a structural LP limitation, shown across 107,296 simulations). FBA-as-a-feature-generator — supplying metabolic context to ML and to target curation — is still useful. The two roles are different.

Q: Why can't linear-programming FBA capture synergy?

For an essential gene, partial inhibition has no effect until the constraint binds, giving a step-function with a single breakpoint. Combining two such genes under LP yields the more-binding breakpoint, not a super-additive interaction — so there's no mechanism for synergy to emerge. Capturing it would need non-LP formulations (e.g., kinetic or regulatory models).

Q: Is the ML model ready to pick antibiotic combinations?

No. It shows above-chance signal on 45 literature-curated combinations, but ablations show the signal is largely one pathway subgroup plus a data artifact. Treat it as a proof-of-concept and a cautionary tale, not a predictor.

Q: Can I reproduce all of this?

Yes — the complete platform (pipelines, a 10-tab Streamlit dashboard, a 40-test suite, and figure-regeneration scripts) is MIT-licensed at github.com/shoo99/ai-drug-target. The metabolic models come from BiGG Models.

Resources

Preprint 1 — the negative result (rs-9398278): https://doi.org/10.21203/rs.3.rs-9398278/v1
Preprint 2 — the method (rs-9374605): https://doi.org/10.21203/rs.3.rs-9374605/v1
Code (open source, MIT): github.com/shoo99/ai-drug-target
Metabolic models: BiGG Models (iML1515, iYS1720, iYL1228)
Related on this site: Building ARIA, an LLM-driven RNA-seq framework
