AI & ML Advanced By Samson Tanimawo, PhD Published Dec 25, 2026 5 min read

AI for Scientific Discovery

AlphaFold, AlphaProof, GNoME. AI is now a tool in scientific workflows, not just demos. Here is what works and what the next frontier looks like.

What it has actually delivered

By 2026, AI has produced specific scientific contributions: AlphaFold's protein structure predictions; novel material discoveries (GNoME at Google); accelerated drug discovery hit-finding; mathematical theorem proving (DeepMind's AlphaProof). The contributions are real, but the framing of "AI will solve science" is overhyped: AI augments scientists; it doesn't replace them.

The AlphaFold case. Predicted structures for ~200M proteins. Used in tens of thousands of biomedical research projects. It has accelerated drug discovery, structural biology, and basic biological understanding. Probably the highest-impact AI-for-science contribution to date.

The GNoME case. Predicted ~2.2 million new inorganic crystal structures, of which roughly 380,000 are predicted to be stable. Some are candidates for batteries, semiconductors, catalysts. Many are research curiosities. The volume of predictions is real; their specific commercial impact will play out over decades.

The AlphaProof case. In 2024, AlphaProof (together with AlphaGeometry 2) solved International Mathematical Olympiad problems at silver-medal level. That demonstrates real mathematical reasoning. The transfer to research-level math is partial; AI has helped formalise some proofs but hasn't yet produced novel research-level results independently.

The honest framing. AI-for-science delivers in narrow domains where data is abundant and outcomes are verifiable. It augments scientists; it rarely replaces them. The "AI will cure cancer / solve aging / discover new physics autonomously" framing is too aggressive; "AI accelerates research that was already happening" is more accurate.

The architecture pattern

Successful AI-for-science systems have a common pattern: large pretrained model + domain-specific fine-tuning + verifier loop. The verifier is what closes the loop: a predicted protein structure is checked by physical simulation or against experiment; a predicted material by DFT calculation; a mathematical theorem by a formal proof checker. AI proposes; verifier disposes.

The pretrained-model layer. Foundation models (transformers, GNNs, equivariant networks) trained on broad domain data. The pretraining provides broad domain representations, and the architecture supplies useful inductive biases. AlphaFold uses transformers on multiple sequence alignments; GNoME uses GNNs on crystal structures. Different domains, similar architectural family.

The domain-fine-tune layer. Specialise the foundation model for the specific scientific question. AlphaFold fine-tunes on protein-specific tasks. The fine-tune brings domain expertise into the model.

The verifier loop. The model proposes solutions. The verifier checks them. Bad proposals are rejected; good ones are kept. Over many iterations, the model improves its proposals. The verifier is the quality control that prevents the model from generating plausible-but-wrong results.
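
The propose-and-verify step can be sketched in a few lines. This is a toy under stated assumptions, not how any of the systems above are implemented: `propose` is a hypothetical stand-in for a generative model, and `verify` is a hypothetical stand-in for a trusted checker such as a DFT run or a proof checker (here it only tests whether an integer is a perfect square). The sketch shows the filtering only; it does not feed results back into the proposer.

```python
import math
import random
from typing import List


def propose(rng: random.Random, n: int) -> List[int]:
    """Hypothetical stand-in for a generative model: emit n candidate integers."""
    return [rng.randint(1, 10_000) for _ in range(n)]


def verify(candidate: int) -> bool:
    """Hypothetical stand-in for a trusted checker (DFT run, proof checker).

    Toy criterion: the candidate must be a perfect square.
    """
    root = math.isqrt(candidate)
    return root * root == candidate


def propose_and_verify(rounds: int, per_round: int, seed: int = 0) -> List[int]:
    """Run the loop: the model proposes, the verifier disposes."""
    rng = random.Random(seed)
    kept: List[int] = []
    for _ in range(rounds):
        for candidate in propose(rng, per_round):
            if verify(candidate):  # bad proposals are dropped here
                kept.append(candidate)
    return kept


kept = propose_and_verify(rounds=5, per_round=200)
print(f"{len(kept)} of 1000 proposals survived verification")
```

The design point is that `verify` is exact and independent of the proposer, so nothing plausible-but-wrong can reach `kept` no matter how poor the proposals are.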

The data flywheel. Each verified solution becomes training data for the next iteration. The system improves over time. AlphaFold's later versions trained on its earlier predictions, refining structure quality. The flywheel is what makes domain-specific systems improve fast.
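
A minimal sketch of the flywheel, assuming a deliberately simple proposer: a Gaussian refit each round to the verified examples collected so far. The names `TARGET`, `TOLERANCE`, `verify`, and `run_flywheel` are all hypothetical, and the seed values stand in for a handful of experimentally known good examples. Real systems retrain a neural model rather than refitting a mean and spread.

```python
import random
import statistics
from typing import List, Tuple

# Hidden ground truth, known only to the toy verifier (hypothetical values).
TARGET, TOLERANCE = 42.0, 5.0


def verify(x: float) -> bool:
    """Toy verifier: accept proposals close enough to the hidden target."""
    return abs(x - TARGET) <= TOLERANCE


def run_flywheel(
    seed_data: List[float], rounds: int = 5, per_round: int = 200, seed: int = 0
) -> Tuple[List[float], List[Tuple[int, int, int]]]:
    """Refit the proposer on verified data each round; return data and history."""
    rng = random.Random(seed)
    train = list(seed_data)
    history = []  # (round index, accepted this round, training-set size)
    for r in range(rounds):
        # "Retrain": fit the proposer to everything verified so far.
        mu = statistics.mean(train)
        sigma = max(statistics.pstdev(train), 0.5)  # floor avoids total collapse
        proposals = [rng.gauss(mu, sigma) for _ in range(per_round)]
        verified = [x for x in proposals if verify(x)]
        train.extend(verified)  # only verified results re-enter the training set
        history.append((r, len(verified), len(train)))
    return train, history


train, history = run_flywheel([38.0, 44.0, 47.0])
for r, accepted, size in history:
    print(f"round {r}: accepted {accepted}/200, training set size {size}")
```

The floor on `sigma` is a reminder of a known risk: a model trained only on its own accepted outputs tends to narrow its proposals over time, so diversity has to be protected explicitly.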

Limits

AI-for-science doesn't yet do paradigm-shifting science. It optimises within existing frameworks; it doesn't propose new frameworks. The "AI Einstein", a system that produces a new conceptual breakthrough comparable to relativity or quantum mechanics, is not on the near horizon. Current systems are powerful within paradigms, not paradigm-breakers.

The within-paradigm vs paradigm-shift distinction. Within-paradigm work: better proteins, better materials, better algorithms. Paradigm shift: a new theory of physics, a new biology framework. AI excels at the first; the second is human work.

The novelty problem. AI generally produces variations of training data. Truly novel concepts (not interpolations of existing concepts) are hard. Some research domains (theoretical physics, conceptual mathematics) reward novelty more than others (drug discovery, materials).

The verification bottleneck. AI's accelerated proposal generation is bottlenecked by verification. Predicting a new material is fast; synthesising and characterising it in the lab takes months. The end-to-end speed is limited by the slowest stage.
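
The arithmetic behind the bottleneck is simple: end-to-end throughput equals the throughput of the slowest stage. The sketch below uses made-up rates purely for illustration (none of these numbers come from a real pipeline), and the stage names are hypothetical.

```python
from typing import Dict, Tuple

# Candidates each stage can process per month (illustrative numbers only).
stages: Dict[str, int] = {
    "model proposal": 100_000,
    "DFT screening": 2_000,
    "lab synthesis and characterisation": 10,
}


def bottleneck(stage_rates: Dict[str, int]) -> Tuple[str, int]:
    """Return the slowest stage and the end-to-end rate it imposes."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


name, rate = bottleneck(stages)
print(f"end-to-end: {rate} verified candidates/month, limited by {name}")

# Making the proposer 10x faster changes nothing downstream.
faster = dict(stages, **{"model proposal": 1_000_000})
print(bottleneck(faster))
```

Under these assumptions, investment in the proposal model has no effect on delivered results until the lab stage is sped up.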

The reproducibility issue. AI-for-science contributions sometimes don't reproduce. Common causes are sample-selection effects, evaluation leakage, and overfitting to specific benchmarks. The peer-review and replication culture is still adapting; expect a winnowing of overhyped results over the next 3-5 years.
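
Evaluation leakage is the easiest of these to check mechanically. The sketch below flags test records that also appear in the training set after trivial normalisation. It is a minimal first pass with hypothetical helper names (`normalise`, `find_leakage`) and made-up records; it catches exact duplicates only, whereas near-duplicates (for example, highly similar protein sequences) need a similarity-based split.

```python
from typing import Iterable, List


def normalise(record: str) -> str:
    """Strip whitespace and case so trivially different copies compare equal."""
    return "".join(record.split()).upper()


def find_leakage(train: Iterable[str], test: Iterable[str]) -> List[str]:
    """Return test records whose normalised form also occurs in train."""
    seen = {normalise(r) for r in train}
    return [r for r in test if normalise(r) in seen]


# Made-up example records.
train_set = ["MKTAYIAK", "gavl ke"]
test_set = ["mktayiak", "PLVV"]

leaked = find_leakage(train_set, test_set)
print(f"{len(leaked)} of {len(test_set)} test records also appear in train: {leaked}")
```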

The interpretability gap. AI may discover patterns it can't explain. A material works because of some pattern; the explanation is hidden in network weights. Scientists eventually want explanations, not just predictions. Interpretability research helps, but the problem isn't fully solved.

Next frontier

Direction 1: agents that conduct experiments. Read literature, propose hypotheses, design experiments, execute them (in simulation or via lab automation), interpret results, iterate. Some early work; production deployments emerging in 2026-2027. The "AI research scientist" framing.

Direction 2: integration with experimental robotics. Cloud labs (Strateos, Emerald Cloud Lab) execute experiments via API. AI plans experiments; lab robots execute; results feed back. Closed-loop discovery without human bottlenecks for routine experiments. Production deployments accelerating.
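
The plan, execute, feed-back cycle can be sketched as a small optimisation loop. Everything here is an assumption for illustration: `run_experiment` is a hypothetical stand-in for a cloud-lab call or a simulation (it is not the Strateos or Emerald API), the response surface is invented, and the planner is a basic step-halving search rather than a learned model.

```python
from typing import Tuple


def run_experiment(temperature_c: float) -> float:
    """Hypothetical stand-in for a cloud-lab API call or a simulation.

    Toy response surface: yield peaks at 60 degrees C.
    """
    return 100.0 - (temperature_c - 60.0) ** 2 / 10.0


def closed_loop(start: float, step: float, budget: int) -> Tuple[float, float, int]:
    """Plan, execute, interpret, iterate until the step is small or budget is spent."""
    best_t, best_y, runs = start, run_experiment(start), 1
    while runs < budget and step >= 0.5:
        improved = False
        for candidate in (best_t + step, best_t - step):  # plan
            y = run_experiment(candidate)                 # execute
            runs += 1
            if y > best_y:                                # interpret
                best_t, best_y, improved = candidate, y, True
                break
        if not improved:
            step /= 2  # no gain in either direction: refine the search
    return best_t, best_y, runs


best_t, best_y, runs = closed_loop(start=20.0, step=16.0, budget=30)
print(f"best temperature {best_t} C, yield {best_y}, after {runs} experiments")
```

The explicit `budget` matters in practice: each call to the real equivalent of `run_experiment` costs reagents and instrument time, so the loop needs a hard stop that does not depend on convergence.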

Direction 3: theorem proving and formal verification. AI helping to formalise mathematics, prove theorems, verify software correctness. AlphaProof and follow-ons demonstrate progress. The combination of natural-language math reasoning with formal verification is producing genuinely useful tools.

Direction 4: cross-disciplinary connections. AI as a tool that finds connections humans miss between disparate fields. Patent-data analysis surfaces unknown technical relationships; literature mining finds hypotheses spanning multiple disciplines. Productivity tool for researchers.

Direction 5: simulation acceleration. Replace expensive simulations (DFT, molecular dynamics, climate models) with neural-network surrogates that are 100-10,000x faster. Each accelerated simulation unlocks scientific work that wasn't feasible. The research-velocity multiplier is substantial.
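
The surrogate idea can be shown without a neural network. The sketch below swaps in piecewise-linear interpolation over precomputed samples, which is a much simpler technique than the learned surrogates described above but follows the same workflow: pay for the expensive solver once on a grid, answer later queries cheaply, and spot-check the cheap answers against the real solver. `expensive_simulation` is a hypothetical stand-in (it is just `sin`), and `build_surrogate` is an illustrative name.

```python
import bisect
import math
from typing import Callable


def expensive_simulation(x: float) -> float:
    """Hypothetical stand-in for a costly solver; here it is just sin(x)."""
    return math.sin(x)


def build_surrogate(
    f: Callable[[float], float], lo: float, hi: float, n: int
) -> Callable[[float], float]:
    """Sample f on n grid points once, then answer queries by interpolation."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]  # the only calls to the expensive function

    def surrogate(x: float) -> float:
        if not lo <= x <= hi:
            # A surrogate is only trustworthy inside the range it was fitted on.
            raise ValueError("query outside the sampled range")
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return surrogate


surrogate = build_surrogate(expensive_simulation, 0.0, math.pi, n=101)

# Verification step: spot-check the surrogate against the real solver.
probes = [math.pi * (k + 0.5) / 1000 for k in range(1000)]
max_error = max(abs(surrogate(x) - expensive_simulation(x)) for x in probes)
print(f"max absolute error on spot checks: {max_error:.2e}")
```

The range check is the important design choice: a surrogate interpolates well and extrapolates badly, so queries outside the fitted region should fail loudly or fall back to the full simulation.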

The 5-10 year outlook. By 2030-2035, AI is likely to be a routine tool in most science labs (it's already starting). Some discoveries will be primarily AI-driven. The character of science changes (more proposals, more verification, more cycling) without science losing its human core.

Common antipatterns

Promising AI-driven cures and solutions on aggressive timelines. The hype gap to reality is large. Be careful with timelines.

Skipping verification. Verification is what makes AI-science work. Without it, you produce confident-sounding but wrong predictions.

Measuring by benchmark only. Real-world reproducibility matters more. Track verified contributions, not benchmark numbers.

Treating AI-for-science as a separate field. The integration with traditional methods is key. AI is a tool; combine it with established methods.

What to do this week

Three moves. (1) If you're a scientist, evaluate one AI tool relevant to your domain (AlphaFold for biology, a model benchmarked on Matbench for materials, etc.). Hands-on experience reveals capabilities and limits. (2) If you're building AI-for-science tools, identify the verifier explicitly. The verifier is the difference between useful and hallucinated. (3) Track AI-for-science contributions over the next year. The genuinely novel ones (verified, reproduced, used) will distinguish themselves from hype. Following the actual contributions calibrates expectations.