Literature Synthesis at Scale: From 500 Papers to One Dossier

The Synthesis Problem

Reading 500 papers and forming a coherent view of a field is cognitively demanding in a way that is hard to parallelise. You can split the reading across a team, but integrating 20 people’s notes into a single synthesis is its own problem.

assay.it approaches this differently: instead of summarising each paper independently, it builds a shared evidence structure across the entire corpus, then synthesises from that structure.

Building the Evidence Structure

The process has three phases.

Phase 1 — Entity Resolution

All entities (concepts, compounds, methods, organisms, institutions) are resolved to canonical identifiers. For example, “mTOR inhibitor”, “rapamycin”, and “sirolimus” are recognised as referring to the same class of compounds (rapamycin and sirolimus are two names for one mTOR inhibitor). This allows claims about the same thing to be aggregated regardless of terminology.
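
As a rough illustration of what resolution to canonical identifiers could look like, here is a minimal Python sketch. It is not assay.it's implementation: the alias table, the identifier strings, and the function names resolve and resolve_class are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical alias table: surface form -> canonical identifier.
# The identifier strings are invented for illustration.
ALIASES = {
    "rapamycin": "compound:sirolimus",
    "sirolimus": "compound:sirolimus",
    "mtor inhibitor": "class:mtor-inhibitor",
}

# Hypothetical mapping from a compound to the class it belongs to.
PARENT_CLASS = {"compound:sirolimus": "class:mtor-inhibitor"}


def normalise(mention: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore surface variation."""
    return " ".join(mention.lower().split())


def resolve(mention: str) -> Optional[str]:
    """Return the canonical identifier for a mention, or None if unknown."""
    return ALIASES.get(normalise(mention))


def resolve_class(mention: str) -> Optional[str]:
    """Return the class-level identifier, so claims can be pooled per class."""
    canonical = resolve(mention)
    if canonical is None:
        return None
    return PARENT_CLASS.get(canonical, canonical)


for term in ["mTOR inhibitor", "Rapamycin", "  sirolimus "]:
    print(term.strip(), "->", resolve(term), "/", resolve_class(term))
```

A real resolver would need far more than a lookup table (synonym databases, disambiguation by context), but the shape is the same: many surface forms in, one identifier out.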

Phase 2 — Claim Aggregation

All claims about each entity pair are collected and grouped by type (causal, correlational, definitional). Claims with compatible semantics are merged; conflicting claims are flagged.
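
The grouping and flagging described here can be sketched as follows. This is an assumed data model, not the product's: the Claim fields, the direction field used to decide whether two claims are compatible, and the aggregate function are hypothetical, and the sketch treats entity pairs as ordered.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str    # canonical entity identifier
    obj: str        # canonical entity identifier
    kind: str       # "causal", "correlational" or "definitional"
    direction: str  # hypothetical field, e.g. "increases" or "decreases"
    paper: str      # source paper identifier


def aggregate(claims):
    """Group claims by (subject, object, kind), merge claims that agree
    on direction, and flag groups that contain more than one direction."""
    groups = defaultdict(list)
    for claim in claims:
        groups[(claim.subject, claim.obj, claim.kind)].append(claim)

    result = {}
    for key, members in groups.items():
        papers_by_direction = defaultdict(list)
        for claim in members:
            papers_by_direction[claim.direction].append(claim.paper)
        result[key] = {
            "merged": dict(papers_by_direction),
            "conflict": len(papers_by_direction) > 1,
        }
    return result


claims = [
    Claim("compound:sirolimus", "process:autophagy", "causal", "increases", "paper-1"),
    Claim("compound:sirolimus", "process:autophagy", "causal", "increases", "paper-2"),
    Claim("compound:sirolimus", "process:autophagy", "causal", "decreases", "paper-3"),
]
for key, group in aggregate(claims).items():
    print(key, group)
```

Comparing a single direction label is the simplest possible stand-in for "compatible semantics"; real claims would also differ in dose, population, and conditions.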

Phase 3 — Evidence Grading

The aggregated evidence is graded by study design, sample size, replication count, and recency.
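
To make the four grading criteria concrete, here is one way they could be combined into a single score. Everything numeric in this sketch is an assumption chosen for illustration (the design weights, the 40/20/20/20 split, the ten-year half-life for recency); the source does not say how assay.it weights these factors.

```python
import math
from dataclasses import dataclass

# Hypothetical weights per study design; unknown designs get a low default.
DESIGN_WEIGHT = {
    "meta_analysis": 1.0,
    "rct": 0.8,
    "cohort": 0.5,
    "case_report": 0.2,
}


@dataclass
class Evidence:
    design: str
    sample_size: int
    replications: int
    year: int


def grade(evidence: Evidence, current_year: int) -> float:
    """Return a score in [0, 1] from design, size, replication and recency."""
    design = DESIGN_WEIGHT.get(evidence.design, 0.1)
    # Log-scaled sample size, saturating at n = 10,000.
    size = min(1.0, math.log10(max(evidence.sample_size, 1)) / 4)
    # Replication credit saturates at three independent replications.
    replication = min(1.0, evidence.replications / 3)
    # Recency halves every ten years.
    age = max(current_year - evidence.year, 0)
    recency = 0.5 ** (age / 10)
    return 0.4 * design + 0.2 * size + 0.2 * replication + 0.2 * recency


recent_rct = Evidence("rct", sample_size=10_000, replications=3, year=2024)
old_case_report = Evidence("case_report", sample_size=1, replications=0, year=2004)
print(grade(recent_rct, current_year=2024))
print(grade(old_case_report, current_year=2024))
```

A weighted sum keeps each factor's contribution easy to inspect, which matters when a reader wants to know why one relationship outranks another.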

What the Output Looks Like

The final dossier contains:

Evidence map — a network of entities and the evidence-weighted relationships between them
Consensus view — a narrative synthesis of the strongest evidence
Controversy log — a list of claims where the evidence is conflicting or insufficient
Open questions — gaps where the corpus contains no direct evidence
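
The four sections above can be pictured as a simple data structure. The sketch below is hypothetical throughout: the Relation and Dossier classes, the build_dossier function, and the 0.7 threshold are not taken from assay.it. It also simplifies in two ways: the consensus field lists the relationships a narrative would draw on rather than the narrative itself, and every entity pair without a graded relationship counts as an open question.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Relation:
    pair: tuple        # (entity_a, entity_b), canonical identifiers
    grade: float       # evidence grade in [0, 1]
    conflicting: bool  # True if the underlying claims disagree


@dataclass
class Dossier:
    evidence_map: dict = field(default_factory=dict)    # pair -> grade
    consensus: list = field(default_factory=list)       # strong, uncontested pairs
    controversy_log: list = field(default_factory=list) # conflicting or weak pairs
    open_questions: list = field(default_factory=list)  # pairs with no evidence


def build_dossier(entities, relations, strong=0.7):
    """Route each graded relation into the section it belongs to."""
    dossier = Dossier()
    for relation in relations:
        key = tuple(sorted(relation.pair))
        dossier.evidence_map[key] = relation.grade
        if relation.conflicting or relation.grade < strong:
            dossier.controversy_log.append(key)
        else:
            dossier.consensus.append(key)
    # Any pair of entities with no relation at all is a gap in the corpus.
    for pair in combinations(sorted(entities), 2):
        if pair not in dossier.evidence_map:
            dossier.open_questions.append(pair)
    return dossier


dossier = build_dossier(
    entities={"a", "b", "c"},
    relations=[
        Relation(("a", "b"), grade=0.9, conflicting=False),
        Relation(("c", "b"), grade=0.8, conflicting=True),
    ],
)
print(dossier)
```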

Practical Tips

Start with a focused research question; broad queries produce large corpora that take longer to synthesise
Use the entity resolution step to align terminology before synthesis
Always review the controversy log — disagreements in the literature are often the most interesting findings