Historical Science Methodology: Applying Natural Science Logic to History

The Problem: How to Science History

History appears unscientific. You can't run experiments: you can't rerun 1500 CE with different variables to test causation. You can't control variables: you can't hold geography constant while varying culture. You can't falsify claims rigorously: if your explanation doesn't work, you can always add qualifications. This makes history seem trapped in narrative logic: describing what happened, not explaining why.

But several sciences face the same constraints on their central questions. Evolutionary biology can't rerun the Cambrian explosion to test hypotheses. Geology can't cause continents to collide on demand. Astronomy can't trigger supernovae to test models. Yet evolutionary biology, geology, and astronomy are indisputably scientific.

Their method: treat the universe as an experiment in progress. Observe natural variation across cases. Identify variables that differ between cases. Show how variables correlate with outcomes. If the same outcome appears wherever the same variable is present, and is absent where the variable is absent, that is evidence for causation. This is the method Diamond applies to history: like evolutionary biology, it observes natural cases instead of running experiments.¹

Definition: Natural Experiments and Pattern Recognition

Historical Science as Experimental Observation

Experimental science runs controlled experiments: vary one factor, hold others constant, measure outcome. You can do this with chemistry (mix compounds, measure reaction) but not with history (can't rerun continents). Natural science observes patterns across naturally-varying cases instead. Evolution is a natural experiment: organisms vary, environments vary, outcomes (survival, reproduction) vary. By observing how variation in one dimension (trait) correlates with variation in another (survival), you can infer natural selection without running experiments. Historical science applies this: cases vary (continents differ geographically), outcomes vary (civilizations differ in technology), so you observe how variation in geography correlates with variation in outcomes.

Pattern Recognition as Evidence

In experimental science, evidence is a replicated experiment (you run a test 100 times and get the same result 100 times). In natural science, evidence is a consistent pattern across cases (you observe situation A and outcome X in multiple independent cases; you observe situation B and outcome Y in multiple independent cases; you conclude that the situation predicts the outcome). The pattern is only as strong as the number of independent cases. Five domestication centers aren't 100 experiments, but they are five independent instances of the same process (domestication → states). Consistency across all five suggests the pattern isn't an accident.¹
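One way to make "only as strong as the number of independent cases" concrete is to ask how often a perfect split would arise if outcomes were unrelated to the situation. The Python sketch below is illustrative only: the function name, and the assumption that every assignment of outcomes to cases is equally likely under chance, are introduced here and do not come from Diamond.

```python
from fractions import Fraction
from math import comb


def perfect_split_probability(n_a: int, n_b: int) -> Fraction:
    """Chance that outcome X lands on exactly the n_a situation-A cases.

    Assumption (ours): exactly n_a of the n_a + n_b cases show outcome X,
    and under pure chance every way of choosing which cases show X is
    equally likely. Only one of those choices matches the pattern perfectly.
    """
    if n_a < 0 or n_b < 0:
        raise ValueError("case counts must be non-negative")
    return Fraction(1, comb(n_a + n_b, n_a))


# Hypothetical case counts, chosen only to show how the number shrinks.
print(perfect_split_probability(3, 2))  # 1/10
print(perfect_split_probability(5, 5))  # 1/252
print(perfect_split_probability(5, 0))  # 1
```

Under these assumptions the last line is the instructive one: with no contrasting cases at all, a "perfect" pattern is guaranteed, so it is the mix of A cases and B cases that gives the comparison its weight.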

The Role of Mechanism

Observing a pattern isn't proof of causation—correlation could be coincidence or caused by a confounding variable. Science addresses this by proposing a mechanism: a causal story explaining how the pattern works. For natural selection: individuals with advantageous traits survive more often and reproduce more often, so advantageous alleles increase in frequency. This mechanism explains why trait variation would correlate with survival variation. For historical science: domesticable animals enable agriculture → agriculture enables surplus → surplus enables specialization → specialization enables states. This mechanism explains why animal availability would correlate with state formation. Mechanism plus consistent pattern across cases makes causal inference stronger.

Evidence: How Historical Science Works in Practice

Case 1: Plate Tectonics—The Method Itself

Plate tectonics can't be tested by direct experiment (no one can make continents collide), yet it is accepted scientific theory. The evidence came from observation: matching coastlines (South America's coast fits Africa's), fossil distribution (the same species appear on continents that are now separated), magnetic anomalies in the ocean floor (the striped pattern that seafloor spreading predicts), and paleomagnetic data (rock magnetization consistent with continents having moved). None of these observations individually proves plate tectonics, but together they build a consistent pattern that no competing hypothesis explains as well. Mechanism (convection currents in the mantle driving the plates) plus pattern plus parsimony (the theory explains multiple phenomena simply) make it scientific despite the absence of a direct experiment.¹

The lesson for history: Diamond uses the same method. Multiple independent lines of evidence (domestication timing, animal availability, continental axes, disease) all correlate with historical outcomes. No single observation proves causation, but together they build a consistent pattern supporting environmental determinism. This is historical science methodology in practice.

Case 2: Evolution by Natural Selection

Large-scale evolutionary change is too slow to observe directly, yet evolution is accepted scientific theory. Evidence: the fossil record (shows anatomical change over time), comparative anatomy (similar structures across species suggest common descent), biogeography (species distribution matches continental history), and molecular genetics (DNA similarity matches morphological similarity). Again, no single observation proves evolution, but a consistent pattern across independent lines of evidence makes it scientific. The method: show a pattern (species change over time) that appears across multiple independent cases (fossil records from different continents and different time periods), propose a mechanism (natural selection) that explains why the pattern would occur, and demonstrate that the mechanism makes testable predictions (if natural selection is correct, we should see trait variation within populations, and we do).¹

Case 3: Historical Diamond—Environmental Determinism

Diamond applies the same method:

Pattern: Continents with domesticable animals developed state-level societies; continents without did not.

Independent cases: Five domestication centers developed independently (Fertile Crescent, China, Mesoamerica, Andes, Sub-Saharan Africa).

Consistent observation: All cases with abundant domesticable animals developed states. All cases without developed more slowly or remained at chiefdom level.

Mechanism: Animals → agriculture → surplus → specialization → hierarchy → states.

Predictions:

Continents with east-west axes should show faster technology diffusion than north-south (testable: compare crop diffusion rates—confirmed)
Populations with 10,000+ years of disease exposure should show genetic disease resistance; naive populations should show no resistance (testable: compare allele frequencies—confirmed)
Islands with larger resource bases should support higher population density (testable: compare Polynesian island populations—confirmed)

The pattern holds. The mechanism explains why. The predictions are verified. This is how history becomes science: showing that a proposed causal variable predicts outcomes across multiple independent cases, with mechanism explaining why, and predictions being verified.
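The case-comparison logic above can be sketched as a small table check in Python. Everything in it is a placeholder: the `Case` record, the function names, and the four labeled cases are hypothetical and stand in for whatever coding of real cases a historian would defend. None of the values is data from Diamond.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    has_predictor: bool  # e.g. abundant domesticable animals (hypothetical coding)
    has_outcome: bool    # e.g. state-level society (hypothetical coding)


def exceptions(cases):
    """Cases where the predictor and the outcome disagree."""
    return [c for c in cases if c.has_predictor != c.has_outcome]


def consistency(cases):
    """Fraction of cases in which the predictor matches the outcome."""
    if not cases:
        raise ValueError("need at least one case")
    return 1 - len(exceptions(cases)) / len(cases)


# Placeholder cases for illustration only; not historical data.
cases = [
    Case("case_A", True, True),
    Case("case_B", True, True),
    Case("case_C", False, False),
    Case("case_D", False, True),  # a hypothetical counterexample
]

print(consistency(cases))                   # 0.75
print([c.name for c in exceptions(cases)])  # ['case_D']
```

Listing the exceptions by name, rather than reporting only the score, keeps the claim accountable: each exception is a case the proposed mechanism still has to explain.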

Tensions: Is Historical Science Really Scientific?

Tension 1: Natural Experiments vs. Designed Experiments

Designed experiments (laboratory) let you isolate variables perfectly. Natural experiments (observing cases) introduce confounds. You can't control for every possible variable when comparing continents. Maybe geography doesn't determine outcomes—maybe some third variable (not yet identified) determines both geography and outcomes, making them correlate without causation. How certain can you be when you can't run controlled experiments?

The answer: you can't achieve laboratory-level certainty. But you can achieve better than random guessing. By showing consistent patterns across multiple independent cases and proposing plausible mechanisms, you narrow the field of possible explanations. This is how all natural science works: it's probabilistic, not certain, because the universe doesn't run controlled experiments.
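The confounding worry can be illustrated with a toy simulation. In the Python sketch below, a hidden variable drives both the supposed cause and the outcome, and the supposed cause has no effect of its own. The variable names, noise levels, and sample size are arbitrary assumptions chosen for illustration. Real historical comparisons have only a handful of cases, and the hidden variable is usually unmeasured.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n_cases=2000, seed=0):
    """Return (raw, adjusted) correlations when a hidden variable drives both.

    The supposed cause has no effect on the outcome; both merely track the
    hidden variable plus independent noise.
    """
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n_cases)]
    cause = [z + rng.gauss(0, 0.5) for z in hidden]
    outcome = [z + rng.gauss(0, 0.5) for z in hidden]
    raw = pearson(cause, outcome)
    # Remove the hidden variable's contribution, which is only possible
    # here because the simulation knows it.
    adjusted = pearson(
        [c - z for c, z in zip(cause, hidden)],
        [o - z for o, z in zip(outcome, hidden)],
    )
    return raw, adjusted


raw, adjusted = simulate()
print(f"raw correlation: {raw:.2f}")
print(f"after removing the hidden variable: {adjusted:.2f}")
```

The raw correlation is strong and the adjusted one is near zero, which is why a proposed mechanism and independent lines of evidence matter: they are what a comparison of cases can offer in place of the adjustment step a laboratory would perform.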

Tension 2: Pattern vs. Exception

Historical science identifies patterns (domesticable animals → states), but patterns have exceptions (Japan rejected firearms despite having access; the Inca built states without horses). Does an exception disprove the pattern? In laboratory science, one contradictory result can in principle invalidate a hypothesis. In natural science, exceptions are expected because patterns are probabilistic, not absolute. The question becomes: does the pattern hold often enough to be meaningful? If 90% of cases with domesticable animals develop states, and 10% do not, the pattern is still predictive. But how do you set the threshold for acceptance?¹
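One conventional way to approach the threshold question is to ask how surprising the observed level of agreement would be under chance. The Python sketch below computes an exact binomial tail probability. The function name, the 50% chance rate, and the assumption that cases are independent are all introduced here for illustration; the source does not prescribe this test.

```python
from fractions import Fraction
from math import comb


def tail_probability(n_cases, n_consistent, p_chance=Fraction(1, 2)):
    """P(at least n_consistent of n_cases fit the pattern) under chance.

    Assumes (ours) that cases are independent and that each fits the
    pattern by chance with probability p_chance.
    """
    if not 0 <= n_consistent <= n_cases:
        raise ValueError("n_consistent must be between 0 and n_cases")
    return sum(
        comb(n_cases, k) * p_chance**k * (1 - p_chance) ** (n_cases - k)
        for k in range(n_consistent, n_cases + 1)
    )


# 9 of 10 hypothetical cases fit: the "90% with exceptions" situation.
print(tail_probability(10, 9))  # 11/1024
# 5 of 5 hypothetical cases fit.
print(tail_probability(5, 5))   # 1/32
```

Under these assumptions, one exception in ten cases leaves the pattern hard to attribute to chance (about 1%), while a perfect record across only five cases is weaker evidence (about 3%). The acceptance threshold itself remains a judgment call; the calculation only makes explicit what is being traded off.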

Tension 3: Explanation vs. Description

Some philosophers argue that natural science (observing patterns) is "description" while experimental science (running experiments) is "explanation." Diamond describes patterns in history; he doesn't explain them by creating causal conditions and observing outcomes. Are the patterns he identifies real explanations or just sophisticated descriptions? The answer hinges on whether mechanism plus pattern across cases counts as explanation. Most scientists say yes—if you can describe a mechanism that would produce the observed pattern, and the pattern holds across independent cases, you have explanation. But the philosophical debate persists.

Author Tensions & Convergences

Diamond's use of natural science methodology is innovative but faces criticism: historians argue he oversimplifies by ignoring culture and individual agency; scientists argue he doesn't achieve true experimental rigor. Diamond's response (implicit in the book) is that perfect rigor is impossible in history, so the goal is better-than-narrative explanation. By identifying causal variables that predict outcomes across cases and proposing plausible mechanisms, he's doing the best history can do. This is a modest but defensible position: not claiming certainty, but claiming causal insight superior to narrative description.¹

Cross-Domain Handshakes

Philosophy of Science: Demarcation and Scientific Method

Demarcation Problem and Scientific Method: Philosophy of science struggles with demarcation: what counts as science and what does not? A strict experimentalist standard says science requires controlled experiments; if you can't experiment, you can't do science. This would exclude evolutionary biology, geology, and astronomy. Popper offered a different criterion: science requires falsifiability and testing against evidence, not necessarily experiments. You can test theories by observing nature (observing many cases and checking whether predictions hold). This opens space for natural science: comparing cases, identifying patterns, testing predictions.

Diamond's historical science fits this refined definition: proposing falsifiable claims (if domesticable animals determine development, all continents with animals should show states), testing predictions against evidence (showing the pattern holds), and revising if evidence contradicts. The insight that transfers: science is defined by method (falsifiability, testing against evidence, pattern identification), not by tool (whether you use experiments or observation). History can be scientific if historians identify testable claims and check them against evidence across cases.

Evolutionary Biology: Natural Selection as Natural Experiment

Natural Selection in Human Populations: Evolution is the canonical natural science operating largely without experiments. For the history of life, evolutionary biologists can't dictate which mutations arise or which selection pressures operate; they observe variation (traits differ), environments (selection pressures vary), and outcomes (some traits persist, others vanish). From repeated observation across millions of cases (individuals) and millions of years (time), a pattern emerges: trait variation correlates with survival variation, which is explained by natural selection. The method is identical to Diamond's: identify a variable (trait variation), observe its correlation with an outcome (survival), propose a mechanism (differential reproduction), and test predictions (if natural selection is correct, populations should show adaptation to their local environment, and they do).

The insight that transfers: historical science and evolutionary science use the same methodology because both face the constraint of not being able to run controlled experiments on the past. Both observe natural variation, identify patterns, propose mechanisms, and test predictions. The disciplines differ in subject matter (history vs. biology) but not in scientific method.

The Live Edge

The Sharpest Implication

If historical science methodology is valid, then history is a testable discipline—historians can make falsifiable claims and test them against evidence. This means historians who claim causation (civilization X developed because of Y) should be able to specify: what would I observe if I were wrong? If domesticable animals cause state formation, then continents without domesticable animals should show low state formation (or clear obstacles to it). If this prediction is false—if continents without domesticable animals spontaneously develop states—then the theory is wrong. By proposing falsifiable claims, historians become accountable to evidence in a way narrative storytelling isn't. This is simultaneously liberation and constraint: liberation because history becomes genuinely scientific rather than descriptive; constraint because falsifiability means you can be wrong in ways that narrative description cannot be. The uncomfortable implication: history-as-narrative is more flexible than history-as-science because science requires accountability to evidence, but history-as-science is more powerful than narrative because it identifies causal variables that narrative obscures.

Generative Questions

Can case-comparison methods identify variables that cause outcomes at smaller scales (city-level, valley-level history) or only at continent-scale?
If historical science requires multiple independent cases to identify patterns, what historical topics have too few cases to be scientifically testable? (e.g., there's only one Egypt—can you test theories about why Egypt developed as it did?)
Does the existence of exceptions to a pattern undermine the pattern's explanatory power? If some continents without domesticable animals develop complex societies, does that refute environmental determinism?

Connected Concepts

Proximate vs. Ultimate Causation — the framework enabling falsifiable historical claims
Continental Case Study Framework — the specific application of historical science to history
Natural Selection in Human Populations — parallel natural science methodology

Open Questions

Is there a meaningful difference between "history as natural science" and "history as correlation-hunting"? When does identifying a correlation become explaining a cause?
Can historians ever achieve the predictive power that evolutionary biology has? (Evolution can predict that populations exposed to selection pressure will evolve resistance; can history predict that continents with domesticable animals will develop states?)
What happens to historical science when variables interact? If domesticable animals matter and continental axes matter and disease matters, how do you weight their relative contributions?

Footnotes