Not long before his death, W.D. Hamilton went apeshit.
Hamilton was an expositor of the “Red Queen hypothesis”, which has as a corollary that humans are in a constant state of evolutionary warfare with infectious diseases. This interest in infectious disease overlapped in time with the then-raging AIDS pandemic of the 1990s, and Hamilton elected to investigate the OPV hypothesis: the idea that pandemic HIV began through the careless use of infected chimpanzee tissue in oral polio vaccines in the 1950s.
What’s a real scientist to do with a hypothesis but no data to test it? He goes and gets data. Hamilton did not outrun the Red Queen; he died after contracting malaria in the Congo while searching for viral precursors to HIV in chimpanzee feces he collected in service of this rather remote hypothesis.
Instead of giving a lot of background on the OPV hypothesis, I’ll just link to Jonatan Pallesen’s Twitter article from a few months ago, which I ended up engaging with a bit.
Pallesen’s article is a highly condensed summary of the plausibility of the OPV theory, and to be forthright, it took a bit of courage to post given how inflammatory the subject is; it sits in a city that’s been firebombed by information warfare for 30 years.
Like it or not, Michael Worobey, of Huanan Seafood Market fame, is one of the heirs to Hamilton’s work, and early in his career he worked to expand Hamilton’s chimpanzee shit collection.
Given the simultaneous weakness and overconfidence of Worobey’s COVID origins contributions, here and here, his publication history needs to be scrutinized. It needs to be doubly scrutinized where it uses geospatial arguments to make “dispositive” claims about disease origins. It needs to be triply scrutinized where it employs any type of serious data analysis to “refute” things, and so on.
To this end, the rest of this poast will serve as a reproduction and examination of Worobey et al., 2004 entitled “Contaminated polio vaccine theory refuted”.
This brief paper describes an expansion of chimp shit hunting in the jungle near(ish) to Kisangani (formerly Stanleyville). This is where Koprowski’s vaccine was allegedly contaminated with SIV-infected chimp kidneys in tissue culture. If the OPV theory is correct, Worobey argues, the staff at Koprowski’s lab there would have used locally procured chimpanzees to create the cultures, and the feces of those chimpanzees’ descendants would potentially harbor descendants of the SIV that contaminated the vaccines. This approach is similar to what Hamilton was attempting; if you can find and sequence SIV viruses in the area around Kisangani, you can phylogenetically place them as direct ancestors of HIV or not. The not, Worobey claims, refutes the OPV hypothesis.
Hamilton never found SIV in any of the samples he collected. In the “Parisi forest”, Worobey gathered 97 samples and in a single sample was able to amplify a 699bp segment of a gp41/nef region and a ~400bp segment of another region. The former will be the basis of the rest of this analysis.
The Good
I’d like to open by praising this entire research enterprise. I’m purely an armchair and computer guy. I get annoyed when my cleaning lady is two hours late. The idea of going to dangerous and remote parts of Africa to hunt for ape shit to test a remote theory is completely removed from my life, and I say that in a good way. I like the fact that this sampling isn’t being done in the name of “pandemic preparedness” or whatever fashionable tripe has poisoned infectious disease science for the past 20 years. For this, Hamilton, Worobey, and their colleagues deserve a measure of respect.
The aforementioned virus that Worobey is able to partially sequence is called SIVcpzDRC1 (“SIV” for “simian immunodeficiency virus”, “cpz” for “chimpanzee”, and “DRC1” for “Democratic Republic of Congo”). What’s interesting about this is that, unlike Worobey’s colleagues in e.g. Santiago et al., they don’t go to one of the national parks in Africa with well-characterized chimp communities to do this sampling. They have to be closer to Kisangani, the site of the vaccine trials in the 1950s, and as far as I’m aware, all the chimps there are shy and non-habituated to humans. The location where Worobey reports getting his sample is “the Parisi forest”, which doesn’t return any search results on Google Earth. They are way, way in the bush.
The phylogenies
Worobey creates a phylogeny of 18 SIV and HIV sequences, the latter representing the deepest clades of all known HIV-1 strains. His preferred phylogeny looks like this.

Reproduction of this is difficult. He gives the accessions for the sequences in the tree in the supplementary material, but there’s no alignment, no naming bridge between Genbank and the figure taxa. Of course there’s no code.
Anyway, I was able to reproduce this figure to an acceptable degree of fidelity.

To ensure the finding that SIVcpzDRC1 clusters away from the HIV samples holds up, this figure needs to be robust to changes in the alignment and to changes in the chosen outgroup. I used two different alignments. The outgroup choice made me the most nervous since it uses a strain, “SIVgsn”, which no one has ever heard of. I redid the phylogeny rooting on SIVsmm (sooty mangabey) and on SIVgor (gorilla).
Wow, look at the gorilla one! Group O of HIV-1 is probably from gorillas, not from chimpanzees. Gürtler et al. identified group O in 1994, but I don’t believe it was then understood that this probably originated in gorilla SIV. This is another mark against the OPV hypothesis: group O is an existence proof that HIV-1 can be transmitted to humans by zoonosis, and no one believes gorilla organs contaminated oral vaccines. With such zoonosis, the bushmeat thesis in general looks more cohesive. In The River, Edward Hooper uses the word “gorilla” twice, and in neither case does it reference this group of HIV-1.
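The robustness question above boils down to one check repeated per rooting: does SIVcpzDRC1 fall outside the smallest clade that contains the HIV-1 taxa? Here is a minimal sketch of that check in plain Python. The trees are toy nested tuples with placeholder taxon names that I made up for illustration; they are not the trees from the paper or from my reproduction, and a real run would load trees produced by the phylogenetics software instead.

```python
# Toy rooted trees as nested tuples; leaves are strings.
# All taxon names and topologies below are hypothetical placeholders.

def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    out = set()
    for child in tree:
        out |= leaves(child)
    return out

def smallest_clade(tree, taxa):
    """Return the leaf set of the smallest clade containing all of `taxa`."""
    taxa = set(taxa)
    if not isinstance(tree, str):
        for child in tree:
            if taxa <= leaves(child):
                # The whole ingroup sits under this child, so descend.
                return smallest_clade(child, taxa)
    return leaves(tree)

def clusters_outside(tree, ingroup, query):
    """True if `query` lies outside the smallest clade spanning `ingroup`."""
    return query not in smallest_clade(tree, ingroup)

HIV1 = ["HIV1_a", "HIV1_b"]
QUERY = "SIVcpzDRC1"

# One toy tree per outgroup choice, plus one deliberately "bad" topology.
toy_trees = {
    "rooted_on_SIVgsn": ("SIVgsn", ((QUERY, "SIVcpz_x"), (("HIV1_a", "HIV1_b"), "SIVcpz_y"))),
    "rooted_on_SIVsmm": ("SIVsmm", ((QUERY, "SIVcpz_x"), (("HIV1_a", "HIV1_b"), "SIVcpz_y"))),
    "query_inside_HIV1": ("SIVgsn", ((("HIV1_a", QUERY), "HIV1_b"), "SIVcpz_x")),
}

results = {name: clusters_outside(t, HIV1, QUERY) for name, t in toy_trees.items()}
for name, outside in results.items():
    print(f"{name}: query outside HIV-1 clade = {outside}")
```

The first two toy trees pass and the third fails on purpose, which is the pattern you would look for when swapping outgroups or alignments: the answer should not flip.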
The SIVcpzDRC1 sequence
I’m no expert in forensic bioinformatics, but I did do a manual alignment of the gp41/nef region against the panel. Nothing looked terribly amiss to me. SIVcpzDRC1 does have the lowest GC content of any virus in the panel, and it looks a bit ATT-enriched, but that’s it.
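For anyone who wants to repeat the composition check, here is a rough sketch of how GC content and ATT frequency could be computed per sequence. The function names and the three short sequences are hypothetical stand-ins, not the real gp41/nef panel, and the windowing choice (overlapping trinucleotides, gaps stripped) is my assumption rather than a standard.

```python
def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # skips gaps and Ns
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)

def kmer_fraction(seq, kmer):
    """Fraction of overlapping windows of len(kmer) equal to `kmer`."""
    seq = seq.upper().replace("-", "")  # drop alignment gaps first
    k = len(kmer)
    windows = len(seq) - k + 1
    if windows <= 0:
        return 0.0
    hits = sum(seq[i:i + k] == kmer for i in range(windows))
    return hits / windows

# Hypothetical toy panel; replace with the real aligned sequences.
panel = {
    "toy_query": "ATTATTGCAATTTA",
    "toy_ref_1": "GCGCATGCCGTAGC",
    "toy_ref_2": "ATGCGCATTAGCTA",
}

# Rank from lowest to highest GC content.
ranked = sorted(panel, key=lambda name: gc_content(panel[name]))
for name in ranked:
    seq = panel[name]
    print(f"{name}: GC={gc_content(seq):.3f} ATT={kmer_fraction(seq, 'ATT'):.3f}")
```

Whether a given gap in GC or ATT frequency is meaningful would still need a comparison against the spread across the panel; this only produces the raw numbers.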

I also used an IQ-Tree-based likelihood test. The idea here is that if you add the new SIVcpzDRC1 sequence to the alignment and assume it’s fraudulent or corrupted, the likelihood of the inferred tree will be lower than the likelihood of the inferred tree without it.
That doesn’t happen here. IQ-Tree judges the tree with SIVcpzDRC1 to be a little more likely than the tree without it. This doesn’t prove the sequence isn’t made up, but the crudest forms of fraud become much less likely.
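The bookkeeping for this comparison can be sketched as follows. This assumes each IQ-Tree run leaves a text report containing a line of the form "Log-likelihood of the tree: <value> (s.e. <value>)"; check that against the output of your own IQ-Tree version before relying on it. The report snippets and numbers below are placeholders I invented, not the values from my runs, and the sketch simply subtracts the two raw scores without addressing how runs with different numbers of sequences are made comparable.

```python
import re

# Assumed report line format; verify against your IQ-Tree version.
LNL_PATTERN = re.compile(r"Log-likelihood of the tree:\s*(-?\d+(?:\.\d+)?)")

def parse_log_likelihood(report_text):
    """Extract the tree log-likelihood from an IQ-Tree-style report string."""
    match = LNL_PATTERN.search(report_text)
    if match is None:
        raise ValueError("no log-likelihood line found in report")
    return float(match.group(1))

def likelihood_delta(report_with, report_without):
    """Log-likelihood with the query sequence minus log-likelihood without it."""
    return parse_log_likelihood(report_with) - parse_log_likelihood(report_without)

# Placeholder report fragments (made-up numbers, for illustration only).
toy_report_with = "Log-likelihood of the tree: -1000.5000 (s.e. 20.0000)\n"
toy_report_without = "Log-likelihood of the tree: -1002.2500 (s.e. 21.0000)\n"

delta = likelihood_delta(toy_report_with, toy_report_without)
print(f"delta lnL (with - without) = {delta:.4f}")
```

A positive delta corresponds to the situation described above, where the tree including the query sequence scores higher than the tree without it.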
Coda
Taken on its face, the data analysis in “Contaminated polio vaccine theory refuted” seems to hold together; it’s robust to changes in modeling specifications and the sequence in question doesn’t show obvious signs of manipulation. I’d be interested if someone smart weighed in on what the low GC content might mean or if the ATT-enrichment is significant.
I’ll be following this up with two more poasts called “The Bad” and “The Ugly”, which are less flattering toward Worobey’s work on the AIDS origin subject.