Last week, journalists rushed to report on previously undisclosed genetic evidence that mammals sold at the Huanan Seafood Wholesale Market in Wuhan, China—possibly raccoon dogs—might have sparked the COVID-19 pandemic. But to the chagrin of the researchers who conveyed their findings confidentially to a World Health Organization (WHO) advisory group on 14 March, the news broke before they had finished analyzing the data, which consist of RNA and DNA sequences collected at the market in early 2020. Yesterday, however, they posted their complete 22-page report on Zenodo, an open repository of scientific research.
To the report’s authors, 19 evolutionary biologists from six countries, the data support the theory that SARS-CoV-2–susceptible mammals were in the right place at the right time to have passed the virus to humans, triggering the pandemic. And they and others, including WHO’s director-general, have blasted China for not sharing the Wuhan market data sooner.
But critics, many of whom suspect SARS-CoV-2 may have escaped from the Wuhan Institute of Virology (WIV), say the new sequences offer no great insight beyond the confirmation that the seafood market also sold mammals. It is “just preposterous” to suggest this is evidence that animals were actually infected with SARS-CoV-2 and transmitted it to humans, computational biologist Erik van Nimwegen says. In a 2021 letter in Science, he and 17 other scientists—including two who issued the new report—called for a “balanced consideration” of the lab-leak hypothesis.
Several of the new report’s co-authors published two papers in Science in 2022 that pinned the pandemic’s origin on mammals sold at the Wuhan market, stressing that it is one of just four places in the city that sold wildlife susceptible to SARS-CoV-2. Those conclusions are bolstered by the new report, its authors say. “These arguments stand in stark contrast to the absence of evidence for any other SARS-CoV-2 emergence route,” their report concludes.
Regardless of how readers weigh the import of the new data, the Zenodo report clarifies details in some of the original media accounts and offers several fresh insights into the latest COVID-19 origin uproar. Below, Science examines some of the key issues.
How exactly were the data from 2020 found?
Chinese researchers uploaded the sequencing data from their market samples to GISAID, a virology database, in June 2022, in support of a preprint they had posted a few months earlier. The data were originally hidden from other GISAID users but became accessible in January. Florence Débarre, an evolutionary biologist at CNRS, the French national research agency, who has become prominent on social media for her analyses of COVID-19 data and for sparring with lab-leak proponents, says she stumbled onto the sequences and shared them with colleagues. Their analysis found the evidence for coronavirus-susceptible mammals at the market.
So why aren’t those genetic sequences public now?
A day after Débarre and colleagues told a member of the Chinese team what they had found, GISAID made the data invisible, apparently at the submitter’s request. In the Zenodo report, the researchers state their analysis is “not intended for publication in a journal” or meant to scoop the Chinese team’s paper, which is under review by the Nature family of journals. As the report explains, they have downloaded the data but are not making them public yet in hopes the Chinese researchers will soon do so. Still, the report states that the group released its analysis because the scientists feel an “unreasonable” amount of time has passed without the sequences going public. “Because the data have been removed from GISAID, we cannot share them. But I wish that other scientists could explore these data, which are very rich,” Débarre says. “The more people work on these data, the more we can make them speak.”
What’s GISAID’s role in this?
The repository has become a key source for SARS-CoV-2 data, helping scientists analyze the evolution of variants and other aspects of the pandemic. But Débarre and colleagues take it to task in their report, accusing it of having “deviated from its stated mission” of speeding the sharing of virological data. For its part, GISAID claimed in an initial statement that the researchers had violated its access agreements and said it had “temporarily suspended” their access to the database. The disagreement came down to GISAID’s assertion that the researchers did not follow its access agreement or “make best efforts to collaborate” with the Chinese team.
After the suspension of the report’s authors, Michael Worobey, an evolutionary biologist at the University of Arizona who was one of the corresponding authors, replied to GISAID on behalf of the team. He provided emails sent to two of the Chinese researchers on 11 March in which his group sought to collaborate. The team also made “multiple verbal entreaties,” Worobey wrote, and sent direct messages in a Zoom chat to the Chinese team during the 14 March WHO-sponsored meeting. In a subsequent email to the group, on 22 March, GISAID said that, as a “show of goodwill,” it would lift the access restrictions and “review all evidence from both parties.”
What does the genetic evidence in the report say?
Chinese researchers, as well as the Chinese government, have waffled about whether mammals were for sale at the market. The new data arguably provide the strongest evidence yet that key SARS-CoV-2–susceptible species were there when COVID-19 emerged. The 2020 Chinese team that visited the market collected 923 “environmental samples” from the market stalls’ containers, surfaces, and drains. In their report, Débarre and colleagues say 49 of those samples that tested positive for SARS-CoV-2 RNA also contained mitochondrial DNA (mtDNA) that clearly identified five mammals: the common raccoon dog, Malayan porcupine, Amur hedgehog, masked palm civet, and hoary bamboo rat. They also found other DNA, as well as RNA, from the mammals. “The co-occurrence of SARS-CoV-2 virus and susceptible animal RNA/DNA in the same samples, from a specific section of the Huanan market, and often at greater abundance than human genetic material, identifies these species, particularly the common raccoon dog, as the most likely conduits for the emergence of SARS-CoV-2 in late 2019,” the authors wrote. The group produced a “heat map” that shows the density of SARS-CoV-2 was “hottest” in market areas near stalls that sold the mammals.
Why are raccoon dogs receiving so much attention?
Experiments have shown SARS-CoV-2 easily infects raccoon dogs—commonly raised for fur in China, but also sold for meat in “wet” markets like the one in Wuhan—and that they shed high levels of the virus. The report describes finding raccoon dog mtDNA in six samples from two different stalls in the Wuhan market. A sample from a cart that tested positive for SARS-CoV-2 also had “abundant” raccoon dog genetic material. Far less human genetic material was found in the same sample. The researchers say this suggests—but doesn’t prove—the raccoon dog or dogs on the cart were more likely to have spread the virus than humans working the stall or shopping near it. When they compared the mtDNA in the market samples with sequences previously reported by other scientists, the closest match came from a wild raccoon dog, which is distinct from the subspecies raised for fur. This suggests that if raccoon dogs introduced the virus to the market, researchers investigating COVID-19’s origins should look to China’s wildlife trade, not the fur farms.
Do the market locations of the genetic sequences mean anything?
The mtDNA of the SARS-CoV-2–susceptible mammals was found in the southwest corner of the market, which also had the “highest density” of SARS-CoV-2–positive samples. “We can now show that plausible animal hosts of SARS-CoV-2 were indeed right where we thought they were, in the small quadrant with the highest concentration of surfaces found to be positive for the virus and a hot spot for live mammal sales,” Worobey says. Human mtDNA was most abundant in other parts of the market. This raises the possibility that the animals transmitted the virus to the humans in the southwest corner of the market several weeks before the samples were taken, and those people had since recovered from their infections. The other infected people at the market, in this scenario, likely were infected by human-to-human transmission.
What are other scientists saying about the newly uncovered sequences?
Based on interviews—ScienceInsider reached out to all 18 authors of a 2021 letter calling for a “balanced consideration” of the lab-leak hypothesis—and social media reactions, the new findings have persuaded some researchers that animals at the market were the likely source of SARS-CoV-2. But for most, the findings have not changed where they stand on the origin debate. Division also remains on the appropriateness of Débarre and colleagues analyzing the Chinese data. Some who agreed with the WHO director-general that the Chinese researchers should have shared the data long ago applauded the group’s attempt to force the disclosure of the market sequences, whereas others have questioned the ethics of discussing the data before the Chinese team publishes its own paper.
Ravindra Gupta, an infectious disease specialist at the University of Cambridge, says the report, combined with the Science papers by many of the same authors, has led to a shift in his thinking. “It has certainly pushed me toward the animal origin,” says Gupta, who still wonders about the proximity of WIV to the outbreak. “This is very convincing in terms of leading us to the sort of animal reservoir source. Of course, it’s an association, and it’s not a causal relationship necessarily, but I think this is fantastic to see this.”
Akiko Iwasaki, an immunologist at Yale School of Medicine, also signed the 2021 letter calling for exploration of the lab-leak scenario and, like van Nimwegen, is not particularly impressed by the new sequence data. “It does not influence the likelihood of different hypotheses in my mind,” she says. But, she stresses, “Any and all relevant data from the market, and the surrounding areas—including the virology laboratories—should be made public.”
A more generous view comes from David Relman, a Stanford University microbiologist who also signed the letter. “If verified, mtDNA from animals found in swabs whose provenance was confirmed would be helpful,” Relman says. And China’s failure to share these data earlier dismays him. “I think there are likely to be lots of relevant data and other information that have not yet seen the light of day—of relevance to both major hypotheses.”
Jesse Bloom, a virologist who organized the 2021 letter, wrote in a Twitter thread that because the market samples were collected in early 2020, they date from many weeks after the first cases of COVID-19 likely occurred in Wuhan, in November 2019 or at the very start of December. “These data don’t tell us how pandemic began, but every bit of information helps,” Bloom wrote.
Bloom, who published his own paper about SARS-CoV-2 sequences from Chinese researchers that were removed from a different database, also decried that the market sequences are once again hidden. “As general principle, I think that all scientific data related to the early outbreak in Wuhan should be made available,” he tells Science in an email. “So it’s frustrating that despite there now being two public analyses related to these data (the Chinese CDC preprint and the [new] report), the data are still not available.”
Will more unexplored sequencing data from the market emerge?
Worobey and his colleagues also found samples for which the “raw sequencing files” are not available. GISAID said in its statement that the researchers had downloaded “an incomplete portion of these data,” and indicated that a more “complete and updated data set will be made available as soon as possible to all GISAID users.” It did not say when these data might become public. “All of these missing data could provide valuable information on the timeline of events at the Huanan market and the provenance of the virus,” the report says.