This year marks the 20th anniversary of the publication of “What’s in a Picture? The Temptation of Image Manipulation,” where I first discussed the growing issue of image manipulation in biomedical research. Two decades later, while there is greater awareness and more effort to combat this issue, it remains a significant problem within the biomedical literature. (Note: Throughout this piece, I use “image manipulation” as a generic term covering both manipulation in the narrow sense, such as copy/paste, erasure, and splicing, and image duplication.)

In 2002, as the managing editor of The Journal of Cell Biology (JCB), I witnessed the transition from paper submissions to digital formats in STM journals. We had just implemented online manuscript submission, and authors frequently submitted figure files in incorrect formats. One day, while assisting an author by reformatting some figure files, I noticed sharp lines around some of the bands in a Western blot image panel, indicating possible manipulation—either through copy/paste or selective intensity alteration.

My immediate reaction was alarm: “This is going to be a problem. We need to address this.” With the support of then-editor-in-chief Ira Mellman, I initiated a policy to scrutinize all figure files in accepted manuscripts for evidence of manipulation before publication. Together with three colleagues—Rob O’Donnell, Erinn Grady, and Laura Smith—I developed a simple screening method involving visual inspection using brightness and contrast adjustments in Photoshop to highlight background inconsistencies or duplications that could signal manipulation.
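A minimal Python sketch of the idea behind such brightness and contrast adjustments follows. It is not the Photoshop workflow described above. The function name `stretch_contrast`, the pixel values, and the chosen intensity window are all hypothetical, picked only to show how stretching a narrow range of intensities can make a faint background mismatch obvious.

```python
def stretch_contrast(pixels, low, high):
    """Linearly map the intensity window [low, high] onto [0, 255].

    Values at or below `low` become 0 and values at or above `high`
    become 255, so small differences inside the window are exaggerated.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    scale = 255.0 / (high - low)
    return [
        [min(255, max(0, round((value - low) * scale))) for value in row]
        for row in pixels
    ]


# Hypothetical 8-bit grayscale patch: a near-white background (250) with a
# pasted-in region whose background is very slightly darker (247).
patch = [
    [250, 250, 250, 250],
    [250, 247, 247, 250],
    [250, 247, 247, 250],
    [250, 250, 250, 250],
]

# Stretch only the top of the intensity range, where the background lives.
enhanced = stretch_contrast(patch, low=245, high=250)
for row in enhanced:
    print(row)
# The 3-level difference (250 vs 247) becomes 255 vs 102.
```

In the original patch the two backgrounds differ by 3 gray levels out of 255, which is hard to see by eye. After the stretch, the pasted region stands out as a sharply bounded rectangle.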

By then, it was evident that image manipulation was a widespread issue, one that all journals could and should tackle within their production workflows. Though implementing this screening process came with personnel costs, I believed these were justified to protect the integrity of the scientific record.

As far as I know, JCB was the first scholarly journal to implement systematic image screening. While concerns about digital image manipulation in biomedical research had been raised before, in 2002, no one had yet taken systematic steps to detect and prevent it from infiltrating the published literature.

We announced JCB’s new image screening process in an editorial published in September 2002, where I also urged principal investigators to closely examine image data in manuscripts from their labs before submission as another measure to prevent image falsification.

After six months of observing authors’ practices, we introduced the first journal guidelines for handling digital images in July 2003. That December, I discussed with Ken Yamada, a PI at the NIH, ways to share JCB’s image screening policies and practices more broadly. Joan Schwartz, then the assistant director for ethics and education at the NIH, invited us to write an article for the NIH Catalyst, which led to the publication of “What’s in a Picture?” in May 2004, later reprinted in JCB.

Following the article’s publication, I spoke at a conference on research on research integrity sponsored by the Office of Research Integrity (ORI). Attendees debated the prevalence of research misconduct, as no systematic investigation had been conducted. Estimates at the time suggested that between one in 100 and one in 100,000 researchers had engaged in some form of misconduct. Based on JCB’s image screening, I believed the incidence of image manipulation alone was at the higher end of that range: about one in 100 accepted manuscripts showed manipulation that affected data interpretation.

This talk led to an invitation to speak to the Division of Investigative Oversight at ORI in January 2005. The meeting concluded with a consensus that image manipulation was a significant problem that biomedical research journals would eventually have to address. It also ended with a question from Chris Pascal, then the ORI director: How could we encourage other journals to address this issue? The answer was that a serious case of image manipulation in a high-profile journal would likely prompt action. Then, the Hwang case emerged.

In 2004 and 2005, Woo Suk Hwang and colleagues published two papers in Science claiming the production of human embryonic stem cells. These papers came under scrutiny after a whistleblower in the lab alerted the Korean media to potential ethical violations and data falsification, including image manipulation. After an institutional investigation, both papers were retracted in January 2006.

The Hwang case shook the biomedical research and publishing communities, bringing attention to JCB’s image screening program, which might have detected some of the manipulation in the Science articles before publication. Over the following years, my colleagues and I at Rockefeller University Press trained approximately 25 other publishers in our visual screening techniques, and many—including the publishers of Science—began systematic image screening before publication.

However, other stakeholders—researchers, funders, institutions, and many publishers—remained largely unaware of the extent of the problem and their vulnerability to this form of misconduct, seeming to dismiss it as a minor issue not worth public attention.

In 2010, journal editors began receiving emails from an anonymous whistleblower, “Clare Francis,” who appeared to be combing through the published literature for examples of image manipulation. Initially, many of these allegations seemed bizarre and lacked merit, leading editors to disregard anonymous whistleblowers. This unfortunate attitude hindered efforts to get journals to address image manipulation seriously.

In July 2012, Paul Brookes of the University of Rochester started a personal blog, Science-Fraud.com, where he anonymously posted allegations of image manipulation. Within six months, the site was shut down due to threats of litigation. However, during that time, neuroscientist Brandon Stell founded PubPeer.com, a public forum for post-publication peer review where members could comment on published articles, often anonymously. The right to anonymity on PubPeer has since been upheld in court. Over the past 12 years, PubPeer has grown, and a substantial portion of the comments involve allegations of image manipulation in the biomedical literature.

One prominent poster on the site, Elisabeth Bik, has made thousands of allegations of image manipulation. She gained recognition in 2016 with a landmark paper on the frequency of image duplication in published biomedical research, analyzing 20,000 articles. The rate of image duplication unlikely to be due to clerical error (2% of articles screened) was comparable to the rate of manipulation we had seen at JCB (1% of articles screened).

Bik’s work has inspired a community of “image sleuths” who scour the literature for examples of image manipulation. The sheer volume of their findings has highlighted the extent of the problem. The public nature of these allegations has increased concern among stakeholders about reputational damage, making them more likely to take action. How the sleuths choose which papers to scrutinize remains unclear, but this uncertainty keeps all stakeholders vigilant.

As these sleuths examine older publications, they have revealed that digital image manipulation occurred long before I recognized it at JCB in 2002. For example, a 2023 misconduct investigation into Marc Tessier-Lavigne at Stanford University led to the retraction of a Cell paper from 1999 and two Science papers from 2001 due to image manipulation.

The first commercially available software for detecting image manipulation appeared in 2007 but was not effective. However, in recent years, several companies have developed algorithms for detecting image manipulation. One standout is ImageTwin, the first to screen for duplications against a database of millions of images from previously published articles.

Despite these advancements, no software developers have provided data on how their algorithms compare to visual screening by trained professionals. The algorithms I’ve tested are fairly good at detecting duplications in micrograph images but less effective with blot images. Their ability to detect manipulations like copy/paste, erasure, or splicing is still disappointing. Any algorithmic output must be evaluated by a trained human to confirm findings.

I strongly encourage the use of these algorithms for large-scale image screening. Screening against millions of previously published articles is only feasible algorithmically, and collaborative efforts are underway to enable publishers to screen their submissions at scale against manuscripts submitted to other publishers.

However, algorithms should not be used to evaluate specific allegations of image manipulation. Visual inspection remains the gold standard for these cases. Algorithms have too many false negatives and false positives to be reliable in this context. A negative result from an algorithm does not prove innocence, and a positive result does not prove guilt.
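A rough calculation shows why neither result is conclusive. The sketch below applies Bayes' rule to a hypothetical detector. The sensitivity and specificity figures are invented for illustration and are not measurements of any real tool; the 2% prevalence is the duplication rate cited earlier, used here only as a stand-in prior. The function name `predictive_values` is likewise an assumption.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a binary detector via Bayes' rule.

    PPV: probability an item flagged by the detector is truly manipulated.
    NPV: probability an item cleared by the detector is truly clean.
    """
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    false_pos = (1 - prevalence) * (1 - specificity)
    true_neg = (1 - prevalence) * specificity
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical detector: catches 80% of manipulated articles and wrongly
# flags 5% of clean ones. These numbers are assumptions, not vendor data.
ppv, npv = predictive_values(prevalence=0.02, sensitivity=0.80, specificity=0.95)
print(f"Flagged articles that are truly manipulated: {ppv:.1%}")
print(f"Cleared articles that are truly clean: {npv:.1%}")
```

Under these assumed numbers, only about a quarter of flagged articles would actually be manipulated, while one in five manipulated articles would pass unflagged. Different assumptions give different figures, but the pattern holds whenever the error rates are not negligible, which is why a trained human has to review each case.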

In some public cases, authors have tried to refute allegations of image duplication using algorithmic methods, but the results those methods produced proved incorrect when checked by visual inspection. Leonid Schneider has also reported cases where publishers incorrectly dismissed allegations of image manipulation based on algorithmic screening. The perception that algorithms are superior to human ability is dangerous at this stage of development. False negatives will allow manipulated image data to remain in the published literature, while false positives will lead to incorrect accusations of misconduct.

In 2015, I founded Image Data Integrity, a consulting firm specializing in image manipulation in biomedical research. I primarily work with institutions investigating allegations made on PubPeer. The same types of manipulations continue to appear, mostly in blots but also in micrographs, photographs, and scatter plots. So do the same inconsistencies between loading control and experimental blot image panels (e.g., one spliced, the other not). Such inconsistencies indicate that the two panels came from different blots, which invalidates the loading control.

I also hear the same excuses from authors, the most common being, “Oh, it’s just a loading control.” The biomedical research community must reject this view. Without a valid loading control, a blot experiment cannot be properly interpreted, rendering any conclusions unreliable.

The most significant change in the last 20 years has been the dramatic increase in the number of published articles questioned for image manipulation. This increase, driven by several factors, has occurred despite improvements in Photoshop skills.

The surge in published articles, especially low-quality papers, has been partly fueled by predatory publishers, an unintended consequence of the pay-to-publish open access model pioneered by PLoS. The institutional culture of “publish or perish” has also encouraged paper mills, which fabricate papers with manipulated images, often using artificial intelligence.

A major reason for the rise in scrutiny is the growing awareness of how easy it is to manipulate images, an understanding spurred by the work of sleuths like Bik. Twenty years ago, few stakeholders believed that biomedical research images could be easily manipulated. Today, even first-year PhD students know how to manipulate images with Photoshop, sometimes bragging about it.

In recent years, some PIs have taken a proactive approach to addressing the issue by screening images for manipulation before submitting manuscripts to journals. Several private companies now offer pre-submission image screening, but some authors seek such services only after being questioned about potential image manipulation by reviewers or editors.

Bik recently commented that while awareness of image manipulation has increased, misconduct is likely more prevalent today due to the higher number of submissions and increased pressure on researchers to publish. This is particularly true in China, where the stakes are high for scientists to publish in high-impact Western journals. Indeed, sleuths’ findings suggest that image manipulation is more prevalent in Chinese publications than in those from any other country. However, it’s unclear whether the apparent disparity is due to differences in actual misconduct or disparities in scrutiny.

I believe many sleuths target Chinese publications because of the low standard of image quality in some journals published in China. Regardless, all journals should invest in systematic image screening before publication. Image manipulation occurs globally, and the same manipulations that affect published research in China also affect publications in Europe, North America, and other regions.

Most recently, we’ve seen the use of artificial intelligence in paper mills to produce fake images in biomedical research papers. This is just the latest twist in an ongoing saga, and no end is in sight. Two decades ago, image manipulation was virtually unknown in the biomedical research community. Today, it is widely recognized as a serious issue.

While progress has been made, the battle is far from over. To effectively address image manipulation, all stakeholders must proactively screen images for evidence of manipulation and retain source data indefinitely. The preservation of the scientific record depends on it.

More: https://retractionwatch.com/2024/08/12/whats-in-a-picture-two-decades-of-image-manipulation-awareness-and-action/