Any paper that uses the acronym CNN to stand for “convolutional brain organization” probably wasn’t carefully written and revised by human authors. Instead, researchers say, such “tortured acronyms” are likely the work of software that altered earlier wording—“convolutional neural network” in this case—to disguise plagiarism, while neglecting to change the acronym. Now, journals have an automated tool for finding such suspicious mismatches, which often signal a serious problem with the paper.

The group behind the acronym detection, led by University of Toulouse computer scientist Guillaume Cabanac, previously developed a range of automatic misconduct detectors on the publicly available Problematic Paper Screener (PPS). The system automatically scans the scientific literature weekly and flags papers that have tortured phrases—nonsensical paraphrases such as “glucose bigotry” instead of “glucose intolerance”—cell lines that do not exist, and other giveaways that signal potentially grave problems.

Now, the group has added tortured acronyms to its list of red flags and is offering publishers free software to screen submissions for previously unidentified tortured acronyms, the researchers will report next week at the World Conference on Research Integrity in Athens, Greece. They will also announce that the system has uncovered a major trove of thousands of suspicious conference papers containing tortured phrases, spanning 11 different publishers.

The PPS is “some of the most significant work” in the growing field of misconduct detection, says Ludo Waltman, who studies scholarly publishing and research assessment at Leiden University. Anyone can download the list of known tortured phrases from the PPS, although Cabanac says he knows of only a few publishers screening submissions against it. But Kim Eggleton, who supervises peer review and research integrity at IOP Publishing (IOPP), is one of those who do—and adding tortured acronyms to the surveillance will make it more powerful, she says. Her staff spots acronyms that do not match the phrases they stand for “reasonably often. … A way to automate that process would be amazing.” But detecting deception is only half the battle. To date, the PPS has flagged more than 15,000 papers with tortured phrases; only 2760 have been retracted.

The effort has been ongoing since 2021, when Cabanac and his collaborators launched the PPS and began to identify tortured phrases. Built up manually and laboriously, the list of phrases now exceeds 5000 known “fingerprints” of potential misconduct.
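In principle, screening a manuscript against that downloadable fingerprint list can be as simple as case-insensitive substring matching. The sketch below is illustrative only, not the PPS's actual code; the function name and the tiny sample list are assumptions:

```python
def flag_tortured_phrases(text: str, fingerprints: list[str]) -> list[str]:
    """Return every known tortured-phrase fingerprint found in a text.
    (Illustrative sketch; the real PPS list has more than 5000 entries.)"""
    lowered = text.lower()
    return [fp for fp in fingerprints if fp in lowered]

# Two fingerprints mentioned in this article, lowercased for matching:
sample_fingerprints = ["glucose bigotry", "convolutional brain organization"]

hits = flag_tortured_phrases(
    "Patients showed glucose bigotry after treatment.", sample_fingerprints)
# hits == ["glucose bigotry"]
```

Even this naive approach is enough for a publisher to triage submissions, since any hit warrants a closer human look.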

To expand the PPS’s functionality to acronyms, Alexandre Clausse, a Ph.D. student at the University of Toulouse, sampled 75 papers the PPS had already flagged. He used the acronyms in these papers—both tortured and not—to build software that can automatically detect additional suspicious acronyms, based on the giveaway of mismatched initials. Testing the software’s accuracy yielded a list of 185 new acronym fingerprints to add to the PPS; the tool itself is what publishers can now use to detect previously unidentified tortured acronyms in paper submissions.
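The mismatched-initials giveaway can be captured with a simple heuristic: each letter of the acronym should begin the corresponding word of the expansion. This is a minimal sketch under that assumption, not Clausse's actual software; the function name and the stopword handling are invented for illustration:

```python
import re

def initials_match(acronym: str, expansion: str) -> bool:
    """Heuristic: does each acronym letter start the corresponding
    word of the expansion? Small connector words are skipped.
    (Illustrative only; real detection is more involved.)"""
    stopwords = {"of", "and", "the", "for", "in", "a", "an"}
    words = [w for w in re.findall(r"[a-z]+", expansion.lower())
             if w not in stopwords]
    letters = acronym.lower()
    if len(letters) != len(words):
        return False
    return all(word.startswith(ch) for ch, word in zip(letters, words))

# The intact phrase matches; the tortured rewrite no longer does:
assert initials_match("CNN", "convolutional neural network")
assert not initials_match("CNN", "convolutional brain organization")
```

A flag is raised only when an acronym fails to match the phrase it accompanies, which is exactly the fingerprint a paraphrasing tool leaves behind when it rewrites the words but not the abbreviation.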

One rich source of “tortured articles” is conference proceedings—potentially because the review process for these is often run separately from a publisher’s normal review process, Eggleton says. In earlier work, Cabanac and his colleagues flagged hundreds of conference papers published by the Institute of Electrical and Electronics Engineers for containing tortured phrases. Now, they are reporting similar problems in conference proceedings from 10 other publishers, including IOPP.

IOPP’s experience illustrates how tortured language can expose deeper problems. In 2022, IOPP retracted nearly 500 papers from conference proceedings after the PPS flagged tortured phrases in the papers. When Eggleton and her team investigated, they found reams of other problems—fake identities, citation cartels in which researchers insert irrelevant references to one another, and even entirely fabricated research. “The tortured phrase is what makes you look in the first place,” she says. “It’s always an indicator that something somewhere is not quite right.” She suspects the hundreds of conference papers were all generated by a paper mill—an organization that sells authorship on fake papers to researchers desperate to boost their list of publications.

Because of the problems with conference proceedings, IOPP has changed how conferences are handled. The society publisher has invested heavily in new technology and expanded the team that oversees its conference proceedings. All its conference content is now subjected to a range of screening processes, including for tortured phrases. “Our rates of misconduct dropped off a cliff at that point,” Eggleton says. But as bad actors learn what publishers screen for and change their tactics, including better-disguised tortured acronyms, “it feels like we’re always slightly one step behind.”

The problems being detected by the PPS are a signal of broader troubles in academic publishing, Waltman says. Researchers working in “publish or perish” cultures are incentivized to churn out as many papers as possible, pushing some to resort to unethical tactics such as buying publications from paper mills. And open-access publishers that charge authors to publish their work also have an incentive to publish as much as possible, which could lead to laxer quality control. Although screening at publishers is helpful, the arms race between screening methods and tricksters means it is unlikely to stem the tide of problematic papers entirely. The only way to truly solve the problem is to fix the incentive systems, Waltman says: “We need to tackle this problem at the source.”

More: https://www.science.org/content/article/software-detects-tortured-acronyms-in-research-papers