Introduction to the Problem
AI tools that cite retracted papers have become a significant concern in the scientific community. According to Yuanxi Fu, an information science researcher at the University of Illinois Urbana-Champaign, using retraction as a quality indicator is crucial when a tool is aimed at the general public. Retracted papers have been "struck off the record of science," and people outside of science should be warned when they encounter one.
The Scope of the Problem
The problem is not limited to ChatGPT. In a test conducted by MIT Technology Review, Elicit, Ai2 ScholarQA, Perplexity, and Consensus all referenced retracted papers in their answers without noting the retractions, with some tools referencing as many as 18.
Efforts to Correct the Issue
Some companies have started taking steps to address the issue. Consensus, for example, has begun using retraction data from a combination of sources, including publishers, data aggregators, and Retraction Watch, a database that manually curates and maintains a list of retractions. As a result, Consensus has significantly reduced the number of retracted papers it cites in its answers. Other companies, such as Elicit, have also started working on aggregating sources of retractions and removing flagged retracted papers from their databases.
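The aggregation approach described above can be illustrated with a minimal Python sketch. It is not how Consensus or Elicit actually implement it. It assumes each source can be reduced to a list of DOIs, and every name and DOI in it (`merge_retraction_sources`, `split_citations`, the `10.1000/example.*` identifiers) is hypothetical.

```python
def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common prefixes so sources can be compared."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def merge_retraction_sources(*sources):
    """Union several iterables of DOIs into one set of retracted DOIs."""
    retracted = set()
    for source in sources:
        retracted.update(normalize_doi(d) for d in source)
    return retracted


def split_citations(cited_dois, retracted):
    """Partition cited DOIs into (kept, flagged) lists, preserving order."""
    kept, flagged = [], []
    for doi in cited_dois:
        (flagged if normalize_doi(doi) in retracted else kept).append(doi)
    return kept, flagged


# Hypothetical example data: made-up DOIs, not real retraction records.
publisher_feed = ["10.1000/example.001"]
aggregator_feed = ["https://doi.org/10.1000/EXAMPLE.002"]
curated_list = ["doi:10.1000/example.001", "10.1000/example.003"]

retracted = merge_retraction_sources(publisher_feed, aggregator_feed, curated_list)
kept, flagged = split_citations(
    ["10.1000/example.002", "10.1000/example.004"], retracted
)
print("kept:", kept)
print("flagged:", flagged)
```

Taking the union of the sources means a paper is flagged if any one source lists it. That choice favors catching retractions over avoiding false alarms, and a real system might weigh sources differently.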
Challenges in Creating a Comprehensive Database
Despite these efforts, creating a comprehensive database of retracted papers is a challenging task. Ivan Oransky, the cofounder of Retraction Watch, notes that creating such a database would require significant resources and would need to be done manually to ensure accuracy. Additionally, publishers do not share a uniform approach to retraction notices, which can make it difficult to identify retracted papers.
Complications with Retraction Notices
The lack of a uniform approach to retraction notices is a significant complication. Publishers may attach different labels to a paper, such as "correction," "expression of concern," "erratum," and "retracted," and not every label means the paper has been retracted. These labels can be added for various reasons, including concerns about the content, methodology, and data or the presence of conflicts of interest. This inconsistency makes it difficult for AI tools to accurately identify and flag retracted papers.
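To make the labeling problem concrete, here is a minimal Python sketch of how a tool might map free-text publisher labels onto a small set of statuses before deciding what to flag. The status names and the keyword table are assumptions made for illustration. They are not taken from any publisher's scheme or any tool's code, and real publisher wording varies far more than this.

```python
from enum import Enum


class NoticeStatus(Enum):
    RETRACTED = "retracted"
    CONCERN = "expression of concern"
    CORRECTED = "corrected"
    UNKNOWN = "unknown"


# Assumed keyword table, checked in order; the first match wins.
_LABEL_KEYWORDS = [
    ("retract", NoticeStatus.RETRACTED),
    ("expression of concern", NoticeStatus.CONCERN),
    ("correction", NoticeStatus.CORRECTED),
    ("erratum", NoticeStatus.CORRECTED),
]


def classify_label(label: str) -> NoticeStatus:
    """Map a publisher's free-text label to a coarse status."""
    # Lowercase and collapse runs of whitespace before matching.
    text = " ".join(label.lower().split())
    for keyword, status in _LABEL_KEYWORDS:
        if keyword in text:
            return status
    return NoticeStatus.UNKNOWN


for label in ["RETRACTED", "Expression of  Concern", "Erratum", "Editor's note"]:
    print(label, "->", classify_label(label).value)
```

Keyword matching of this kind is brittle: a label that mentions both a correction and a retraction is classified by whichever keyword comes first in the table. This is one reason manual curation is still considered necessary for accuracy.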
Conclusion
The issue of retracted papers being referenced by AI tools is a significant concern that requires attention and action from the scientific community and AI developers. While some companies have started taking steps to address the issue, creating a comprehensive database of retracted papers is a challenging task that requires significant resources and manual effort. It is essential to continue working on this issue to ensure that AI tools provide accurate and reliable information to the public.
FAQs
- What is the problem with AI tools referencing retracted papers?
The problem is that retracted papers have been "struck off the record of science," and people outside of science should be warned about these papers. AI tools that reference retracted papers without noting the retractions can provide misleading information to the public.
- Which AI tools have been found to reference retracted papers?
Several AI tools, including ChatGPT, Elicit, Ai2 ScholarQA, Perplexity, and Consensus, have been found to reference retracted papers without noting the retractions.
- What are some companies doing to address the issue?
Some companies, such as Consensus and Elicit, have started using retraction data from various sources and removing flagged retracted papers from their databases.
- Why is creating a comprehensive database of retracted papers challenging?
Creating a comprehensive database of retracted papers is challenging because it requires significant resources and manual effort to ensure accuracy. Additionally, publishers do not share a uniform approach to retraction notices, which can make it difficult to identify retracted papers.
- What can be done to ensure that AI tools provide accurate and reliable information?
To ensure that AI tools provide accurate and reliable information, it is essential to continue working on creating a comprehensive database of retracted papers and to develop more effective methods for identifying and flagging retracted papers.









