Can robots help us fact-check?
The conventions are over, and presidential debates are nigh.
Vice President Kamala Harris and former President Donald Trump are scheduled to debate each other for the first time on Sept. 10. Presuming it happens – Trump has suggested he might skip it – moderators and independent fact-checkers at news outlets the world over will be looking to hold the candidates accountable for their claims.
In that spirit, we decided to test four AI-powered fact-checking services to see if they’re up to the task of sifting fact from fiction in future debates.
We came in skeptical: AI models are prone to hallucination, meaning they tend to make things up. They’re much better at mimicking writing style than figuring out what’s true and what’s not.
To test our assumptions, we checked out Originality.AI, a paid service that costs about $15 a month, though the kind folks there allowed us to try the product for free. Originality’s focus is plagiarism detection, but the company is expanding into the fact-checking business. It cautions that its tool is in beta but claims better accuracy than models from OpenAI and Meta. We also tested OpenAI’s ChatGPT, Anthropic’s Claude, and the AI search engine Perplexity.
We ran the introductory portion of Trump’s Republican National Convention address through all four tools. “Let me begin this evening by expressing my gratitude to the American people for your outpouring of love and support following the assassination attempt at my rally on Saturday,” Trump said, referencing the events of the previous week. Claude and ChatGPT cautioned that they could not rate claims that were so recent, though ChatGPT offered to search the web and found the statement accurate when it did. Originality mistakenly rated it false, saying that there was no assassination attempt in 2023; its training data stops before 2024. Perplexity performed best, finding the statement accurate and providing information about the attempted assassination in Butler County, PA.
Given those training-data limits, we tested a claim from Trump’s 2020 RNC speech about education policy. “Biden also vowed to oppose school choice and close all charter schools, ripping away the ladder of opportunity for Black and Hispanic children,” Trump said. “In a second term, I will expand charter schools and provide school choice to every family in America.” NPR rated this claim “false,” noting that Joe Biden never campaigned on closing charter schools.
Originality got this one correct. “The claim is false. The sources provided, including the Washington Post and Politico, contradict the claim that Joe Biden vowed to oppose school choice and close all charter schools. In fact, Biden has expressed support for charter schools and has not proposed any plans to close them.” Claude, ChatGPT, and Perplexity all called the claim misleading or false and gave ample explanations why.
Lastly, we tested the four services on a claim Harris made in her recent DNC address that in 2020 “Donald Trump tried to throw away your votes.” Perplexity said it’s an “oversimplification of a complex situation” and explained Trump’s legal challenges to the election results, his pressure campaign on state officials, and his false claims about widespread voter fraud. Claude and ChatGPT made similar determinations. Originality said the claim was false, though it acknowledged that its cited sources indicated Trump was trying to suppress votes, not throw them out.
Jonathan Gillham, Originality’s founder and CEO, said that the company’s foundation model is limited to information from before the end of 2023 but does have access to Retrieval-Augmented Generation, or RAG, which allows models to fetch more current information and process it. But, he said, that process hasn’t yet produced the results he wants in terms of accuracy.
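For the curious, here is a minimal sketch of the RAG pattern Gillham describes: retrieve current passages relevant to a claim, then hand them to a model alongside the claim so its verdict can rest on fresh sources rather than stale training data. The tiny corpus, the word-overlap retriever, and the prompt wording below are all illustrative placeholders we made up for this example, not Originality’s actual pipeline.

```python
# A toy sketch of retrieval-augmented generation (RAG) for fact-checking.
# The corpus, retriever, and prompt are illustrative stand-ins; a real
# system would query a live news index and send the prompt to a model API.

CORPUS = [
    "July 13, 2024: A gunman fired at Donald Trump during a rally in Butler County, Pennsylvania.",
    "Aug. 22, 2024: Kamala Harris accepted the Democratic nomination in Chicago.",
]

def retrieve(claim: str, k: int = 2) -> list[str]:
    """Rank corpus passages by naive word overlap with the claim."""
    words = set(claim.lower().split())
    ranked = sorted(CORPUS, key=lambda p: -len(words & set(p.lower().split())))
    return ranked[:k]

def build_prompt(claim: str) -> str:
    """Pair the claim with retrieved passages so the model can cite
    current sources instead of relying only on its training data."""
    sources = "\n".join(f"- {p}" for p in retrieve(claim))
    return (
        "Using only the sources below, rate this claim as true, false, or "
        f"misleading, and explain why.\n\nClaim: {claim}\n\nSources:\n{sources}"
    )

if __name__ == "__main__":
    # In a real pipeline, this prompt would go to a language model;
    # here we simply print it to show what the model would receive.
    print(build_prompt("There was an assassination attempt at a Trump rally."))
```

The key point of the design is the two stages: retrieval supplies facts newer than the model’s training cutoff, and generation is constrained to reason from those supplied sources.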
Ultimately, our cursory test shows both the possibilities and the limitations of AI fact-checking. These tools are most useful for evaluating claims from the more distant past, where ample public sources clarifying what’s true and what’s not are available for them to draw on. That’s not conducive to real-time fact-checking, and it still depends on (human) professionals having done that work first. Perplexity performed best, however, offering a glimpse of what a responsive, up-to-date AI fact-checking system might look like.
Gillham told GZERO he thinks that in the future, real-time AI fact-checking will be possible. “However, like all AI systems, it will not be perfect and make some number of mistakes.”