    Blue Tech Wave Media

    Google’s DeepMind unveils ‘superhuman’ AI fact-checker, ‘SAFE’

    By Jennifer Yu, March 29, 2024
    • Search-Augmented Factuality Evaluator (SAFE) is a method that uses a large language model (LLM) to break down generated text into individual facts.
    • This “superhuman” AI system can improve the accuracy and cost efficiency of fact-checking.
    • Gary Marcus, a prominent AI researcher, suggested “superhuman” might simply mean better than an underpaid crowd worker, rather than a true expert fact-checker.

    Google DeepMind has unveiled a “superhuman” AI system that can outperform human fact-checkers in assessing the accuracy of information generated by large language models.

    Search-Augmented Factuality Evaluator (SAFE)

    The study, titled “Long-form factuality in large language models”, introduces SAFE, a method that uses large language models to decompose generated text into individual facts and then uses Google Search results to determine the accuracy of each claim.
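
    The two steps described above (split a response into facts, then check each fact against search results) can be illustrated with a minimal sketch. It is not DeepMind’s implementation: the names `split_into_facts`, `rate_response` and `toy_search_supports` are hypothetical, a naive sentence split stands in for the LLM decomposition, and a small in-memory set stands in for Google Search.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FactRating:
    fact: str
    supported: bool


def split_into_facts(response: str) -> List[str]:
    """Stand-in for the LLM decomposition step (assumption).

    SAFE uses a language model here; this sketch only splits on full stops.
    """
    return [part.strip() for part in response.split(".") if part.strip()]


def rate_response(
    response: str, search_supports: Callable[[str], bool]
) -> List[FactRating]:
    """Rate every extracted fact with the supplied evidence checker."""
    return [
        FactRating(fact, search_supports(fact))
        for fact in split_into_facts(response)
    ]


# Hypothetical evidence store standing in for search results.
KNOWN_TRUE = {"Paris is the capital of France"}


def toy_search_supports(fact: str) -> bool:
    """Treat a fact as supported only if it appears in the toy store."""
    return fact in KNOWN_TRUE


ratings = rate_response(
    "Paris is the capital of France. The Moon is made of cheese.",
    toy_search_supports,
)
for rating in ratings:
    print(rating.fact, "->", "supported" if rating.supported else "not supported")
```

    Passing the checker in as a function keeps the decomposition step separate from the evidence step, so either one could be swapped for a real model or a real search call.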

    The researchers pitted SAFE against human annotators on a data set containing around 16,000 facts and found that SAFE’s ratings matched human ratings 72% of the time. Even more impressively, in a random sample of 100 cases where SAFE and the human raters disagreed, SAFE’s judgement was correct 76% of the time.
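
    To make the two percentages concrete, the sketch below shows how an agreement rate and a “win rate on disagreements” could be computed from per-fact labels. The function names and the five toy labels are illustrative assumptions, not data or code from the study; `reference` stands in for whatever ground-truth label is used to adjudicate a disagreement.

```python
from typing import List, Sequence


def agreement_rate(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of facts on which two raters give the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def disagreement_indices(a: Sequence[bool], b: Sequence[bool]) -> List[int]:
    """Positions where the two raters differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def win_rate(
    a: Sequence[bool], b: Sequence[bool], reference: Sequence[bool]
) -> float:
    """Among disagreements, fraction where rater `a` matches the reference."""
    idx = disagreement_indices(a, b)
    if not idx:
        raise ValueError("the raters never disagree")
    return sum(a[i] == reference[i] for i in idx) / len(idx)


# Toy labels, invented purely for illustration (True = fact is supported).
safe_labels = [True, True, False, True, False]
human_labels = [True, False, False, False, False]
reference = [True, True, False, False, False]

print(agreement_rate(safe_labels, human_labels))       # share of matching labels
print(win_rate(safe_labels, human_labels, reference))  # share of disagreements won
```

    The two figures answer different questions: the first measures how often the raters agree at all, while the second only looks at the cases where they differ.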


    ‘Superhuman’ claim stirs controversy

    While researchers claim that large language model agents can achieve “superhuman” rating performance, some experts question what “superhuman” really means here.

    AI researcher Gary Marcus suggests that “superhuman” may simply mean better than an underpaid crowd worker, rather than a true expert fact-checker.

    Marcus argues that benchmarking SAFE against expert human fact-checkers is crucial to truly demonstrate its superhuman performance.

    Advantages of SAFE

    A clear advantage of SAFE is cost – the researchers found that using the AI system was about 20 times cheaper than using human fact-checkers. As the amount of information continues to grow, it is increasingly important to adopt a low-cost, high-return approach.

    The DeepMind team also used SAFE to evaluate the factual accuracy of 13 top language models across four families (Gemini, GPT, Claude, and PaLM-2). They found that larger models typically produce fewer factual errors.

    However, even the best-performing models still produced a large number of false statements.

    This highlights the risk of over-reliance on language models that can fluently express inaccurate information. Automated fact-checking tools like SAFE can play a key role in mitigating these risks.

    Jennifer Yu

    Jennifer Yu is a reporter at BTW Media covering artificial intelligence and products. She graduated from The University of Hong Kong. Send tips to j.yu@btw.media.
