    Blue Tech Wave Media
    AI

    OpenAI could announce new AI model to directly take on Google

    By Fiona Huang | May 13, 2024 | 3 Mins Read
    • OpenAI is holding an event on Monday that could bring the announcement of a new multimodal digital assistant.
    • Being multimodal would let the assistant use visual cues as prompts, such as recognising and interpreting a sign outdoors.
    • This poses a direct threat to Google Assistant and the recently released Gemini, Google’s digital assistants.

    OpenAI has been demonstrating to some of its customers a new multimodal AI model that can recognise objects and converse with users, according to a recent report from the news website The Information. The outlet, citing anonymous sources, speculates that this may be a preview of what the company will unveil later today.

    New multimodal AI model

    Multimodal refers to an AI’s ability to process more than just text as input. The rumoured digital assistant would be able to connect to a camera, process data from the outside world, and then respond with additional details about what it observed. For instance, you could point a camera at a sign written in a language other than your own and ask ChatGPT to recognise and translate it. The AI would then converse with you about it.

    If this sounds familiar, it’s because Google Lens, Google Assistant and, most recently, Google Gemini already do this. ChatGPT can do it too, albeit not through a single interface.

    According to reports, the new model can interpret images and audio faster and more accurately than OpenAI’s existing separate transcription and text-to-speech models. The Information says the model can “theoretically” help students with math or translate real-world signs, and that it could assist customer service representatives in “better understanding the intonation of callers’ voices or whether they’re being sarcastic.”

    In other words, it would be a direct competitor to Gemini (and, by extension, Google Assistant and Apple’s Siri).

    The model can “answer some types of questions” better than GPT-4 Turbo, but it can still make confident mistakes, according to the outlet’s sources.

    Speculation on OpenAI

    Developer Ananay Arora shared a screenshot of call-related code, suggesting that OpenAI may also be preparing a new built-in ChatGPT feature. Arora also found evidence that OpenAI had set up servers intended for real-time audio and video chat.

    Additionally, OpenAI CEO Sam Altman stated that the company is not releasing a new AI-powered search engine. However, if The Information’s report is accurate, the announcement may still dampen expectations for Google’s I/O developer conference.

    Fiona Huang

    Fiona Huang is an intern reporter at BTW Media specialising in fintech. She graduated from the University of Southampton. Send tips to f.huang@btw.media.
