5 of Fatih Porikli’s most important thoughts on Gen AI

  • Fatih Porikli, an IEEE Fellow and the Global Lead of AI Systems at Qualcomm AI Research, recently spoke on The TWIML AI Podcast about his thoughts on generative AI and traditional computer vision topics.
  • Researchers are working to improve optical flow algorithms, with techniques such as speculative decoding and self-cleaning inversion.
  • Rising use of stereo imaging in XR headsets and autonomous vehicles drives the need for efficient compression techniques. Innovations like parallel hypercoding reduce redundancy while ensuring minimal latency in stereo imaging applications.

OUR TAKE
As demands on AI have skyrocketed, answering text questions alone no longer satisfies users’ needs. Newer AI models are therefore being built with a wider range of functions, including analysing mathematical plots.
–Audrey Huang, BTW reporter

Fatih Porikli, an IEEE Fellow and the Global Lead of AI Systems at Qualcomm AI Research, recently spoke on The TWIML AI Podcast about his thoughts on generative AI and traditional computer vision topics. Five ideas from the conversation stand out.

1. Multimodal model advancements

The discussions highlighted significant advancements in multimodal models, particularly those integrating language and image processing. These models aim to interpret complex data, such as mathematical plots, by leveraging information from multiple modalities. This represents a crucial step towards developing AI systems capable of understanding diverse types of inputs and performing complex reasoning tasks.

2. Optical flow optimisation

Researchers are actively working on enhancing optical flow algorithms, which are essential for tasks like video compression and motion analysis. Techniques such as speculative decoding and self-cleaning inversion aim to improve the accuracy and efficiency of optical flow, enabling real-time processing on devices like mobile phones. These advancements address the increasing demand for high-quality video processing across various applications.
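
For readers unfamiliar with optical flow, the core idea can be shown with a minimal sketch. It is not one of the techniques discussed on the podcast. It is a toy, one-dimensional version of the classic brightness-constancy least-squares estimate, and the function name `estimate_shift_1d` and the sine-wave test frames are assumptions made for illustration only.

```python
import math


def estimate_shift_1d(frame_a, frame_b):
    """Least-squares estimate of one small shift between two 1-D signals.

    Assumes brightness constancy: I_x * u + I_t = 0 at every sample,
    solved for a single shared displacement u.
    """
    num = 0.0
    den = 0.0
    for x in range(1, len(frame_a) - 1):
        i_x = (frame_a[x + 1] - frame_a[x - 1]) / 2.0  # spatial gradient
        i_t = frame_b[x] - frame_a[x]                  # temporal difference
        num += i_x * i_t
        den += i_x * i_x
    if den == 0.0:
        return 0.0  # flat signal: motion cannot be observed
    return -num / den


# Hypothetical test data: the second frame is the first moved right by 0.5 px.
true_shift = 0.5
frame_a = [math.sin(0.2 * x) for x in range(100)]
frame_b = [math.sin(0.2 * (x - true_shift)) for x in range(100)]

estimated = estimate_shift_1d(frame_a, frame_b)
print(f"estimated shift: {estimated:.3f} pixels")
```

Real optical flow estimators work on 2-D images and recover a separate motion vector per pixel, which is where most of the computational cost on mobile devices comes from.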

3. Efficient compression techniques for stereo imaging

With the rising use of stereo imaging in devices like XR headsets and autonomous vehicles, efficient compression of stereo streams is becoming crucial. Novel approaches like parallel hypercoding and bidirectional shift modules enable stereo-aware compression, reducing redundancy and achieving significant bitrate savings while minimising latency. These techniques pave the way for more effective data transmission and storage in stereo imaging applications.
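
To illustrate why stereo streams contain redundancy, here is a minimal sketch of disparity-compensated prediction on a single image row. It is a textbook-style toy, not parallel hypercoding or the bidirectional shift module mentioned above, and the names `best_disparity` and `encode_right_row` as well as the synthetic scene are assumptions made for illustration.

```python
def best_disparity(left, right, max_disparity):
    """Return the integer shift d that best aligns left[x + d] with right[x]."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        overlap = len(right) - d
        cost = sum(abs(right[x] - left[x + d]) for x in range(overlap)) / overlap
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def encode_right_row(left, right, max_disparity=8):
    """Predict the right row from the shifted left row and keep only the residual."""
    d = best_disparity(left, right, max_disparity)
    # Samples with no counterpart in the left row get a prediction of 0.
    prediction = [left[x + d] if x + d < len(left) else 0 for x in range(len(right))]
    residual = [r - p for r, p in zip(right, prediction)]
    return d, prediction, residual


# Hypothetical scene: both cameras see the same row, offset by 3 samples.
scene = [(x * 37) % 101 for x in range(70)]
left_row = scene[0:64]
right_row = scene[3:67]

disparity, prediction, residual = encode_right_row(left_row, right_row)
reconstructed = [p + r for p, r in zip(prediction, residual)]

raw_energy = sum(abs(v) for v in right_row)
residual_energy = sum(abs(v) for v in residual)
print(f"disparity: {disparity}, raw energy: {raw_energy}, residual energy: {residual_energy}")
```

Because most of the right view can be predicted from the left, only a small residual needs to be stored or transmitted, which is the kind of redundancy stereo-aware codecs exploit.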

4. On-device AI demos

Demonstrations showcased practical applications of AI on mobile devices, ranging from portrait relighting and avatar generation to AI assistants with AR face recognition. These demos highlight the potential for on-device AI to enhance user experiences across various domains, including photography, communication, and augmented reality. By running AI algorithms directly on mobile devices, users can access advanced functionalities without relying on cloud-based processing, leading to faster and more seamless interactions.

5. Insights from workshops

The workshops on Efficient Large Vision Models and Omnidirectional Computer Vision provided valuable insights into emerging trends and challenges in vision model development. They emphasised the importance of efficient deployment of large models on edge devices and addressed unique considerations for processing omnidirectional imagery. These workshops serve as platforms for collaboration and knowledge sharing among researchers and industry professionals, driving advancements in vision model research and application.

Audrey Huang

Audrey Huang is an intern news reporter at Blue Tech Wave. She is interested in AI and startup stories. Send tips to a.huang@btw.media.
