AI models trained on YouTube videos by Google and OpenAI

AI models trained on YouTube videos by Google and OpenAI is tracked as an internet infrastructure institution within the internet infrastructure ecosystem.

Caption: AI models trained on YouTube videos by Google and OpenAI · Source context: featured article image · Relevance reason: visual context for AI models trained on YouTube videos by Google and OpenAI · Image provenance: BTW media library

Category: Institution

Region: Global

AI models trained on YouTube videos by Google and OpenAI has public-source relevance to network operations, governance, dependency mapping, or market structure.

Signal Focus: Internet infrastructure institution

Content Type: Profile

Primary Domain: Technology

Public-source signals support medium-impact monitoring for infrastructure visibility and dependency analysis.

Topic: Internet infrastructure institution

AI models trained on YouTube videos by Google and OpenAI is profiled by BTW Media because published evidence links it to internet infrastructure, governance, operational dependencies, or market visibility.

Impact: Medium

Confidence | Grade | Meaning
0.90–1.00 | A | High (direct sources)
0.75–0.89 | A/B | Strong
0.55–0.74 | B/C | Medium
0.35–0.54 | C/D | Weak–medium
0.10–0.34 | D | Weak signal
0.00–0.09 | D | Internal monitoring
Limited confidence (76%)

Several public sources

  • OpenAI used its speech recognition tool Whisper to transcribe more than one million hours of YouTube videos to train its AI models; Google also transcribed YouTube videos for AI training.
  • OpenAI’s use of YouTube videos may violate Google’s rules, which prohibit the use of its videos for standalone applications as well as access through automated means.

Both OpenAI and Google have turned to transcribing YouTube videos to further train their AI models, which could infringe on creators’ copyrights. The two tech giants, along with Meta, reportedly cut corners to gather as much data as possible for AI training.

Infringement of creators’ video copyrights

OpenAI used Whisper to transcribe over a million hours of YouTube video, feeding the transcripts into GPT-4, the AI system used for the ChatGPT chatbot. Google, which owns YouTube, also transcribed videos for AI model training.

Transcribing videos in this way could violate the copyrights of the original creators. Other uses of creative content for AI training have already led to copyright and licensing lawsuits.

OpenAI’s use of YouTube videos may also violate Google’s rules prohibiting the use of its videos for “independent” applications and “automated means (such as bots, botnets, or scrapers)” of accessing its videos.

Permitting AI training on public data

Google spokesperson Matt Bryant told The New York Times that the company was unaware of any such use by OpenAI. According to the newspaper, however, some Google employees knew that OpenAI was using YouTube content but chose not to intervene because Google was doing something similar. Google also told the newspaper that it trained AI only on videos whose creators had consented to that use. In July 2023, Google modified its terms of service to permit the use of content that is freely accessible online, such as Google Docs and restaurant ratings on Google Maps, to further train its AI models.

At A Glance

  • Name: AI models trained on YouTube videos by Google and OpenAI
  • Type: Internet infrastructure institution
  • Base: Global
  • Profile focus: Institution

What It Does

  • Public records support monitoring of its role, services, and key relationships.

Why It Matters

  • Public-source signals support medium-impact monitoring for infrastructure visibility and dependency analysis.
  • Operational criticality: Medium
  • Time horizon: Next quarter

What To Watch

  • Monitoring focuses on verified service continuity, governance changes, and relationship signals.

Now: Medium priority

Track verified source updates, role changes, and current public evidence.

Quarter: Medium policy sensitivity

Year: Next quarter outlook

Longer-term relevance depends on verified operating, policy, and relationship changes.
