Blue Tech Wave Media

BREIN delists a language dataset used for AI training

By Lily Yang, August 14, 2024
  • BREIN, a Dutch copyright enforcement organisation, has taken down a language dataset used to train AI models.
  • The Danish Rights Alliance has likewise had copyrighted material removed that was used for AI model training without permission.

OUR TAKE 
Like any other industry, AI development needs clear rules: models should be trained on content that has been lawfully obtained and properly authorised. BREIN’s move highlights the importance of respecting copyright law in AI development and the need for transparency in AI model training.
–Lily Yang, BTW reporter

What happened

A Dutch copyright protection organisation, BREIN, has removed a massive language dataset used for training AI models after it was found to include unauthorised content from books, news websites, and movie and TV show subtitles.

The dataset, which comprised Dutch-language information, was gathered without permission. Bastiaan van Ramshorst, the director of BREIN, stated that it is unclear whether the dataset has been used by AI companies or to what extent.

He noted that the European Union’s AI Act will require AI companies to disclose the datasets used to train their models. BREIN’s action is not the first of its kind: last year, the Danish Rights Alliance asked for a large dataset called “Books3” to be taken down. Similarly, in the US, AI companies such as Microsoft face legal challenges over the use of copyrighted material to train AI models.

Why it’s important  

BREIN’s action, the European Union’s AI Act, and the legal challenges facing Microsoft all highlight the ongoing debate over copyright infringement in the development of AI models.

The removal of the large language dataset by BREIN underscores the importance of respecting intellectual property rights. It is crucial for AI companies to ensure that they have proper authorisation to use data from copyrighted sources to train their models.

The European Union’s impending AI Act and the legal challenges faced by AI companies in the US emphasise the need for transparency and accountability in AI development. As AI continues to advance, it is essential to strike a balance between innovation and copyright protection.

Lily Yang

Lily Yang is an intern reporter at BTW Media covering artificial intelligence. She graduated from Hong Kong Baptist University. Send tips to l.yang@btw.media.
