Apple unveils training details of Apple Intelligence models

  • Apple has released a technical paper detailing the models used for Apple Intelligence, emphasising that its training data is responsibly sourced and no private user data is used.
  • It also mentions that Apple filters open source code through licences and uses publicly available datasets to train its AI models, while taking steps to reduce the risk of models outputting bad content.

OUR TAKE
The technical paper describes the training process of the AI models Apple developed for Apple Intelligence. It makes clear that the models are trained on publicly available and licensed datasets and that no private user data was used, underscoring Apple's stated principles of user privacy and responsible AI development.

-Rae Li, BTW reporter

What happened

Apple has released a technical paper detailing the training process for the AI models it developed for Apple Intelligence. In the paper, Apple refutes allegations that it takes an ethically questionable approach to training its AI models, reiterating that it does not use private user data but instead utilises publicly available and licensed data. Apple says its pre-training datasets include licensed data from publishers, filtered publicly available or open source datasets, and publicly available information crawled by its web crawler, Applebot. Additionally, Apple emphasises its protection of user privacy, specifying that the data mix does not contain any private Apple user data.

In further detail, Apple reveals the sources of training data for its Apple Foundation Models (AFM), including publicly available web data and licensed data from undisclosed publishers. Apple also used open source code hosted on GitHub for training, specifically Swift, Python, C, Objective-C, C++, JavaScript, Java and Go code, filtered by licence. To improve the models' mathematical skills, Apple specifically included mathematical questions and answers from web pages, maths forums, blogs, tutorials and workshops. Moreover, Apple acquired additional data, including human feedback and synthetic data, to fine-tune the AFM models and reduce the risk of undesirable behaviour. Apple says its models are designed to help users perform everyday activities on their Apple products while following Apple's core values and principles of responsible AI.

Also read: Apple retail workers score contract after Union push

Also read: Apple commits to AI safety in White House Initiative

Why it’s important 

This paper indicates how Apple develops and trains its AI models while protecting user privacy. Against the current backdrop of growing concerns about data privacy and security, Apple's clear statement that the training data for its AI models does not contain any private user data helps to enhance consumer trust in Apple products.

In addition, Apple’s emphasis on transparency of data sources and principles of responsible AI development sets a positive benchmark for the industry, demonstrating how open and authorised data can be used for technological innovation without infringing on user privacy. Apple’s disclosure of AI model training details provides an important reference for the technology community and regulators. 

Rae Li

Rae Li is an intern reporter at BTW Media covering IT infrastructure and Internet governance. She graduated from the University of Washington in Seattle. Send tips to rae.li@btw.media.
