Author: Bal Marsius

Bal was BTW's copywriter specialising in tech and productivity tools. He has experience working in startups, mid-size tech companies, and non-profits.

Interest in generative AI models has surged, driven by advancements in natural language processing and image generation. Meta, a prominent player in the AI research domain, has introduced CM3leon, a cutting-edge multimodal model. Multimodal means the AI is capable of both text-to-image and image-to-text generation.

CM3leon’s approach adapts a training recipe from text-only language models: a large-scale retrieval-augmented pre-training stage followed by a multitask supervised fine-tuning stage.

Better Performance in Image Generation

Despite being trained with five times fewer computational resources than previous transformer-based methods, CM3leon achieves state-of-the-art performance in text-to-image generation. Notably, it exhibits the versatility of autoregressive models while maintaining low training costs and efficient inference.

This tokenization-based model goes beyond conventional text-to-image approaches. It can generate complex sequences of text and images conditioned on arbitrary content. Unlike other specialized image generation models, CM3leon’s large-scale multitask instruction tuning significantly enhances performance across various vision-language tasks, such as image caption generation and visual question answering.

Ethical Image Data Sourcing

Meta announced that it takes an ethical approach to image data sourcing, using only licensed images from Shutterstock to avoid issues related to ownership and attribution. This socially responsible methodology sets CM3leon apart from its competitors.

On a widely used benchmark, CM3leon achieves an FID score of 4.88, outperforming Google’s Parti model and setting a new standard for text-to-image generation.
FID stands for Fréchet Inception Distance; lower values are better, and 0.0 is a perfect score.

CM3leon exhibits an ability to generate intricate compositional objects, evident in examples like a potted cactus donning sunglasses and a hat. …