Meta introduces generative AI model 'CM3leon' for text, images

Meta has unveiled CM3leon, a new generative AI model focused on text-to-image generation, which the company says achieves state-of-the-art performance in the field. The announcement comes at a time when AI-powered image generators have grown popular and increasingly accessible, with many established companies and emerging startups now relying on such models in their day-to-day operations.

According to media reports, Meta expects its new AI model to generate more coherent imagery while interpreting input prompts more faithfully. Today's widely used AI image generators, such as DALL-E 2, Google's Imagen, and Stable Diffusion, rely on a diffusion process for image creation: starting from random noise, these models remove the noise step by step until an image matching the given prompt emerges.
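To make the diffusion idea concrete, here is a deliberately simplified toy sketch. Real diffusion models train a neural network to predict and subtract noise at each step; the `toy_denoise_step` function and the fixed "clean" target below are illustrative assumptions, not any model's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoise_step(noisy, target, strength=0.1):
    """One reverse-diffusion-style step: remove a fraction of the noise.

    In a real model, a trained network estimates the noise; here we
    cheat and nudge the image toward a known clean target instead.
    """
    return noisy + strength * (target - noisy)

target = np.ones((8, 8))                              # stand-in for a "clean" image
image = target + rng.normal(scale=1.0, size=(8, 8))   # start from pure noise

for _ in range(50):                                   # iterative, step-by-step refinement
    image = toy_denoise_step(image, target)

print(float(np.abs(image - target).mean()))           # residual noise, close to zero
```

The loop is the key point: each pass removes only a little noise, which is why diffusion generation takes many sequential steps and, at scale, considerable compute.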

However, the diffusion process is resource-intensive, expensive, and slow. Meta's CM3leon model instead leverages an attention mechanism that weights the parts of the input prompt, whether text or an image, by their relevance. CM3leon is expected to be more efficient as a result, requiring less computational power and a smaller training dataset than comparable models.
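Meta has not published CM3leon's exact architecture in the reporting summarized here, so the following is only a generic sketch of the attention idea the article refers to: scoring each prompt token against a query and mixing their values by those scores. All names and the toy vectors are illustrative assumptions.

```python
import numpy as np

def attention(query, keys, values):
    """Scaled dot-product attention over a handful of prompt tokens."""
    scores = keys @ query / np.sqrt(query.size)  # relevance of each token to the query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax: weights sum to 1
    return weights @ values                      # relevance-weighted mix of token values

# Two toy "prompt tokens": the first matches the query, the second does not.
query  = np.array([1.0, 0.0])
keys   = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
values = np.array([[10.0],
                   [ 0.0]])

out = attention(query, keys, values)
print(out)   # dominated by the first token's value, since it matches the query
```

The mechanism lets a model emphasize the most relevant parts of a prompt in a single forward pass, rather than refining an image over many noise-removal steps.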

To train CM3leon, Meta used a dataset of millions of licensed images, a notable choice given the legal challenges the company has faced over alleged data misuse. And while image generators often struggle with complex objects and with fully understanding prompts, some of the images CM3leon has produced demonstrate its ability to handle intricate compositions.

@adgully

News in the domain of Advertising, Marketing, Media and Business of Entertainment