Meta has announced the release of five major new AI models and research projects, showcasing advancements in multi-modal systems, language models, music generation, AI speech detection, and diversity improvement in AI systems.
Chameleon: Multi-modal Text and Image Processing
Among the releases is ‘Chameleon,’ a family of multi-modal models that can both understand and generate text and images. Unlike most large language models, which work with a single modality, Chameleon can take any combination of text and images as input and produce any combination of the two as output. Potential applications range from generating creative captions to building new scenes from a mix of text prompts and images.
Multi-token Prediction for Faster Language Model Training
Meta also released pretrained code-completion models trained with ‘multi-token prediction,’ available under a non-commercial research license. Instead of predicting one token at a time, the method trains the model to predict several future tokens at once, which makes language model training more efficient than the traditional next-token approach.
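To make the difference concrete, here is a minimal sketch of how training targets change when a model is asked to predict several upcoming tokens instead of one. It is an illustration only, not Meta’s implementation: the function name `multi_token_targets`, the `n_future` parameter, and the toy token list are all assumptions made for this example.

```python
from typing import List, Tuple


def multi_token_targets(
    tokens: List[str], n_future: int
) -> List[Tuple[List[str], List[str]]]:
    """Pair each prefix with the next `n_future` tokens it should predict.

    Hypothetical helper for illustration; with n_future=1 it reduces to
    ordinary next-token prediction.
    """
    examples = []
    # Stop early enough that every example has a full set of targets.
    for t in range(1, len(tokens) - n_future + 1):
        context = tokens[:t]
        targets = tokens[t:t + n_future]
        examples.append((context, targets))
    return examples


# Toy code-completion sequence (made up for this sketch).
tokens = ["def", "add", "(", "a", ",", "b", ")", ":"]

next_token = multi_token_targets(tokens, n_future=1)
multi_token = multi_token_targets(tokens, n_future=4)

# After seeing only "def", the multi-token setup asks for four tokens at once.
print(multi_token[0])
```

With `n_future=4`, every position supplies four prediction targets rather than one, which is the intuition behind getting more training signal from the same text.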
JASCO: Enhanced Text-to-Music Model
In the realm of creative AI, Meta’s JASCO model generates music clips from text and also accepts inputs such as chords and beats. These extra inputs give users finer control over the result than existing text-to-music models, which rely mainly on text prompts.
AudioSeal: Detecting AI-generated Speech
Meta says its new AudioSeal system is the first audio watermarking technique designed specifically for the localized detection of AI-generated speech, meaning it can pinpoint AI-generated segments within a longer audio clip. According to Meta, detection is up to 485 times faster than previous methods. AudioSeal is released under a commercial license as part of Meta’s effort to prevent the misuse of generative AI tools.
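The idea of finding AI-generated segments inside a longer clip can be sketched as follows. This is not AudioSeal’s API or algorithm: it assumes, purely for illustration, that some detector has already produced one watermark score per audio frame, and the function name `flagged_segments`, the 0.5 threshold, and the sample scores are all hypothetical.

```python
from typing import List, Tuple


def flagged_segments(
    frame_scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Group consecutive frames scoring at or above `threshold`.

    Returns (start_frame, end_frame) pairs, with end_frame exclusive.
    Hypothetical helper; the per-frame scores are assumed inputs.
    """
    segments = []
    start = None  # index where the current flagged run began, if any
    for i, score in enumerate(frame_scores):
        if score >= threshold and start is None:
            start = i  # a flagged run begins
        elif score < threshold and start is not None:
            segments.append((start, i))  # the run ended at the previous frame
            start = None
    if start is not None:
        # The clip ended while a run was still open.
        segments.append((start, len(frame_scores)))
    return segments


# Made-up scores: two stretches of the clip look watermarked.
scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.05, 0.7, 0.9]
segments = flagged_segments(scores)
print(segments)
```

Scoring frame by frame, rather than issuing one verdict for the whole file, is what allows a detector to flag only the affected portion of a clip.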
Improving Text-to-Image Diversity
Another release aims to improve the diversity of text-to-image models by addressing geographical and cultural biases. Meta developed automatic indicators for evaluating geographic disparities and conducted a large-scale study, collecting more than 65,000 annotations, to understand how people in different regions perceive geographic representation. Meta says this work enables more diversity and better representation in AI-generated images.
These advancements come from Meta’s Fundamental AI Research (FAIR) team, which has been advancing AI through open research and collaboration for over a decade. Meta believes that working with the global community is crucial as AI rapidly evolves.
“By publicly sharing this research, we hope to inspire iterations and ultimately help advance AI in a responsible way,” said Meta.
By sharing these models, Meta aims to foster collaboration within the AI community and support the responsible advancement of AI technology.