Meta, Facebook's parent company, on Thursday unveiled a new generative AI tool called "AudioCraft" that can generate music and audio from text prompts.
According to the company's release, AudioCraft comprises three models: MusicGen, AudioGen, and EnCodec.
MusicGen is trained on licensed music owned by Meta and generates music from text prompts.
AudioGen is trained on "public sound effects" and generates sounds from text prompts. The EnCodec decoder, which was released yesterday, allows for high-quality music generation with "fewer artifacts".
The company is releasing its pre-trained AudioGen models, which can generate environmental sounds as well as sound effects such as a dog barking, cars honking, or footsteps on a wooden floor. It will also release all of the AudioCraft model weights and code.
Meta will open-source all of these models, giving researchers and practitioners access so they can train their own models and create new datasets, contributing to the advancement of AI-generated audio and music.
In the release, Meta shares, "While we’ve seen a lot of excitement around generative AI for images, video, and text, audio has seemed to lag a bit behind. There’s some work out there, but it’s highly complicated and not very open, so people aren’t able to readily play with it."
The company adds that it hopes the AudioCraft "family of models" can serve as tools for musicians and sound designers "to provide inspiration, help people quickly brainstorm and iterate on their compositions in new ways."
Meanwhile, Meta is still working to retain users on its new text-based app, Threads, which lost half of its users within weeks of its July 5 launch. Reportedly, the company is focusing on AI-powered chatbots to increase engagement on the platform.