Google Created An AI That Can Generate Music From Text Descriptions, But Won't Release It

Google's impressive new artificial intelligence system can generate music of any genre from text descriptions. However, fearing the risks, the company has no plans to release it anytime soon.

Google's system, called MusicLM, is certainly not the first generative artificial intelligence system for music. There have been other efforts, including Riffusion, an AI that composes music by visualizing it, as well as Dance Diffusion, Google's own AudioLM, and OpenAI's Jukebox. However, owing to technical limitations and limited training data, none of them could produce songs that were particularly complex or high-fidelity.

MusicLM may be the first to do so.

As detailed in the accompanying research paper, MusicLM was trained on a dataset of 280,000 hours of music to learn to generate coherent songs from descriptions of what its creators call "significant complexity" (e.g. "a beautiful jazz melody with a catchy saxophone solo and vocals" or "90s Berlin techno with deep bass and heavy kicks"). The songs sound surprisingly like something a human might compose, although they're not necessarily inventive or musically cohesive.

It's hard to overstate how good the samples sound, given that there are no musicians or instrumentalists in the loop. Even with long and fairly complex descriptions, MusicLM manages to capture nuances such as instrumental riffs, melodies and moods.

For example, the sample below was generated from a description containing the phrase "makes you lost in space," and it fits perfectly (at least to my ears):

Here's another example, created from a description that begins with the phrase "Main soundtrack for an arcade game." Makes sense, right?

MusicLM's capabilities are not limited to creating short clips of music. Google's researchers show that the system can build on an existing melody, whether it is hummed, sung, whistled or played on an instrument. In addition, MusicLM can take several descriptions written in sequence (like "time to meditate," "time to wake up," "time to run," "time to give 100%") and create a kind of melodic "story" or narrative up to several minutes long, well suited to a movie soundtrack.

See below for a piece generated from the sequence "electronic song played in a video game," "meditation song played by a river," "fire," "fireworks."

That's not all. MusicLM can also be instructed through a combination of a picture and a caption, or generate audio "played" by a specific type of instrument in a specific genre. Even the experience level of the AI "musician" can be set, and the system can create music inspired by a place, an era or a set of requirements (e.g. motivational music for workouts).

But MusicLM is far from perfect. Some of the samples have a distorted quality, an unavoidable side effect of the training process. And while MusicLM can technically generate vocals, including choral harmonies, they leave a lot to be desired. Most of the "lyrics" range from barely English to pure gibberish, sung by synthesized voices that sound like an amalgamation of several artists.

What's more, Google's researchers note the many ethical challenges posed by a system like MusicLM, including its tendency to incorporate copyrighted material from the training data into the songs it generates. During an experiment, they found that about 1% of the music the system produced was directly replicated from the songs it was trained on, a threshold apparently high enough to discourage them from releasing MusicLM in its current state.

"We recognize the potential risks of misusing creative content in the context of use cases," the paper's co-authors wrote. "We emphasize the need for further work to address the risks associated with this musical generation."

Assuming MusicLM or a system like it one day becomes available, serious legal problems seem inevitable, even if such systems are positioned as tools to support artists rather than replace them. Some have already arisen, albeit around simpler AI systems. In 2020, Jay-Z's record label filed copyright strikes against the YouTube channel Vocal Synthesis for using AI to create Jay-Z covers of songs like Billy Joel's "We Didn't Start the Fire." After initially removing the videos, YouTube reinstated them, finding the takedown requests "incomplete." Still, the legal ground for deepfaked music remains murky.

A white paper written by Eric Sunray, now a legal intern at the Music Publishers Association, argues that AI music generators like MusicLM violate music copyright by creating coherent audio out of the works they ingest during training, thereby infringing the reproduction right of US copyright law. Following the release of Jukebox, critics likewise questioned whether training AI models on copyrighted musical material constitutes fair use. Similar concerns have been raised over the training data used in image-, code- and text-generating AI systems, which is often scraped from the web without creators' knowledge.

From a usage perspective, Waxy's Andy Baio speculates that music generated by an AI system would be considered a derivative work, in which case only the original elements would be protected by copyright. Of course, it's unclear what counts as "original" in such music, and using it commercially would be entering uncharted territory. It's a simpler matter when generated music is put to uses protected under fair use, like parody and commentary, but Baio expects that courts will have to rule on a case-by-case basis.

Perhaps there will soon be more clarity on the matter. Several pending lawsuits are likely to have a bearing on music-generating AI, including one concerning the rights of artists whose work was used to train AI systems without their knowledge or consent. Only time will tell.

Google's MusicLM artificial intelligence tool creates music from text descriptions.
