The popularity of generative AI software reached an all-time high this year, though music has lagged behind other media like image and text. Nevertheless, two applications released in May and June 2023 marked a major improvement in the technology. It probably comes as no surprise that the companies behind these apps are Google and Meta (formerly Facebook).
One of Google's research teams published a paper in early January 2023, describing a generative AI music app called MusicLM. The paper detailed a product that could turn text prompts into songs. But perhaps more impressively, it could also take in a melody and incorporate that tune into its final output. Some demos in the paper featured humming and whistling, combined with written descriptions of attributes like genre and instrument, to output a song with that tune, in that style.
When Google launched their MusicLM beta app in May 2023, it included the text prompt feature but lacked the option to upload a melodic condition. This was a bit disappointing to those of us who had been eagerly awaiting the experience of turning our musical ideas into the genre of our choice.
Fortunately, just one month later, Meta released their own music generator called MusicGen. As if responding to Google and one-upping them, Meta included the melodic audio input feature that Google omitted from their beta app.
In this article I'll share a quick overview of how MIDI generation fits into the picture, along with tips about how to get started with your own experiments.
To date, even the most high-profile AI MIDI melody generators have been underwhelming. OpenAI decommissioned their MIDI generation app MuseNet in December 2022, right after the launch of ChatGPT. Google offers a DAW plugin suite called Magenta Studio that includes MIDI generation, but it simply doesn't deliver the quality that any of us would have hoped for.
Experimentally minded folks might have some fun using ChatGPT music prompts to generate MIDI melodies. WavTool is a browser app that supports the ability to do this within a DAW, but it takes a great deal of trial and error to create a good melody. In many cases, you could have composed something yourself in a shorter period of time. This comes down to the fact that large language models are not trained on music composition, despite having a solid grasp of music theory concepts.
AudioCipher's text-to-MIDI generation VST is another option you may have already explored. It lets you control key signature, chord extensions, and rhythm automation. However, the plugin does not use artificial intelligence. Users encode words and phrases into the MIDI tracks as a source of creative inspiration. The algorithm draws from a classical tradition practiced by both spies and composers, called musical cryptography.
Suffice it to say, each of these options has pushed the game forward, but none of them has perfected the MIDI song generation experience. Instead of waiting around for AI MIDI generators to get better, I propose using Meta's MusicGen application in combination with an audio-to-MIDI converter. We'll get into that next.
Turning your MIDI melodies into full songs
To get started, create a MIDI melody in your DAW and export it as an audio file. It's best to use a sine wave or a clean instrument without any effects. Once the audio file is ready, upload it to MusicGen and include a text prompt that describes the type of music you want to generate.
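If you would rather skip the DAW export, the same kind of clean sine-wave file can be produced with a short script. The sketch below is only an illustration: the melody, the 32 kHz sample rate, the 10 ms fade and the output filename are all assumptions, and the function names are hypothetical helpers rather than part of MusicGen or any plugin mentioned here.

```python
import math
import struct
import wave

SAMPLE_RATE = 32000  # assumption: any common rate should work for an upload


def midi_to_hz(note: int) -> float:
    """Convert a MIDI note number to frequency (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render_melody(notes, sample_rate=SAMPLE_RATE, amplitude=0.5):
    """Render (midi_note, seconds) pairs as 16-bit mono sine-wave samples."""
    samples = []
    for midi_note, seconds in notes:
        freq = midi_to_hz(midi_note)
        count = int(round(seconds * sample_rate))
        # Short linear fade in/out (about 10 ms) to avoid clicks between notes.
        fade = min(int(0.01 * sample_rate), count // 2)
        for i in range(count):
            gain = 1.0
            if fade and i < fade:
                gain = i / fade
            elif fade and i >= count - fade:
                gain = (count - 1 - i) / fade
            value = amplitude * gain * math.sin(2 * math.pi * freq * i / sample_rate)
            samples.append(int(value * 32767))
    return samples


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write the samples to a mono 16-bit PCM WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    # Hypothetical four-note melody: C4, D4, E4, G4.
    melody = [(60, 0.5), (62, 0.5), (64, 0.5), (67, 1.0)]
    write_wav("melody_sine.wav", render_melody(melody))
```

The resulting file has no effects or harmonics, which matches the advice above about giving MusicGen the cleanest possible version of the tune.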
I've created a video demo (shown above) with AudioCipher's text-to-MIDI melody generator and MusicGen. We created a short MIDI track, exported it as a WAV file and then fed it into the Melody Condition container in Hugging Face. From there, we were able to use text prompts to turn the same tune into 15 different genres of music.
To learn more, see this article on how to use MusicGen for music production, including suggestions on the best prompts to use with the app. I've also included an important tip for managing your Hugging Face account settings, to avoid accidentally racking up a large bill!
Now that you've seen how MusicGen works and may have even created an audio file of your own, the last step is to pass that file back through a polyphonic audio-to-MIDI converter like Samplab 2, Basic Pitch, or Melodyne.
A word of advice: MusicGen produces a lot of noise, so if you have noise reduction software, I recommend running the audio through it before passing it to a MIDI converter. Noise tends to be misinterpreted as tonal content, so cleaning it up will save you time later.
Here are the three best audio-to-MIDI converters that I've found:
Samplab 2 is my favorite option for audio-to-MIDI because it detects and separates instrument layers before transcribing each one into MIDI. MusicGen tends to add drum layers to tracks even when you ask it not to. Samplab will separate those drums out, so you can isolate tonal instruments like piano, guitar and bass. The app is available as a DAW plugin and standalone app, with drag-to-MIDI capabilities.
Basic Pitch is a free alternative to Samplab that was built by Spotify and runs in your browser. It mashes everything together in a single piano roll, so I would only recommend using it for single-instrument audio files. If the track is too complex, Basic Pitch will omit a large part of the music, while simultaneously adding excessive rhythmic articulations due to noise and effect layers.
Melodyne 5 is a high-quality application that supports single-instrument polyphonic MIDI conversion only. It won't separate instruments into their own tracks, but it handles solo piano and guitar very well. You get what you pay for and, to be blunt, Melodyne is expensive. So if you already have Melodyne, go ahead and try it out with this workflow. Otherwise, Samplab is probably your best bet.
There you have it. Once you've converted the MusicGen audio file into MIDI, you can pull it into your DAW and clean things up further in the piano roll. You'll have an expanded arrangement based on the initial MIDI idea, and you can now add your own virtual instruments and sound design to tighten up the quality.
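Some of that piano roll cleanup can be roughed out in code before you touch the DAW. The sketch below assumes the converted notes have already been loaded into a simple list; the `Note` class, the function names and every threshold are hypothetical and would need tuning by ear. It merges rapid re-triggers of the same pitch, then drops notes that are very short or very quiet, which is where noise tends to show up after conversion.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical note record: times in seconds, velocity 0-127."""
    pitch: int
    start: float
    end: float
    velocity: int


def merge_retriggers(notes, max_gap=0.03):
    """Join same-pitch notes separated by a gap of max_gap seconds or less."""
    merged = []
    last_by_pitch = {}
    for note in sorted(notes, key=lambda n: (n.start, n.pitch)):
        prev = last_by_pitch.get(note.pitch)
        if prev is not None and note.start - prev.end <= max_gap:
            # Extend the earlier note instead of keeping a stutter.
            prev.end = max(prev.end, note.end)
            prev.velocity = max(prev.velocity, note.velocity)
        else:
            copy = Note(note.pitch, note.start, note.end, note.velocity)
            merged.append(copy)
            last_by_pitch[note.pitch] = copy
    return merged


def clean_notes(notes, min_duration=0.08, min_velocity=20, max_gap=0.03):
    """Merge re-triggers, then drop notes that are too short or too quiet."""
    return [
        n for n in merge_retriggers(notes, max_gap)
        if (n.end - n.start) >= min_duration and n.velocity >= min_velocity
    ]


if __name__ == "__main__":
    raw = [
        Note(60, 0.00, 0.50, 90),
        Note(60, 0.51, 1.00, 80),  # re-trigger of the same C4
        Note(72, 0.20, 0.22, 70),  # very short blip, likely noise
        Note(64, 1.00, 1.50, 10),  # very quiet, likely noise
    ]
    for note in clean_notes(raw):
        print(note)
```

Merging runs before filtering so that a long note chopped into short fragments is rejoined rather than thrown away. Treat the output as a starting point and check it against the audio.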
This might seem like a lot of information, but the whole process takes about two minutes, from creating an audio file in MusicGen to passing it through an audio-to-MIDI converter. You may need to spend more time fine-tuning your text prompt to get the sound that you're after. MIDI cleanup in the DAW will also require a little work. But hey, it is what it is.
I hope this primer has given you some food for thought and an entry point to deepening your AI music discovery process. These workflows might become obsolete in the coming year as the technology continues to improve. For now, this is one of the best methods I've found for developing a MIDI melody and turning it into a full song with artificial intelligence. Visit our site to find our complete guide to AI music apps in 2023.