
The Evolution of MIDI Software and Hardware in 2023


The MIDI Association hosted a live roundtable discussion last week, featuring 2022 MIDI Award winners Krishna Chetan (Pitch Innovations), Henrik Langer (Instruments of Things), Markus Ruh (MusiKraken), and John Dingley (Digigurdy). Among the winning products you'll find the Somi-1 MIDI controller, a motion-sensor wearable that converts users' body movements into MIDI data. MusiKraken's MIDI controller construction kit similarly tracks your hands, face, voice and device rotations.

The warm reception toward these mixed reality music products underscores a greater trend towards immersion, novelty, and disruption that's persisted into 2023. 

The MIDI medium is the message

Changes to the way we create music can have an impact on the types of music we produce.

Media theorist Marshall McLuhan famously said that the medium is the message, meaning that our tools reveal something about our collective mindset toward the problems at hand. Conventional MIDI keyboard designs are inherited from the classical piano, for example, and with them comes the assumption that musicians should press buttons and keys to make music. Next-generation controllers like MusiKraken and Somi-1 reimagine the controller landscape by turning human bodies into expressive instruments.

Streaming platforms like Spotify and Apple Music offer a second example of unconscious bias that comes baked into our technology. Artists are required to release songs with a fixed duration, arranged into singles, EPs and albums. This model for music is inherited from legacy formats in the recording industry. As a result, these platforms exclude modern formats like adaptive music and may be limiting our pace of musical innovation.

YouTube differs from other music streaming platforms in its support of continuous music in a 24/7 streaming format. Extreme AI artists Dadabots took advantage of this opportunity by publishing a number of infinite music videos, like the infinite metal example shown below. WarpSound offers adaptive AI music experiences on YouTube, empowering fans to cast votes and impact the music during its YouTube livestream. These kinds of experiments are only possible because the medium supports them.

Toward more immersive music making experiences 

MIDI software is often experienced through an LCD screen, but that could soon change with the rising popularity of virtual and mixed reality hardware.

Earlier this month, Spatial Labs announced their upcoming AR DAW called Light Field. It's not available commercially but you can watch a demo of their prototype below. Like a laser keyboard, the interface is projected onto a hard surface and users can interact with UI elements to sequence beats, chop samples, and more.

Virtual reality music games and DAWs are another domain where music creation has evolved. Experiences like Virtuoso VR, LyraVR, Instrument Studio VR, SoundStage, SYNTHSPACE and Electronauts have the power to change our ideas about what a digital audio workstation should be and how we should interact with it.


Artificial intelligence has had a major impact on the creative arts this year. The popularity of text-to-image generators has coincided with a parallel trend in MIDI software. AudioCipher published its third version of a text-to-MIDI generator this year. The app turns words into melodies and chord progressions based on parameters like key signature, chord extensions, and rhythm automation. You can watch a demo below.
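To make the idea of turning words into notes concrete, here is a toy sketch in Python. It is not AudioCipher's algorithm, which is not described here; the letter-to-scale-degree mapping, the function name text_to_melody and the choice of a major scale rooted on MIDI note 60 are all assumptions made purely for illustration.

```python
# Hypothetical letter cipher for illustration only; NOT AudioCipher's method.
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitone offsets of a major scale


def text_to_melody(text, root=60):
    """Map each ASCII letter to a MIDI note number in the major scale on `root`."""
    notes = []
    for ch in text.lower():
        if not (ch.isascii() and ch.isalpha()):
            continue  # skip spaces, digits, punctuation and non-ASCII letters
        index = ord(ch) - ord("a")         # a=0 ... z=25
        octave, degree = divmod(index, 7)  # every 7 letters climbs one octave
        notes.append(root + 12 * octave + MAJOR_SCALE_STEPS[degree])
    return notes


print(text_to_melody("midi"))
```

A real generator would layer key selection, rhythm and chord extensions on top of a mapping like this, which is where parameters such as the ones mentioned above come in.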

The text-to-music trend has continued to gain traction this year. Riffusion paved the way for text-to-song in December 2022, with Google's MusicLM following suit in May 2023. Riffusion and MusicLM don't compose in MIDI. They generate low-fidelity audio clips replete with sonic artifacts, but they're nevertheless a step forward.

Most people hear AI Music and think of AI voice generators, due to the recent popularity of AI songs that imitate mainstream artists. An AI Drake song called Heart on my Sleeve reached more than 20,000,000 streams in April and May. Universal Music Group has made a public statement denouncing this practice.

Earlier today, rapper Ice Cube made a public statement calling AI music demonic and threatening to sue anyone who used his voice. Meanwhile, other artists like Grimes and Holly Herndon have sought to come up with models for consensual licensing of their voices.

So far, there has been very little discussion over the tens of millions of music clips used by Google to train MusicLM. As the owner of YouTube, Google has the right to train on the clips in its database. Many of these songs are protected by copyright and were uploaded by everyday users without the original artist's consent.

This intro to Google's AI music datasets outlines their training architecture in more detail and addresses some of the ethical concerns at play, as more companies seek to train on instrumental tracks to build their AI models. 


Digital audio workflows can have a steep learning curve for beginners, but artificial intelligence may soon remove that barrier to entry.

WavTool, a browser-based AI DAW, comes equipped with a GPT-4 assistant that takes actions on any element in the workstation. Users can summon the chatbot and ask it to add audio effects, build wavetables, and even compose MIDI ideas. A full demo of the software is shown below.

The AI assistant understands prescriptive commands like "add a new MIDI instrument track with a square wave".

Vague requests like "write a catchy melody" yield less satisfying results. In many instances, a prompt like that will generate a major scale that repeats twice. Rich descriptions like "write a syncopated melody comprised of quarter, eighth, and sixteenth notes in the key of C minor" deliver marginally better results.

The AI text-to-MIDI problem could eventually be solved by AI agents, a special class of text generation that breaks down an initial goal into a series of subtasks. During my own experiments with AutoGPT music, I found that the AI agent could reason its way through the necessary steps of composing, including quality-assurance checks along the way.

For an AI agent to actually be useful in this context, someone would need to develop the middleware to translate those logical steps into MIDI. WavTool is positioned to make these updates, but it would require a well-trained MIDI composition model that even the biggest tech teams at OpenAI's MuseNet and Google's Magenta Suite have not achieved to a satisfactory degree. 
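As a rough sketch of what the simplest layer of that middleware might look like, the Python below turns a list of note steps into timed MIDI note-on and note-off messages. Everything here is an assumption for illustration: the (note name, beats) step format, the function names, the 480 ticks-per-beat resolution, and the convention that C4 is MIDI note 60 (some manufacturers label that note C3). It says nothing about how WavTool or AutoGPT work internally.

```python
# Hypothetical middleware sketch: agent-style note steps -> MIDI events.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_number(name):
    """Convert 'C4', 'Eb4' or 'F#3' to a MIDI note number (assumes C4 = 60)."""
    letter, rest = name[0].upper(), name[1:]
    accidental = 0
    if rest.startswith("#"):
        accidental, rest = 1, rest[1:]
    elif rest.startswith("b"):
        accidental, rest = -1, rest[1:]
    return 12 * (int(rest) + 1) + NOTE_OFFSETS[letter] + accidental


def steps_to_events(steps, ticks_per_beat=480, velocity=96, channel=0):
    """Turn (note_name, beats) steps into (tick, status, note, velocity) tuples."""
    events, tick = [], 0
    for name, beats in steps:
        length = int(round(beats * ticks_per_beat))
        if name is not None:  # a step with name None is treated as a rest
            note = note_number(name)
            events.append((tick, 0x90 | channel, note, velocity))    # note on
            events.append((tick + length, 0x80 | channel, note, 0))  # note off
        tick += length
    return events


# An eighth note, a sixteenth, a sixteenth rest, then a quarter note in C minor.
melody = [("C4", 0.5), ("Eb4", 0.25), (None, 0.25), ("G4", 1.0)]
for event in steps_to_events(melody):
    print(event)
```

The harder part, deciding which notes and rhythms make a good melody in the first place, is exactly what the composition model would have to supply.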


For years, Melodyne has been the gold standard for monophonic audio-to-MIDI transcription. In June 2022, a free Spotify AI tool called Basic Pitch went live, delivering polyphonic audio-to-MIDI within a web browser.

A second company called SampLab has since delivered its own plugin and desktop app this year, with more features than Basic Pitch. Both tools have pushed updates to their code as recently as this month, indicating that improvements to polyphonic MIDI transcription will be ongoing through 2023.

Suffice it to say that MIDI has remained highly relevant in the era of artificial intelligence. With so many innovations taking place, we're excited to see who comes out on top in this year's MIDI Innovation Awards!
