What Will Happen With AI Music?
A long time ago, I became interested in the work of a group within Google called Magenta, who were trying to apply machine learning and neural networks to music production. To this end, they created plug-ins for Ableton Live and an open-source hardware instrument called the NSynth Super that you could build yourself according to specifications that Magenta made available online. That was about seven years ago, now. Two years ago, the research team that Magenta was part of merged with DeepMind and presumably became part of Google’s current AI efforts.
The NSynth Super is kind of a fascinating artifact in retrospect. I was very interested in it back then, but I don’t know if I really comprehended the implications of what it was doing. I remember the sounds it made being very fuzzy. They sometimes resembled familiar instruments, but it sounded as if you were hearing them through a kind of audible fog or haze. Turning the dials to try to lock in on a sound I liked reminded me very much of sitting in front of a CRT TV as a child, moving an antenna around trying to pick up a signal that would turn the fuzzy distortion on the screen into entertainment.
Today, I hear that same fuzz in the outputs of AI music platforms like Suno. Along with the robotic quirks of the vocal performances these platforms generate, the fuzziness is one of the most obvious signs that a song has been AI-generated, at least if you know what to listen for. There’s just a lot less of it now. The picture behind the fuzz is a lot clearer, to the point where you kind of forget it’s there if the content behind it is engaging enough.
Suno outputs frequently sound tinny, stilted, bitcrushed, and compressed within an inch of their lives, but so does a lot of the music that people make with DAWs and home recording equipment. When Marley Marl first started loading drum hits from old records he liked into a sampler to program his own beats, the samples had to be of very poor audio quality in order to fit into the sampler's limited memory. He still made classics. When producers abandoned hardware samplers and specialized studio equipment for computers at the turn of the millennium, popular music as a whole started to become palpably cold, rigid,

