Tips on producing songs with Udio

[Check out this post on my personal page, where it looks better]

Some months ago a revolutionary AI tool came out: Udio. It allows you to produce professional-sounding songs. Although I know how to play the guitar, as a systems builder I’ve always been more interested in putting songs together than in learning to play an instrument. I also rarely enjoy interacting with people, so dealing with human musicians was out of the question. Udio has allowed me to come up with about seventy-five songs, so at this point I think I’m qualified to give tips on this subject.

I only start thinking about the musical side of things when I have the lyrics ready. They tend to change very little during production: mostly to make them sound better or rhyme, if the opportunity arises. I also add little touches like laughs, comments, and vocalizations like “aah,” “yeah,” and such, which tend to make the song sound more natural.

As far as I’m concerned, the lyrics don’t need to be elaborate. I mostly focus on sentences that transmit a particular emotion. I admire complex, very carefully written lyrics like Joanna Newsom’s, but they wouldn’t work for the kind of songs I’ve wanted to make so far.

Once the lyrics seem ready, I pinpoint the stanza that will determine the general style of the entire song. It’s usually the chorus (I don’t write multi-chorus songs, so that’s easier for me to determine), or at least the part of the song that I most need to nail for the result to fit my mental image. Udio uses structural tags to help the AI determine your intention: [hook], [chorus], [verse], [bridge], and such. I don’t think I have ever started a song with a segment that wasn’t a [hook] or a [chorus].

Apart from structural tags, Udio’s AI was trained on loads of “mood” tags. I have collected as many as I could, which is an ongoing process, and I have relied on ChatGPT to classify them. For example, under “musical qualities” and “abstract” I have the following to choose from: “cryptic, complex, existential, dense, glitch, abstract, generative music, improvisation, mashup, eclectic, lobit, microtonal, minimalistic, sampling, silence, sparse, tone poem, uncommon time signatures.” All of these tags are functional: each one steers the generation in the direction its name suggests.
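
If you’d rather keep a tag collection like this in a script than in a document, here is a minimal sketch of how it could be organized. The tags are the ones listed above; the nesting, the names `TAG_LIBRARY`, `pick_tags`, and `build_prompt`, and the idea of drawing a random handful are all assumptions made for illustration. Nothing here talks to Udio: it only builds a string you could paste into the prompt box.

```python
import random

# Hypothetical layout: category -> subcategory -> list of tags.
# Only the tags themselves come from the post; the structure is assumed.
TAG_LIBRARY = {
    "musical qualities": {
        "abstract": [
            "cryptic", "complex", "existential", "dense", "glitch", "abstract",
            "generative music", "improvisation", "mashup", "eclectic", "lobit",
            "microtonal", "minimalistic", "sampling", "silence", "sparse",
            "tone poem", "uncommon time signatures",
        ],
    },
}


def pick_tags(category, subcategory, count, seed=None):
    """Return `count` distinct tags from one subcategory.

    Passing the same `seed` returns the same selection every time.
    """
    tags = TAG_LIBRARY[category][subcategory]
    rng = random.Random(seed)
    return rng.sample(tags, count)


def build_prompt(tags):
    """Join tags into one comma-separated string."""
    return ", ".join(tags)


if __name__ == "__main__":
    chosen = pick_tags("musical qualities", "abstract", 3, seed=7)
    print(build_prompt(chosen))
```

The seed here only makes the random tag selection repeatable; it has nothing to do with the seed Udio uses for its generations.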

I go through all these mood tags and, using the same seed for every generation, I produce a few clips to get a feel for what I’d like the final song to sound like. More often than not, I don’t know what general genre the song will fall into. I base my choices on what my subconscious likes; it’s an “I’ll know it when I see it” situation.

Once I’ve determined the mood of that particular segment, I go through the collection of instrument clips that I have painstakingly amassed from YouTube videos. Some time ago, I read through online lists of all the instruments in the world, then determined which had matching tags in Udio. While pre-producing a song, I listen to each of those instruments one by one and let my subconscious decide if it would fit any of the stanzas. It’s a slow process that usually takes about two hours, but it pays off in the end: the songs I have come up with would have been far less interesting otherwise.

Once I’m happy with the distribution of instruments, I go through a massive collection of genres, plenty of them bizarre (like psychobilly and cowpunk, two of my newly discovered favorites), and ask Udio to generate loads of clips. If the style of an initial generation impresses me, I tag its name with its genre. If any of the generations is good enough that I would have gladly produced a whole song out of it, I mark it as “[name of song], Pt. 1 candidate.” If I end up with more than one candidate and want to keep only one, I pick the best, then remix it by layering on top of it the other genres whose generations had impressed me. That’s how I ended up with a mix of dance punk, surf rock, and Cajun in “Paleontology of Pain.”

The best source I’ve found to learn more about genres is the fantastic site musicmap.info. You can zoom in on every supergenre, figure out how most genres relate to others, and listen to songs in those genres.

Once I’ve determined the best seed generation, always 33 seconds long, the real fun starts: I extend that segment in both directions to render the rest of the lyrics. I keep prompt strength at 70% (forcing Udio to mostly obey my prompt, but giving it some room for improvisation), lyrics strength at 35% (it sounds more natural, allowing the singer to repeat some words or hallucinate as Udio sees fit), and generation quality obviously at ultra.
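
Since these values stay the same from song to song, it can help to write them down in one place. Below is a minimal sketch of such a record in Python. The class and field names are hypothetical and do not correspond to any Udio API; the sliders still have to be set by hand in the interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionSettings:
    """A personal note of slider values. Field names are made up for this sketch."""

    prompt_strength: float = 0.70   # 70%: mostly obey the prompt
    lyrics_strength: float = 0.35   # 35%: let the singer improvise a little
    generation_quality: str = "ultra"
    seed_clip_seconds: int = 33     # length of the initial generation

    def __post_init__(self):
        # Strengths are stored as fractions, so they must lie in [0, 1].
        for name in ("prompt_strength", "lyrics_strength"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")

    def summary(self):
        """One-line reminder of the settings."""
        return (
            f"prompt {self.prompt_strength:.0%}, "
            f"lyrics {self.lyrics_strength:.0%}, "
            f"quality {self.generation_quality}"
        )


if __name__ == "__main__":
    print(ExtensionSettings().summary())
```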

The context length is extremely important: the AI will only rely on what you allow it to see when deciding how to style the new generation, so don’t include in the context a part of the song that you wouldn’t want to “tint” the extension you’re working on.
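
To reason about which parts of a song will “tint” an extension, you can work out what falls inside the context window before generating. The sketch below assumes the context is the stretch of audio directly adjacent to the point where the new generation attaches, which is a simplification of however Udio actually handles it. The function name and the segment timings are invented for the example.

```python
def segments_in_context(segments, extend_from, context_seconds, direction="forward"):
    """Return the names of segments that overlap the context window.

    segments: list of (name, start, end) tuples, in seconds.
    extend_from: position in the song where the new generation attaches.
    direction: "forward" appends after extend_from, so the context is the
               audio just before it; "backward" prepends before extend_from,
               so the context is the audio just after it.
    """
    if direction == "forward":
        window_start, window_end = extend_from - context_seconds, extend_from
    elif direction == "backward":
        window_start, window_end = extend_from, extend_from + context_seconds
    else:
        raise ValueError("direction must be 'forward' or 'backward'")
    # Two intervals overlap when each one starts before the other ends.
    return [
        name
        for name, start, end in segments
        if start < window_end and end > window_start
    ]


if __name__ == "__main__":
    # Hypothetical song layout.
    song = [("intro", 0, 20), ("chorus", 20, 53), ("verse", 53, 86)]
    print(segments_in_context(song, extend_from=86, context_seconds=30))
    print(segments_in_context(song, extend_from=86, context_seconds=40))
```

With the shorter context only the verse is visible; with the longer one the tail of the chorus is included too, so its style would also influence the extension.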

Along the way, you may love some generation except for a few seconds where the singer blurted out gibberish, some instrument could have sounded better, etc. That’s where inpainting comes in: it patches over those parts without altering the rest of the song. Note, though: inpainting in general sounds worse than full generations, particularly the drums. No idea if that’s something that the team behind Udio will be able to improve, so if you can trim the part of the song you would have inpainted and request a full generation instead, do that.

When I’m happy with the full song, I download its WAV file and open it in Audacity. Udio often screws up the sound levels, so I mess with them in Audacity until I’m happy with how the entire song sounds. Sometimes I screw it up myself and have to “remaster” the song because I have inadvertently produced clicks, which was particularly noticeable in the version I uploaded of “Synaptic Flies.” Editing a song easily takes an hour, or up to an hour and a half.
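
For the simplest kind of level fix, raising or lowering the whole file so its loudest sample sits just under the maximum, a script can stand in for Audacity. This is a minimal sketch of peak normalization using only Python’s standard library. It assumes the download is a 16-bit PCM WAV, which may not be true of every file, and it is no substitute for adjusting sections by ear. All function names are illustrative.

```python
import array
import sys
import wave


def gain_for_peak(samples, target_peak=0.9, full_scale=32767):
    """Gain that brings the loudest sample to target_peak of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return 1.0  # silence: nothing to scale
    return target_peak * full_scale / peak


def apply_gain(samples, gain, full_scale=32767):
    """Scale samples and clamp them to the 16-bit range so nothing wraps around."""
    return [max(-full_scale - 1, min(full_scale, round(s * gain))) for s in samples]


def normalize_wav(src_path, dst_path, target_peak=0.9):
    """Peak-normalize a 16-bit PCM WAV file; other sample widths are rejected."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h", src.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    scaled = array.array("h", apply_gain(samples, gain_for_peak(samples, target_peak)))
    if sys.byteorder == "big":
        scaled.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(scaled.tobytes())
```

Keeping the target peak below full scale leaves a little headroom, and the clamp in `apply_gain` guards against samples overflowing the 16-bit range, which is one way a level change can introduce audible distortion.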

That’s about it. You can check out my albums here. I have two of them ready, and in a few days I’ll upload the third volume of Odes to My Triceratops. I hope you have learned something from my obsessive attention to detail, in case you’re into this bizarre business of putting together AI-generated music. And if you read this far even though you weren’t interested, don’t you have better things to do with your time?
Published on June 18, 2024 04:29 Tags: advice, ai, art, artificial-intelligence, lyrics, music, songs