Preparing the audio

In this lesson you will learn

which audio file to choose,
how to prepare it for Ultrastar purposes,
how to align the off vocal instrumental version, if there is one,
how to create a vocal only version from the normal and off vocal ones.

Audio is the basis for vocal track adaptation. Before you start the mapping process, you should already have a trimmed high-quality audio file.

Which version of the audio file should I use?

The most important thing is to use a high-quality audio file, at least 128 kbit/s. Poor audio quality will butcher the song.

Be especially wary of ripping audio from unofficial videos, e.g. on YouTube. Not only may the audio have been altered to avoid detection by copyright protection (e.g. sped up or slightly pitch-shifted), it may also be of poor quality. Nominally it will be 128 kbit/s, but you will clearly hear the difference between that version and the one that was originally released.
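As a quick sanity check of the 128 kbit/s threshold, the average bitrate can be estimated from the file size and the song length. This is only a rough sketch: the function name is made up for illustration, and the estimate ignores tags and embedded cover art.

```python
def average_bitrate_kbps(file_size_bytes, duration_seconds):
    """Estimate the average bitrate in kbit/s from file size and duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    bits = file_size_bytes * 8
    return bits / duration_seconds / 1000


# A 3,840,000-byte file that lasts 4 minutes averages 128 kbit/s.
example_kbps = average_bitrate_kbps(3_840_000, 240)
print(example_kbps)
```

Keep in mind that a sufficient bitrate only rules out the worst files. A re-encoded rip can report 128 kbit/s and still sound clearly worse than the original release.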

Similarly, ripping the song directly from a movie (if it was part of the soundtrack) or from a concert video should be treated strictly as a last resort.

Simply try to obtain the original mp3 file.

What should I do once I have the file?

Open the file in an audio editor, e.g. Audacity (it is free). We will do 2 things:

trim excess silence from the beginning and end,
normalize the volume.

It can look e.g. like this:

Normalize volume

It's best to always do this step, even if the audio looks OK.

Select the whole track and go to menu Effect -> Amplify.

Do not modify anything, just click OK.

If the audio has really inconsistent volume, redo this step with custom amplification.

It may happen that your audio volume is not consistent - there is some small part with an effect that is very loud. As a result you may get something like this after the first amplification.

In such a case, run Amplify again:

By default Audacity will suggest 0. You will need to play a bit with the value, but let's bring the volume up a bit, in our case e.g. by 3.0:

The loud part got a bit clipped, but the rest of the audio is now at a decent volume level.

In general you should not need to do this. Usually automatic amplification gives the desired result.

Do the manual amplification only if you have some really loud effect that appears from time to time and because of it the actual song is too quiet.

Never clip the singable parts of the song!
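To make the two amplification passes more concrete, here is a minimal sketch of the arithmetic on a list of samples in the range -1.0 to 1.0. It is only an illustration of the idea, not Audacity's actual implementation, and the function names are made up for this example.

```python
def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the loudest one reaches target_peak (1.0 = full scale)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def amplify_db(samples, gain_db):
    """Apply a fixed gain in decibels, hard-clipping anything beyond full scale."""
    gain = 10 ** (gain_db / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# A quiet song (values around 0.05-0.1) with one very loud effect (0.5).
quiet_song_with_loud_effect = [0.05, -0.1, 0.5, 0.08]

# First pass: the loud effect reaches full scale, the song itself stays quiet.
normalized = peak_normalize(quiet_song_with_loud_effect)

# Second pass: 3 dB more; the effect clips, the rest gets louder.
boosted = amplify_db(normalized, 3.0)
```

The first pass is limited by the single loudest sample, which is why one loud effect can keep the whole song quiet. The second pass trades a clipped effect for a louder song, so it is only acceptable when the clipped part is not singable.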

Trim the silence from the end

Let's see the end of the song:

In this case there are 8 seconds of silence, which is too much - Ultrastar waits for the audio to finish before showing scores. Let's trim it:

Leave at least 1 second after the last note finishes.

It happens very rarely, but your song may end with a singable syllable, with no instrumental outro after it. In such a case try to leave about 1 second of audio after the last syllable finishes (pad with silence if necessary). Otherwise the player may feel that the game exited too soon and that some note got cut.
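The trimming rule above can be sketched in code as follows. This is a simplified illustration: it treats the last sample above a small threshold as the end of the song, which is an assumption (in Audacity you judge the end of the last note by eye and ear), and the function name and threshold value are made up.

```python
def trim_trailing_silence(samples, sample_rate, threshold=0.01, tail_seconds=1.0):
    """Keep the audio up to the last audible sample plus tail_seconds.

    If less than tail_seconds of audio follows the last audible sample,
    the result is padded with silence to reach that length.
    """
    last_loud = -1
    for i in range(len(samples) - 1, -1, -1):
        if abs(samples[i]) > threshold:
            last_loud = i
            break
    if last_loud < 0:
        return []  # nothing audible at all

    keep = last_loud + 1 + int(tail_seconds * sample_rate)
    trimmed = list(samples[:keep])
    trimmed.extend([0.0] * (keep - len(trimmed)))  # pad if the file was too short
    return trimmed
```

For example, with a sample rate of 10 samples per second, 2 seconds of sound followed by 8 seconds of silence is cut down to 3 seconds in total.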

Trim the silence from the beginning

Let's now check the beginning of the song:

In this case we have about 5 seconds of silence before the song starts which is too much.

Before we remove anything, let's check whether the song begins with an instrumental intro or goes straight into singable vocals.

In the case of this song there is an instrumental intro, so we can cut out almost all of the silence:

If your song starts straight away with a vocal part, remember to leave about 2 seconds of silence at the beginning to give the player time to realize that they are supposed to start singing right away.
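The decision above can be summarized in a short sketch. The threshold and the 0.2 second margin kept before an instrumental intro are assumptions chosen for illustration; only the 2 second lead-in for songs that start with vocals comes from the text.

```python
def trim_leading_silence(samples, sample_rate, starts_with_vocals,
                         threshold=0.01, vocal_lead_in_seconds=2.0,
                         intro_margin_seconds=0.2):
    """Replace the leading silence with a lead-in of a suitable length.

    Songs that start with vocals get about 2 seconds of silence,
    songs with an instrumental intro keep only a small margin.
    """
    first_loud = next(
        (i for i, s in enumerate(samples) if abs(s) > threshold), None
    )
    if first_loud is None:
        return list(samples)  # nothing audible, leave the track alone

    lead_in = vocal_lead_in_seconds if starts_with_vocals else intro_margin_seconds
    silence = [0.0] * round(lead_in * sample_rate)
    return silence + list(samples[first_loud:])
```

With 5 seconds of silence before the first sound, the instrumental case ends up with 0.2 seconds of silence and the vocal case with 2 seconds.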

Export the audio

Finally, export the project as an mp3 file. You can use the standard quality setting for a good balance between quality and file size.

Since you already have the mp3 file loaded in Audacity, you can use it to locate the start of the first syllable - the GAP value.
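The GAP value in the TXT file is given in milliseconds, so the position you read in Audacity has to be converted. Below is a small sketch of that conversion; the function names are made up, and the default sample rate of 44100 Hz is an assumption about your file.

```python
def gap_from_seconds(first_syllable_seconds):
    """Convert a position in seconds to a GAP value in milliseconds."""
    return round(first_syllable_seconds * 1000)


def gap_from_sample(sample_index, sample_rate=44100):
    """Convert a sample position to a GAP value in milliseconds."""
    return round(sample_index * 1000 / sample_rate)


# First syllable starts at 13.25 s -> GAP of 13250 ms.
print(gap_from_seconds(13.25))
```

Remember that the position must be measured in the trimmed, exported file, because trimming the beginning shifts everything that follows.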

Off vocal audio

If you want to sing true karaoke, you can try to find the original off vocal version of the song. Sometimes it is released together with the normal version on the same album. It is very important to obtain an original file. Any other version will have slightly different timing and you won't be able to just reuse your TXT file.

To use off vocal audio, just finish the song with normal audio first. Once it's done, you have to align the off vocal version to the normal version.

Aligning off vocal and normal versions

To align both versions, open the off vocal version in Audacity. Let's first normalize it as usual - amplify and trim silence from the beginning and end.

Once you are done, import the normal version as the second track (you can simply drag & drop the other file):

Zoom in to the beginning of the song. You can mute the lower track temporarily to gray it out, so that you remember to modify only the upper track with the off vocal version:

Now you have to align both tracks. Do it only by modifying the upper track!

Remember that the TXT you have prepared is aligned to the lower track, so you can't modify that track at this point.

Repeat the steps:

roughly align the 2 tracks,
zoom in.

Zoom in and align again:

Try to align the files as closely as possible, using some characteristic part for reference:

Once you are done, unmute the lower track, start playback and verify that everything is in sync. Adjust if necessary.

Now delete the lower track and export the upper one as mp3.
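If you would like a starting point for the manual alignment, the offset between the two tracks can be estimated with a brute-force cross-correlation. This is a sketch of an alternative technique, not something Audacity does in the steps above; the function name is made up, and it is only practical on short excerpts (e.g. a characteristic fragment from the intro) because it compares every shift sample by sample.

```python
def best_offset(reference, other, max_shift):
    """Return how many samples `other` must be delayed to match `reference`.

    A positive result means `other` should be moved later (to the right),
    a negative one means it should be moved earlier (to the left).
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i - shift  # index in `other` that lands on position i after shifting
            if 0 <= j < len(other):
                score += ref_sample * other[j]
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# The normal version (lower track) is the reference and is never moved.
normal = [0, 0, 0, 1, 2, 1, 0, 0]
off_vocal = [0, 1, 2, 1, 0, 0, 0, 0]
shift_for_off_vocal = best_offset(normal, off_vocal, max_shift=3)
```

Dividing the resulting shift by the sample rate gives the offset in seconds by which to move the upper track. Always verify the result by ear, as described above.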

Using both versions

You now have 2 audio versions, which you can place in the same song directory. To differentiate them, add e.g. an [OFF] suffix to the instrumental version.

The sad thing is that no Ultrastar distribution seems to support 2 audio versions for one song. So you can either:

change the value of the #MP3 tag in the txt file to the version that you want to sing this time,
copy the txt file, add e.g. [OFF] to the filename and change the #MP3 tag value to the off version of the audio file (you can also add [OFF] to the song's title so that you can easily find such versions later),
duplicate the whole song directory and simply replace the mp3 file in one of them (this will take more disk space due to the duplicated video).

Choose the approach you like best.
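The second approach (a copy of the txt file that points to the off vocal audio) can be automated with a few lines of code. This is a minimal sketch that works on the text of the file; the function name and the example file names are hypothetical.

```python
def make_off_vocal_txt(txt_content, off_vocal_filename, title_suffix=" [OFF]"):
    """Return a copy of the song txt that uses the off vocal audio file.

    The #MP3 tag is pointed at off_vocal_filename and title_suffix is
    appended to the #TITLE tag. All other lines are left untouched.
    """
    result = []
    for line in txt_content.splitlines():
        upper = line.upper()
        if upper.startswith("#MP3:"):
            line = "#MP3:" + off_vocal_filename
        elif upper.startswith("#TITLE:"):
            line = line + title_suffix
        result.append(line)
    return "\n".join(result) + "\n"


original = (
    "#TITLE:Song\n"
    "#ARTIST:Band\n"
    "#MP3:Band - Song.mp3\n"
    "#GAP:1000\n"
    ": 0 4 60 La\n"
    "E\n"
)
off_version = make_off_vocal_txt(original, "Band - Song [OFF].mp3")
print(off_version)
```

Save the result under a new filename, e.g. with [OFF] added, next to the original txt file. When reading and writing real files, use the same text encoding as the original file.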

Creating vocal only version

If you are lucky enough to have the original off vocal version, then once you have aligned the two versions as closely as possible (in this case it's best to align them perfectly) you can easily create a third, vocal only version.

Load both into Audacity and select the track with the off-vocal version.

Go to Effect -> Invert.

Now when you start playback you should hear mostly vocals, with the instrumentals removed.

If the result is not good, it means that you either did not use the original off-vocal version or that you need to align the tracks better.

If you are happy with the results, select both tracks and go to menu Tracks -> Mix -> Mix and render.

You now have one mixed track, which is exactly aligned with the other 2. Export it to a separate mp3 file. You can use it while setting pitches, just as we did in the tutorial.
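The reason why inverting and mixing works can be shown with a few numbers. Inverting a track flips the sign of every sample, so mixing the normal version with the inverted off vocal version amounts to a sample-by-sample subtraction. The sketch below assumes perfectly aligned tracks and an identical instrumental part in both; the function name is made up.

```python
def extract_vocals(normal, off_vocal):
    """Mix the normal track with the inverted off vocal track.

    Whatever is identical in both tracks (the instrumental) cancels out,
    and what remains is the difference (ideally the vocals).
    """
    length = min(len(normal), len(off_vocal))
    return [normal[i] + (-off_vocal[i]) for i in range(length)]


instrumental = [0.25, -0.5, 0.125]
vocals = [0.5, 0.25, -0.25]
normal_version = [i + v for i, v in zip(instrumental, vocals)]

recovered = extract_vocals(normal_version, instrumental)
print(recovered)
```

If the tracks are shifted by even a few samples, or the instrumental differs between the two files, the samples no longer cancel and you will hear leftover instrumentals, which matches the troubleshooting advice above.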

What if there is no official off vocal version?

You can try to use an AI tool that extracts the vocal and instrumental tracks from the file. There are several available. The upside of this approach is that the resulting files will most likely already be exactly aligned with the original one, so you won't need to align them manually, which is tedious.

To learn more refer to:

Setting pitches
Enhancing vocal in mp3 file

