Saturday, 3 September 2016

Investigating Loudness Normalisation in Spotify and YouTube

Lately I have been trying to bus-compress my tracks to get them as loud as commercially released songs. This is quite frustrating, as the amount of limiting required is bad for the sound. It got me thinking about the loudness war (or loudness race), so I had a look at the loudness normalisation systems in Spotify and YouTube to see just how loud I really need to make my tracks... It's actually very interesting, so I'll post what I found here.

The Loudness Race

Loudness is used specifically and precisely for the listener’s perception. Loudness is much more difficult to represent in a metering system. (…) Two pieces of music that measure the same on a flat level meter can have drastically different loudness (Katz, 2007, pg. 66).

According to Katz (2007, pg. 168), when two identical sounds are played back at slightly different levels, the louder version tends to sound ‘better’, although this is only a short-term phenomenon. This has led engineers to compete for loudness to attract attention in the jukebox or on the radio, where the playback level is fixed, which is achieved through the use of compression and limiting on the master bus. Unfortunately, this can lead to fatiguing and eventually unpleasant-sounding records, or in the extreme case, audible clipping, as in the famous example of Metallica’s “Death Magnetic” album (Michaels, 2008). The loudness race has also been present in television (with advertisements competing for audience attention) and radio (with stations competing for listeners) (Robjohns, 2014).

Loudness Normalisation

The first attempt to regulate loudness was the International Telecommunication Union’s (ITU) standard ITU-R BS.1770, first published in 2006, which recommends a loudness metering algorithm to act as “electronic ears” to control the playback level of television advertisements. The algorithm uses a 400ms integration time and a two-stage filter to simulate the frequency response of the head and ears, and produces an average reading for the entire duration of the track (see figure 1). A gate is used to filter out quieter sections so that they do not affect the reading (ITU, 2015).
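To make those steps more concrete, here is a rough Python sketch of a BS.1770-style integrated loudness measurement as I understand it. The filter coefficients are the published 48 kHz values from the standard; the block size, overlap and gating follow my reading of it, so treat this as an illustration rather than a reference implementation.

import numpy as np
from scipy.signal import lfilter

def k_weight(x):
    # Stage 1: high-shelf boost modelling the acoustic effect of the head (48 kHz coefficients).
    b1 = [1.53512485958697, -2.69169618940638, 1.19839281085285]
    a1 = [1.0, -1.69065929318241, 0.73248077421585]
    # Stage 2: high-pass filter removing low frequencies we are less sensitive to.
    b2 = [1.0, -2.0, 1.0]
    a2 = [1.0, -1.99004745483398, 0.99007225036621]
    return lfilter(b2, a2, lfilter(b1, a1, x))

def integrated_loudness(channels, fs=48000):
    # channels: list of 1-D numpy float arrays (e.g. [left, right]), values in -1..1.
    weighted = [k_weight(ch) for ch in channels]
    block = int(0.400 * fs)   # 400 ms measurement blocks...
    hop = int(0.100 * fs)     # ...overlapping by 75%
    block_loudness = []
    for start in range(0, len(weighted[0]) - block + 1, hop):
        # Mean square of each K-weighted channel, summed (L/R channel weights are 1.0).
        z = sum(np.mean(ch[start:start + block] ** 2) for ch in weighted)
        block_loudness.append(-0.691 + 10 * np.log10(z + 1e-12))
    lb = np.array(block_loudness)
    lb = lb[lb > -70.0]  # absolute gate at -70 LUFS
    # Relative gate: drop blocks more than 10 LU below the absolute-gated average.
    rel_threshold = -0.691 + 10 * np.log10(np.mean(10 ** ((lb + 0.691) / 10))) - 10.0
    lb = lb[lb > rel_threshold]
    return -0.691 + 10 * np.log10(np.mean(10 ** ((lb + 0.691) / 10)))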

The ITU standard has been put to use in television, so that users do not need to reach for the volume control when changing channel or in between programs. Once the loudness of a program has been measured, a static level change can be implemented if it is too loud or too quiet, in order to reach a specific ‘target loudness level’, which for television is recommended to be -23 LUFS (loudness units relative to full scale) (Robjohns, 2014).
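The correction itself is simple arithmetic: the static gain is just the target loudness minus the measured loudness, applied once to the whole programme. A minimal sketch (the function names are mine, purely for illustration):

def normalisation_gain_db(measured_lufs, target_lufs=-23.0):
    # e.g. a programme measuring -18 LUFS gets -5 dB of gain to land on -23 LUFS;
    # one measuring -26 LUFS gets +3 dB.
    return target_lufs - measured_lufs

def apply_static_gain(samples, gain_db):
    # samples: a numpy float array. A single static gain preserves the programme's
    # internal dynamics, unlike compression or limiting.
    return samples * (10.0 ** (gain_db / 20.0))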

Katz (2007, pg. 172) predicts an end to the loudness race as consumer listening moves on from the compact disc to iTunes, whose “Sound Check” feature allows the loudness of audio tracks to be normalised when in shuffle mode, removing uncomfortable jumps in loudness when listening to music from different eras or genres. This processing is also applied to iTunes Radio. According to Robjohns (2014), iTunes normalises to a reading of around -16 LUFS using an algorithm that appears to be broadly similar to ITU-R BS.1770, although the details of Apple’s method are not publicly available. Robjohns suggests that this target loudness may be too low, as some devices restrict the maximum listening level due to loudness concerns, meaning users may not be able to compensate for the reduction in level.

Other music streaming services have also started adopting loudness normalisation. For example, Spotify has the option to “Set the same volume level for all tracks”, which is switched on by default; however, according to Shepherd (2015c), the reference level is too high, and to keep a consistent level Spotify adds extra limiting which can pump and crunch in an unpleasant way. YouTube has also implemented a volume-matching system for music which, according to the findings of Shepherd (2015b), does not use ITU-R BS.1770, as the readings fluctuate between around -12 LUFS and -14 LUFS.

Figure 1. An outline of the ITU-R BS.1770 algorithm.

Procedure

To measure how YouTube and Spotify normalise loudness, I used a program called Soundflower to route the output of my MacBook into Logic X and back out again, where I could load iZotope’s “Insight” plugin for metering, which helpfully has a preset for measuring loudness using the ITU standard. Also shown are true peak values, which indicate how much each track has been turned down.
Figure 2. Screen grab of the measurement process
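As a cross-check (and for anyone without Insight), a comparable BS.1770-style integrated reading can be taken offline in Python using the open-source pyloudnorm library, assuming you have a WAV capture of the playback; the filename below is just a placeholder.

import numpy as np
import soundfile as sf
import pyloudnorm as pyln

data, rate = sf.read("capture.wav")         # placeholder filename for a recorded capture
meter = pyln.Meter(rate)                    # BS.1770-style meter (K-weighting + gating)
loudness = meter.integrated_loudness(data)  # integrated loudness in LUFS

# Sample peak as a rough stand-in for true peak (true peak needs oversampling,
# which Insight does but this quick check does not).
peak_dbfs = 20 * np.log10(np.max(np.abs(data)))
print(f"Integrated loudness: {loudness:.1f} LUFS, sample peak: {peak_dbfs:.1f} dBFS")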


SPOTIFY 

Track | Loudness | Peak Level | Notes
Closer by The Chainsmokers | -12 LUFS | -2 dBFS | Electronic/Pop (2016)
Didn’t Mind by Kent Jones | -10 LUFS | 0 dBFS | Electronic/Pop (2016)
Final Song by Mø | -12 LUFS | 0 dBFS | Electronic/Pop (2016)
Sunshine (Radio Edit) | -10 LUFS | -1.1 dBFS | Electronic/Pop (2016)
That Was Just Your Life by Metallica | -12 LUFS | -6 dBFS | From “Death Magnetic”
Clair de lune by Debussy | -17 LUFS | 0 dBFS | Classical piano
Spotify advert | -6 LUFS | +1 dBFS |
Don’t Let Me Down - Zomboy Remix | -12 LUFS | -3 dBFS | Dubstep

Spotify’s system appears to be working, keeping everything between -10 and -12 LUFS bar a few exceptions. The first five tracks were taken from a “New Music” playlist and were released this year. It is interesting to see that two of these tracks peaked at 0 dBFS, implying that Spotify did not turn them down at all: they were mastered to hit this target loudness. Assuming their masters also originally peaked at or near 0 dBFS, “Closer” and “Sunshine” were only turned down by 2 dB and 1.1 dB respectively, meaning they were not significantly louder. The Metallica track was taken from the notorious “Death Magnetic” album, whose over-compression brought the issues of the loudness race into public awareness. This track needed to be reduced by 6 dB in order to hit the loudness target.

To really test out the system, I also played a piece of quiet piano music (Clair de lune by Debussy), which gave a reading of -17 LUFS. This falls below the desired loudness, but in practice it didn’t sound out of place; you kind of expect piano music to be a bit quieter. This seems to go against Shepherd (2015c), who claims that Spotify applies its own limiting to dynamic music. Either this is not the case, or certain types of music are exempt. It could be that classical music has its own loudness target that is lower than for pop genres, due to the lack of bus compression used in classical music.

What let the system down was the exemption of adverts from the normalisation: going from Debussy to Spotify’s advertisement for an electronic music playlist was a jump of about 11 LU, which was very unpleasant.

YOUTUBE 

Track | Loudness | Peak Level | Notes
This Is What You Came For by Calvin Harris | -14 LUFS | -4 dBFS | Electronic/Pop (2016)
Closer by The Chainsmokers | -15 LUFS | -4 dBFS | Electronic/Pop (2016)
YouTube advertisement for Sky TV | -12 LUFS | -4 dBFS | Electronic/Pop (2016)
Heathens by twenty one pilots | -13.5 LUFS | -3 dBFS | Indie/Pop (2016)
The Day That Never Comes by Metallica | -14 LUFS | -4.5 dBFS | From “Death Magnetic”
YouTube advert | -13 LUFS | -4 dBFS |
Rachmaninov - Piano Concerto No. 2 | -17 LUFS | 0 dBFS | Classical piano with orchestra

YouTube appears to aim for a lower loudness level than Spotify. Music on YouTube seems to fall in the range between -13 and -15 LUFS, which broadly agrees with the findings of Shepherd (2015b), who quotes between -12 and -14 LUFS. The advantage of choosing a lower target is that there is a smaller difference when you switch from pop music to uncompressed classical music. YouTube advertisements are louder, at around -12 to -13 LUFS, but the difference is not as large as on Spotify. This gives a much better listening experience.

Conclusion

The findings show that Spotify’s system aims for a loudness of between -10 and -12 LUFS. Limiting a track any further than this would be pointless, as it would just be turned down. The Metallica track had very audible clipping when played side by side with other tracks at a similar loudness. For release on YouTube, it would be better to aim for a level of between -13 and -15 LUFS to maximise the dynamic range in the track. For release on both platforms, I feel it would be best to aim for Spotify’s playback level, otherwise the track may sound quiet compared with other tracks, or be subject to Spotify’s automatic limiter, if there is such a thing.

It is hoped that these systems will reduce the commercial pressure on mastering engineers to over-compress music at the expense of sound quality. Radio still has no published loudness standard (Robjohns, 2014), and adopting one would be an important step towards finally removing the demand for hyper-compression in mastering.

References

ITU (2015). BS.1770: Algorithms to measure audio programme loudness and true-peak audio level. [online] Itu.int. Available at: https://www.itu.int/rec/R-REC-BS.1770/en [Accessed 17 July 2016].

Katz, B. (2007). Mastering Audio: The Art and the Science. Amsterdam: Elsevier/Focal Press.

Michaels, S. (2008). Metallica album sounds better on Guitar Hero videogame. [online] The Guardian. Available at: https://www.theguardian.com/music/2008/sep/17/metallica.guitar.hero.loudness.war [Accessed 17 July 2016].

Robjohns, H. (2014). The End Of The Loudness War?. [online] Soundonsound.com. Available at: http://www.soundonsound.com/techniques/end-loudness-war [Accessed 17 July 2016].

Shepherd, I. (2015a). YouTube loudness normalisation - The Good, The Questions and The Problem. [online] Production Advice. Available at: http://productionadvice.co.uk/youtube-loudness-normalisation-details/ [Accessed 17 July 2016].

Shepherd, I. (2015b). YouTube just put the final nail in the Loudness War's coffin. [online] Production Advice. Available at: http://productionadvice.co.uk/youtube-loudness/ [Accessed 17 July 2016].

Shepherd, I. (2015c). Why Spotify's 'set the same volume level for all tracks' option is back, and why it matters. [online] Production Advice. Available at: http://productionadvice.co.uk/spotify-same-volume-setting/ [Accessed 17 July 2016].

Sunday, 21 August 2016

New video is out


Finally finished the mix for this song. The full version will be on the album/portfolio and features a more complex orchestration, with drums, guitar, bass, piano, cello, oboe and flute, which have now been edited and are waiting to be mixed. However, this video is part of the promotional strategy: posting acoustic versions on YouTube of the tracks that will be available for purchase in full later on. This also means that, when I put out a compilation of ‘best of’ acoustic performances, I’ll essentially be able to sell the same song twice.

The video was finished a while ago, but I had trouble with the mix because:
1) the piano was muddy and had a lot of noise on it (pedal and chair squeaks)
2) the vocals were very loud on all the piano tracks, so pitch correction was very hard to do without the sound getting phasy.

The first problem was fixed by using iZotope’s De-crackler to get rid of (some of) the noises, and by a lot of EQ-ing, making up for the cut frequencies with the Waves Exciter.

The second problem was a bit more difficult to tackle, and, funnily enough, where both Auto-Tune and Melodyne (!) failed, Logic X’s Pitch Correction managed to fix the pitchy notes without being obvious. I would never have expected it, but it goes to show that some plugins can serve certain purposes better than others because of their different algorithms.

Friday, 12 August 2016

Band Recording

Today I recorded a band: the first full-band recording I’ve done at UH! I was a bit nervous, but going to Studio 1 and setting everything up yesterday saved a lot of hassle today. First of all, I had time to think over the placement of the band and to spend time positioning the mics. I recorded everything together, except for the vocals, which were overdubbed later.

The problems started early in the day, when the signal from one of the overheads was not coming through. I switched the cables, tried a different input, etc., with no result. I was afraid the mic was broken, but when I swapped the two overheads, the other one didn’t work either. It turned out that some of the desk inputs weren’t working. Later on, when I couldn’t get the DIs to work, I realised what the problem was: the phantom power must be broken on some of the channels. The KM184s and the DIs didn’t work on those channels, but an SM57 did. I was using Logic, so due to the setup I was limited to 16 channels, or 12 once the broken channels were discounted.

I decided to ditch the guitar DIs and only mic the amps, and gave up the bottom snare mic to free up some channels. In the end, the setup was this:

1. Kick: Shure Beta 52
2. Snare: SM57
3. Hi tom: MD421
4. Mid tom: MD421
5. Lo tom: MD421
6. OH L: KM184
7. OH R: KM184
8. Bass: DI
9. Guitar 1: SM57
10. Guitar 2: MD421
11. Vocals: Shure SM58 for the ‘scratch’ vocal, U87 for the overdubbed vocals

I managed to get a surprisingly good sound! I don’t think I’ve ever managed to get such a good sound for a band before, especially for the guitars.
We recorded two songs. The first one went a bit slowly, because I tried to get the band to play to a click, which they were not used to. There were no vocals either, so they kept messing up the structure. With a bit of practice they managed to stay on the click, and I think they delivered a reasonably tight performance.
For the second one, I decided to give up the click and go for a more live feel. I also gave the singer an SM58 to sing into and put the vocals in the foldback. The result was a much more heartfelt performance, and we finished the song in probably a quarter of the time the first one took.
The next step was quickly choosing the best take with the band and overdubbing the vocals for the first song. Listening back to the second one, I realised that there was absolutely no bleed from the scratch vocals, so I muted them and overdubbed them with the much nicer U87.
Due to how I set up the room, the instruments were pretty well isolated, even though they all played at the same time. I made a booth for the drums at the back of the room using two panels, and placed one guitar amp in the opposite corner of the room, facing the treated wall, with its back to the drums and an acoustic panel behind it. The other guitar amp was on the other side of the same panel, facing away from the first amp and towards the drum booth. The bass was DI’d. The vocalist faced the other corner of the room, next to the bass player, away from the guitar amps and the drums.

After recording vocals, we had enough time to play around with backing vocals, a whistle solo and some group clapping. Not sure how much of that will make it onto the final recording, but we certainly had fun!

Here is a link to the band's music: https://soundcloud.com/themarras1-1/almost-old

Tuesday, 9 August 2016

The 'Lana Del Rey' vocal sound

I've been thinking of trying to emulate Lana Del Rey's vocal sound to see if it might work on the vocals for 'Let Go', so after a quick Google search I came across this article, which I found to be a very interesting read. I never knew the album version of 'Video Games' was actually the demo version. I had never noticed that the piano was sampled, or listened carefully enough to the strings and harp to realise they were also sampled, using IK Multimedia's Miroslav Philharmonik, which I own and always thought was inferior to the EastWest libraries (which to my ears still sound quite 'fake'). This is why, for my songs, I've tried to use real instruments as much as possible, doubling the sampled strings with real violin or creating the illusion of an ensemble by layering many takes of the same violin.

Listening back to 'Video Games', the orchestral instruments do indeed sound quite 'MIDI', but I can tell a lot of effort went into drawing in the articulations. It's a simple arrangement, but it works well and was tastefully done, and it's encouraging to think that a track created in this way had such massive exposure and made it onto the radio.

I found this excerpt particularly interesting in relation to Lana Del Rey's signature vocal sound: 'The vocals were bussed to an aux track and we had several plug-ins on the bus. In addition to the high-pass EQ, we had a Waves de-esser to knock out some of the pops and hisses, and a Waves compressor with a light ratio but a heavy threshold, and then another compressor, the H-Comp, with a higher ratio. I like to run multiple compressors in a signal chain, with none of them compressing too heavily. It does not 'small' the sound as heavily as one maxed out compressor. After that, we had an H-Delay and an RVerb, just set to a hall reverb, mixed in really high but with the bottom and top end rolled off, so the main reverb you were getting was mid-range.'

The funny thing here is that they didn't do anything particularly special to achieve that sound, and that is actually the plugin chain I normally use for processing my vocals.



Here is the link to the article:
http://www.soundonsound.com/people/robopop-producing-lana-del-reys-videogames


Friday, 5 August 2016

AV Session at UH (B01)

Along with working on my own music, I've lately been trying to put together some recordings and video for a future showreel that will hopefully go in the 'Portfolio' section of my website and maybe help me get some work recording or filming other people.

All the video experience I had before came from shooting my own videos (with someone else doing the filming, of course) and later editing them in Final Cut X. I find editing video very satisfying and I usually have a good eye for choosing angles and making people look good on camera, so I thought it would be a good opportunity to make some videos for other people using the facilities at UH. After getting trained on the Canon C100 cameras at UH, I was ready to go.

This week I had two AV sessions, one for solo piano and one for a singer-songwriter (vocals and electric guitar). I recorded the audio using the Soundcraft Vi2 desk in B03 and left the recording running throughout the takes while I was in the performance space manning the cameras. I used two cameras at the same time, one static and one moving, and changed angles after each take. The audio was not recorded to a click, so the editing will be a bit challenging, but I have done this kind of editing before for my own songs, and even though matching the audio and video takes requires more effort, I find it worth it to retain the 'live' feel of the performance.

For the piano session I used 2 pairs of KM184s (condenser, cardioid) - one pair for the spots (one mic for the low strings, one for the high strings) and one pair in an ORTF configuration (on a stereo bar, 17cm distance between the mics, angled away from each other at about 110 degrees) to pick up a more distant, roomy sound.

For the second session I used an SM57 on the guitar amp and also recorded the DI'd signal. For the vocals I used a U87 and, at the artist's request, an SM58 (he wanted the recording to resemble his live performances, in which he alternates between two mics, one of them with a lot of reverb on it to create a special effect). I decided to use only the signal recorded through the U87 and add reverb to the relevant sections.




Sunday, 31 July 2016

Recording Brass for 'If Silence' and 'Falling Star'

A couple of weeks ago I recorded a small ensemble of brass players for two of the tracks that will be on 'City of My Mind', the album I'm working on - and part of my MSc portfolio.

The ensemble consisted of three trumpets, four saxes and one trombone. The instruments were miked in pairs using AKG C414s and U87s.

The arrangement was written in Logic using sampled instruments and then exported as a score.

Here is a video from the session, the chorus of 'If Silence':