10 February 2009

Audio signal processing, or "On T-Pain & Kanye West..."

Hi - 

Have you heard Kanye West's "Love Lockdown," an entrancing bit of intriguingly spare but heavily engineered instrumentation?  Like many 2000s hip-hop songs, it features a vocal effect attributed to Auto-Tune, the pitch-correction software from Antares Audio Technologies. [1]

According to a note on XXLmag.com (ref. 1), we're supposed to understand that the pitch-locking vocal effect is not strictly a plain vocoder, but the proprietary Antares technology. Perhaps it is the one described by U.S. Patent 5,973,252 [Google Patents], "Pitch detection and intonation correction apparatus and method."  (Side note: Antares's Andy Hildebrand is a former seismic 3-D data analyst!  Small world, signal processing...)
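To get a feel for what "pitch-locking" means in the simplest possible terms, here is a minimal sketch that snaps a detected frequency to the nearest note. It assumes equal temperament, A4 = 440 Hz, and a full chromatic scale; the function name nearest_semitone is made up for illustration. It is only a toy, not Antares's method, and it skips the hard parts (detecting the pitch and resynthesizing the voice).

```python
import math

A4_HZ = 440.0  # assumption: standard concert pitch
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_semitone(freq_hz):
    """Return (note name, target frequency in Hz, error in cents)."""
    # MIDI note numbers put A4 at 69, with 12 semitones per octave.
    midi = round(69 + 12 * math.log2(freq_hz / A4_HZ))
    target_hz = A4_HZ * 2 ** ((midi - 69) / 12)
    # Positive cents means the sung pitch is sharp of the target.
    cents_off = 1200 * math.log2(freq_hz / target_hz)
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, target_hz, cents_off

name, target_hz, cents_off = nearest_semitone(450.0)
print(name, round(target_hz, 2), round(cents_off, 1))
```

A corrector would then shift the voice by -cents_off; how quickly it does so is presumably what separates a gentle fix from the robotic T-Pain sound.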

Commenters on XXLmag.com offer examples of extreme Auto-Tune use (e.g. T-Pain), as opposed to traditional vocoder effects, but, frankly, I can't recall enough from my grad-school speech-synthesis days to explain the differences among a vocoder, a phase vocoder, and Auto-Tune.  Ah, well.  I know a few of you reading this do know the difference, so please chime in.

Before I leave you with a few YouTube clips, I want to explore the orchestration of "Love Lockdown" because it is a relatively minimal presentation of a theme for Top-40 radio.

First, the core tracks are available on Kanye's website, on the page "Love Lock Down Stems." They include a cappella and (quite) distorted, guitar-esque vocals, a 4-beat 808 drum track (||: 1&2 3 ... :||), African percussion, piano (tutorial here, though I don't know if it's in the right key [YouTube]: C#m7, F#m7, B), and an outro synth.

The African percussion sounds like a straightforward 1-bar loop, though for a moment there it made me wonder if it's actually 4 bars in some subtle way.  Anyhow, omitting the lead-in bar crossing and the quarter-note clap:

 1e&a 2e&a 3e&a 4e&a 1e&a 2e&a 3e&a 4e&a
 x xx x x  x xx x xx x xx x x  x xx x xx

 1e&a 2e&a 3e&a 4e&a 1e&a 2e&a 3e&a 4e&a
 x xx x x  x xx x xx x xx x x  x xx x xx
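The grid above can be checked mechanically. Here is a minimal sketch that lines the hit row up against the count row and lists the hits per bar; it assumes the two rows are aligned character for character, and the names COUNTS, HITS, and parse_grid are made up for illustration.

```python
COUNTS = " 1e&a 2e&a 3e&a 4e&a 1e&a 2e&a 3e&a 4e&a"
HITS = " x xx x x  x xx x xx x xx x x  x xx x xx"

def parse_grid(counts, hits):
    """Return one list of hit labels per bar, e.g. ['1', '1&', '1a', ...]."""
    bars = []
    beat = None
    # Pad the hit row so trailing rests are not cut off by zip().
    for label, mark in zip(counts, hits.ljust(len(counts))):
        if label == " ":
            continue  # separator between beats
        if label.isdigit():
            beat = label
            if label == "1":
                bars.append([])  # a new bar starts on every beat 1
        if mark == "x":
            bars[-1].append(label if label.isdigit() else beat + label)
    return bars

bars = parse_grid(COUNTS, HITS)
for number, bar in enumerate(bars, start=1):
    print(number, len(bar), " ".join(bar))
print("bars identical:", bars[0] == bars[1])
```

If the two bars compare equal, the pattern as transcribed really is a 1-bar loop, and any 4-bar subtlety would have to live in accents or timbre rather than in the hit positions.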

From the listener's point of view, four tracks are active during most of the song: the 808, vocals, piano, and African percussion.  During the first half of the song, they layer (much like standard electronic "house music," i.e. kick, snare, bass, synth, black & white movie samples... :-):

iTunes length  808 voc pno Afrc
0:00   4M      x
0:07   16M     x   x
0:40   8M      x   x   x
0:55   8M      x   x   x   x
1:13   16M     x   x   x
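As a sanity check on the table above, here is a rough sketch that turns the timestamps and lengths into an implied tempo. It assumes "M" means measures, that every measure has 4 beats, and that the timestamps are only good to the nearest second; the names are made up for illustration.

```python
BEATS_PER_MEASURE = 4  # assumption: 4/4 throughout

# (start time in seconds, length in measures), copied from the table
sections = [(0, 4), (7, 16), (40, 8), (55, 8), (73, 16)]

def implied_bpm(seconds, measures):
    """Tempo implied by playing `measures` measures in `seconds` seconds."""
    return measures * BEATS_PER_MEASURE * 60.0 / seconds

# Per-section tempo, using the next row's start time as this row's end.
per_section = [
    implied_bpm(next_start - start, measures)
    for (start, measures), (next_start, _) in zip(sections, sections[1:])
]

# Overall tempo across the first four rows (0:00 to 1:13).
measures_so_far = sum(measures for _, measures in sections[:-1])
overall = implied_bpm(sections[-1][0] - sections[0][0], measures_so_far)

print([round(bpm, 1) for bpm in per_section])
print(round(overall, 1))
```

The per-section values wander quite a bit, which suggests the timestamps (or a measure count or two) are approximate; the overall figure, about 118 beats per minute, is the more trustworthy one.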

Frankly, I got a bit more into it than that, mapping out the song:

Song structure

Then, dear reader, around 10pm I started to wonder why on earth I was putting so much effort into this when I could be, I don't know, doing just about anything else.  I think my intrigue with the song's simplicity became... a spreadsheet.

Anyhow, here are a few YouTube clips for you to consider:

(Video. Kanye West, "Love Lockdown" - commercial version)

(Kanye West, "Love Lockdown" - an evidently leaked SNL version)

(T-Pain and Lonely Island / SNL digital short: "I'm On a Boat")

(Kraftwerk, "The Robots")

(Video. See starting @ 3:00 - poking fun at how poor a singer one can be & still sound in-tune-ish)

(A plain vocoder sounds quite different: YouTube clip)

Finally, being a New Jerseyite, I must leave you with Bon Jovi's "Livin' on a Prayer."  No idea what this effect is.



Anonymous said...

You have interesting hobbies my friend...I just play a little World of Warcraft to relax and go to bed. I tip my cap to you. ;-)

P.S. The word verification thingy here is "disca"...


Anthony said...

Cool stuff!

A vocoder breaks speech down into excitation pulses (from your vocal folds) driving a filter bank (the response of your vocal tract). For music applications, some other signal, like one from a keyboard, is used in place of the excitation pulses to drive the vocal-tract filter. The response of your vocal tract varies with time, and this time-frequency profile is what phase vocoders fiddle with. Auto-Tune actually changes the pitch, which is one (important) characteristic of the excitation rather than of the vocal tract response. Bon Jovi uses the talk box effect, which is something else.