20 Years of Perfect Pitch in Popular Music - How Elastic Audio Technology forced a Change in Aesthetics, Perception and Listener Expectation 

An Empirical Analysis Based on Quantitative Research


carried out and written by

Christian Gschneidner


Part III


Evaluation specifics: Most natural vs. most emotional

Beyond mere preference, additional evaluation questions addressed two particular vocal qualities: listeners were asked which vocal version they found most natural and which one most emotional. These questions were included to investigate how the different strengths of pitch correction influence listeners' perception of these two much-desired vocal production qualities. Experts state that Autotune has introduced a highly artificial sound quality to pop radio. Music author Greg Milner wrote in his 2009 book Perfecting Sound Forever:

…If you’re listening to a song on a station that plays new or fairly recent rock, pop or R&B, it will likely sound even weirder because there is a very strong chance that the vocals were processed with Autotune, which can create a very inhuman droning effect. (Greg Milner, Perfecting Sound Forever)

Would this be confirmed in a blind A/B/C test, and if so, what role do listening experience and musical background play?

As analyzed above, the professionals group favours the mellower pitch correction versions done with Melodyne for both songs. A detailed view now reveals that audio pros perceive the melodyned vocal in the tested pop song as “most emotional”. Surprisingly, twice as many of them even find it more natural sounding than the original vocal track (figure 14).

For both song examples, the original's share of votes in the category “most emotional” is markedly lower than in the category “most natural”, whereas the share for Melodyne stays the same and the Autotune share even grows in the “most emotional” category (figure 14). The old music producer saying “sloppy intonation kills emotion” seems confirmed by these numbers. That alone may not be a huge surprise, but it seems astonishing that the Autotune share doubles from “most natural” to “most emotional” for the classic tune, and grows by 30% for the pop song. Does this somewhat confirm the seemingly contradictory idea that the otherworldly digital character of heavier Autotune settings can give vocals added emotional depth?

…Andrew Goodwin argued that “we have grown used to connecting machines and funkiness.” That maxim could be updated for the Auto-Tune/ Melodyne era: “We have grown used to connecting machines and soulfulness.”…general public has adapted to hearing overtly processed voices as the sound of lust, longing, and loneliness. (Simon Reynolds, How Auto-Tune Revolutionized the Sound of Popular Music)

Consumers group data show this phenomenon for the classic hit record, too: the Autotune share grows from “most natural” to “most emotional”, although more slightly, whereas the original's share noticeably shrinks (figure 15). As stated before, the absolute numbers for this song example must be somewhat distorted by its nature, but the relative figures, as planned, confirm the examined trend. The current pop song again has much higher numbers for pitch correction. This particular trend, though, cannot be seen for it in the consumers group (figure 15).


All in all, three out of four examinations show larger “most natural” shares than “most emotional” shares for the original vocal. The attribute “most emotional” does not seem to be tied to a less digital sound. On the whole, tuned vocals are perceived as more emotional than the untuned tracks. Consumers' perception of emotionality seems less strongly dependent on perfect pitch than professionals', but still twice as many consumers find the corrected vocal versions of the pop song more emotional than the original (figure 15).

As many as three times more consumers perceive pitch-manipulated vocals as more natural sounding than the original take (figure 15). Consumers and professionals together are also far from finding pitch correction artificial per se: of all participants asked for the most natural vocal track, 43.1% chose the pitch-corrected versions, as opposed to 15.2% for the original vocal (figure 08). The consumers group even finds the radio-like Autotune version more natural than the original (!, figure 15). Eleven years after his writing, Milner seems disproved: Autotune does not seem to be perceived as “weird” in this 2020 survey. For the current pop song, the verdict in terms of most emotional, and surprisingly also in terms of most natural sounding, is as clear as in terms of preference: superhuman intonation wins both categories by lengths. It appears that the constant propagation of perfectly intonated vocals in popular music has changed its aesthetics and therefore many consumers' perception and expectations.

…Its (Autotune’s) normative use has had pervasive consequences for accepted norms with regard to the sonic attributes of the voice in pop production. (Robert Strachan, Sonic Technologies, Digital Culture and the Creative Process)

This definitely seems true for manually edited pitch perfection, and partly even for the cited radio-compatible automatic correction achieved with Antares' classic Autotune application.


Target group specifics I: Favored genre

So far, the survey's numbers indicate that the basic question the study examines might by now be answered with: "Yes, perfect pitch, achieved by software applications, is generally taken for granted in popular music." Many find that it sounds natural and adds to a more emotional vocal performance. The question more or less seems to narrow down to how much of it should be used for which target group. So far, differences in perception depending on musical background have been found. It became clear that audio professionals perceive pitch per se more strongly than ordinary consumers and even sense the amount and sound of pitch correction more intensely. But nobody would claim that audio pros are the predominant target group for a music product. Thus, a complete survey has to be able to define more meaningful participant groups. Listeners had to assign themselves to one of three age groups: under 30, 30 to 50 and over 50. They were asked for their preferred genre and their primary listening medium (radio, CDs and vinyl, streaming). For the analysis, the genres Current Pop/ Charts and Hip Hop/ R’n’B are combined as “Charts/ Urban”. Pop from the 60s to the 90s, Country Music, Blues/ Jazz and Classical Music are combined as “Adult Contemporary” (AC). Rock/ Metal and Indie/ Alternative are called “Rock/ Alternative”. Issues to be examined are:

  • Is the chosen pitch correction intensity target-group specific in terms of age, preferred genre and primary listening medium?
  • Is there an apparent development from old to young?

A look at the three favored genre groups shows the familiar picture in all three tables: pitch correction dominates. Its share is largest in the AC group, followed by the Charts/ Urban group and the Rock/ Alternative group (figure 16). One might have expected the Charts group in the lead here, but results are tight. Nonetheless, this group takes the expected clear lead when looking at the Autotune shares alone instead of summing Autotune and Melodyne: 33.3% of the Charts/ Urban group like this version most, compared with 20.3% in the AC group and 15.4% in the Rock group (figure 16). 50% of all who hear a difference in the Charts/ Urban group prefer Autotune, and only 21.4% decide for Melodyne. In the other two genre groups the Melodyne share is larger than the Autotune share, but only when all participants are counted: when professionals are excluded, the Autotune share is larger or equally large in all three groups (!). Yet Charts and Urban listeners are obviously the most attracted to digital sounding vocals. And they are consciously so: the closer look at the Charts/ Urban group is one of the rare instances where one of the three vocal alternatives alone has the same number of votes as “I hear no difference”. After more than 15 years of continuous Autotune use, Hip Hop protagonists like heavy Autotune adopter T-Pain and pop stars like Katy Perry and Rihanna have clearly left their marks here. This may be seen as strongly underlining the claim that Autotune is the sound of modern pop radio.
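The "share among those who hear a difference" figures are conditional shares: a version's share of the whole group divided by the portion of the group that picked any version at all. A minimal sketch of that arithmetic in Python, assuming that the Charts/ Urban group's "I hear no difference" share equals its Autotune share of 33.3%, as the text reports for this group. The function name is illustrative and not part of the study's tooling.

```python
def share_among_deciders(overall_share, no_difference_share):
    """Convert a share of all participants (in percent) into a share of
    those who heard a difference (in percent)."""
    heard_difference = 100.0 - no_difference_share
    if heard_difference <= 0:
        raise ValueError("nobody in this group heard a difference")
    return 100.0 * overall_share / heard_difference


# Charts/ Urban group (figure 16): 33.3% prefer Autotune, and the text
# reports the same number of votes for "I hear no difference".
autotune_conditional = share_among_deciders(33.3, 33.3)
print(round(autotune_conditional))  # about half of those who decided
```

Reporting both the overall and the conditional share keeps apart two effects that are easily confused: how many listeners notice a difference at all, and which version those listeners then choose.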

Target group specifics II: Primary listening medium

Autotune on pop radio leads directly to the question of the primary listening medium. Participants had three choices: Streaming, CD/ Vinyl and Radio. Professionals are left out of these tables because they would distort the picture: they are very unequally distributed across the three medium groups, have very particular preferences, and almost never use radio. At first sight, the tables offer an interesting piece of side information: radio listeners are by far the least likely to hear differences between the vocal takes. 66.7% of them are undecided (figure 17). Radio surely isn’t primarily meant for listening carefully to musical details. Radio listeners seem to be casual music listeners. But does that mean one can offer radio consumers any of the three takes? 66.6% of those who decide for one version choose pitch correction, though the small overall numbers may limit the meaningfulness of this result.


The remaining groups, CD/ Vinyl and Streaming users, differ mainly in the number of Autotune votes. Streaming users embrace the sound of Autotune with 28.1% (50% of those who heard a difference; figure 17). Streamers are the youngest medium user group in this survey: 40.3% of them are under 30 years old. None of the 33 participating young people named CD/ Vinyl as their primary listening medium, and only two named Radio. Radio listeners are mostly 51+, and only 11.1% of them decide for Autotune. CD/ Vinyl users are the middle-aged group here. The tables indicate that the younger the target group, the larger the Autotune share of votes.
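Because the medium groups differ strongly in how many listeners hear a difference at all, it can help to put the Autotune votes on the same basis before comparing them. The following Python sketch does this using only percentages quoted in the text (28.1% and 50% for streaming, 11.1% and 66.7% undecided for radio). The derived values are back-of-the-envelope estimates rather than figures reported by the survey, and the function names are hypothetical.

```python
def heard_difference_share(overall_share, share_of_hearers):
    """Percent of a group that heard a difference, given one version's share
    of the whole group and its share among those who heard a difference."""
    if share_of_hearers <= 0:
        raise ValueError("share among hearers must be positive")
    return 100.0 * overall_share / share_of_hearers


def share_among_hearers(overall_share, heard_share):
    """Percent of those who heard a difference that chose a given version."""
    if heard_share <= 0:
        raise ValueError("share of hearers must be positive")
    return 100.0 * overall_share / heard_share


# Streaming (figure 17): 28.1% Autotune overall, 50% among those who heard
# a difference, so roughly 56% of streamers heard a difference at all.
streaming_heard = heard_difference_share(28.1, 50.0)

# Radio (figure 17): 66.7% undecided and 11.1% Autotune overall, so about a
# third of the deciding radio listeners chose Autotune (derived estimate).
radio_heard = 100.0 - 66.7
radio_autotune = share_among_hearers(11.1, radio_heard)

print(round(streaming_heard, 1), round(radio_autotune, 1))
```

On this common basis the gap between streaming and radio listeners remains, but it is smaller than the raw 28.1% versus 11.1% suggests, and the radio value rests on very few votes.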

Target group specifics III: Age

The final survey question, on participants' age, should shed further light on this assumption. A development in perception and listener expectation can surely be assessed best by observing preferences in dependence on age. Participants were asked to assign themselves to one of three age groups. First, there are those under 30 years old. They were at most eight years old when Cher’s Believe hit number one in 1999 and ignited an Autotune era. This is the generation that grew up with autotuned pop star voices. The second group is the 30 to 50 year olds. They were in their teens before that turning point, but have spent most of their lives with superhumanly intonating singers all over pop radio. And then there is the 51+ group. Their most formative musical memories date from before digitally achieved singing perfection.

The preference vs. age specifics for the pop song in the consumers group finally present the most unambiguous picture: 37.9% of the young prefer the Autotune version, and only 17.2% the actually released Melodyne version (figure 18). This is the clearest win for Autotune in all participant groups and categories. In addition, only 24.1% are undecided (figure 18). This is a clear decision for digital perfection and Autotune timbre from the young. The middle-aged consumers also clearly favor Autotune in this musical context. The only difference is that twice as many in this age group don't hear the differences. In sum, 33.3% of the consumers under 51 decide for Autotune, 15.3% for Melodyne and only 13.9% for the untreated take (figure 18). 50% of those who hear a difference in the young and middle-aged consumers groups decide for much heavier pitch correction than the actual song release had applied (!!, derived from figure 18). Given the native language of the song and a comparatively organic musical context, these numbers are quite astonishing and seem very relevant for production decisions.


When looking at the 51+ generation, there is a fundamental difference: apart from a larger number not aware of the differences, only 9.5% decide for Autotune, half as many as for the winner in this age group, Melodyne (figure 18). Comparing the younger groups with the oldest group seems to make it strikingly clear that something has changed over the years in vocal perception and expectations. It actually looks like digital sounding perfection is the new way to go when producing pop music, and not only for the very youngest target groups.