20 Years of Perfect Pitch in Popular Music - How Elastic Audio Technology forced a Change in Aesthetics, Perception and Listener Expectation 

An Empirical Analysis Based on Quantitative Research


carried out and written by

Christian Gschneidner


Part II


Song specifics I: Medium electronic pop song in native language

When listening to these audio examples, participants were first asked the obvious question: Which version do you prefer? After that they were asked to choose which one they find most natural sounding and which most emotional sounding. These evaluation questions for the modern pop track offer a first general overview (figure 08). Across the three versions, almost all examined participant groups, with very few exceptions, gave a more or less clear majority vote to one of the four possible answers (Original - Or, Melodyne - MD, Autotune - AT, no difference - nd): “I hear no difference.” This is the most frequently chosen answer for the evaluation questions (figure 08).

One might expect the quality of the audio system used for listening to directly influence the numbers for “no difference”, but it does not. Mobile speaker, external speaker, headphone and studio listeners are equally undecided, all at around 40% (e.g. mobile: 40,7%, studio: 39,7%). Peculiarities such as mobile users choosing the original more often and external speaker users choosing Melodyne more often seem random. Studio and headphone users decide almost identically.

It must be considered, though, that the overall majority changes fundamentally when the numbers for the Melodyne and Autotune versions are summed and viewed as one pitch correction group. Whether all Autotune choosers would have voted for a sole Melodyne version, or the other way round, is examined later in the separate analysis of the musical background groups. With Melodyne and Autotune votes summed, pitch correction takes the lead when people are asked which version they prefer (45,5%), ahead of the undecided (37,3%, figure 08). The original (17,1%, figure 08) is equally far behind in all three categories: “preferred”, “most natural sounding” and “most emotional sounding”. Looking at the shares within the pitch correction group, Melodyne wins over Autotune in all three categories. It is interesting that the deliberately artificial sounding Autotune version does not perform worse than the original in terms of natural sound. This small detail might cautiously hint at pitch correction becoming “common sense”. Note that these numbers are derived from counting all participants and should therefore be seen only as a first general overview.

Musical background specifics: Audio professionals vs. consumers

A more detailed view reveals one particular group with a markedly low number for “no difference”: Audio engineers and music producers, i.e. professionals (30,7% of all participants, figure 05), are clearly more sensitive to version differences. Only 18,4% of them hear no difference at all (figure 09). And there are more peculiarities in this group: The Melodyne share is exceptionally large at 44,9% (figure 09), whereas Autotune and the original are about average in preference numbers (figure 09). Audio engineers and producers evidently tend to prefer perfectly intonated vocals. However, looking at the version professionals dislike most, the Autotune number is extremely high (32,7%, figure 10) compared to the consumer group. This can be interpreted as a demand for nicely tuned but natural sounding vocals in the professionals group - given a moderately electronic pop music context. In summary, professionals lean much more decidedly towards pitch correction and clearly favor Melodyne over Autotune.

The consumer group also clearly favors pitch corrected versions (38,5%) over the original (15,6%) in this musical context. But even more of them do not hear a difference at all (45,9%, figure 09). Furthermore, 67% do not especially dislike any one version (figure 10). Do these numbers mean they may well be content with any of the vocal tracks? At least the shares do not show a clear sole favorite (figure 09). Nevertheless, competitive producers might sum the Melodyne and Autotune numbers, remove the 45,9% who do not hear a difference from the equation, and claim that tuning their vocals pleases 71,2% of the remaining listeners more. This is a valid point of view. The idea of summing the numbers for the Melodyne and Autotune versions might well work, although it assumes that all who chose either of them would have chosen the sole pitch corrected version if there had been only one. This seems logical at least for the consumer group, because there is no explicit dislike of either tuned version, as figure 10 shows. The few who explicitly decided against one of the three versions dislike the autotuned one even slightly less (figure 10), in sharp contrast to the professionals group.
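The 71,2% quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the survey's own evaluation procedure: the function name and variable names are illustrative assumptions, the input values are the consumer shares cited from figure 09, and the decimal commas of the text are written as decimal points in code.

```python
def share_among_decided(tuned: float, original: float) -> float:
    """Percentage of tuned-version votes among listeners who stated a preference.

    Both arguments are percentages of the whole group; listeners who heard
    no difference are left out simply by not entering the denominator.
    """
    return 100.0 * tuned / (tuned + original)


# Consumer shares as cited in the text (figure 09), in percent.
consumer_tuned = 38.5          # Melodyne and Autotune votes, summed
consumer_original = 15.6       # votes for the original vocal
consumer_no_difference = 45.9  # "I hear no difference"

consumer_share = share_among_decided(consumer_tuned, consumer_original)
print(f"{consumer_share:.1f}% of decided consumers prefer a tuned version")
```

The calculation rests on the same assumption stated in the text: that everyone who chose Melodyne or Autotune would also have chosen a single tuned version had only one been offered.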

As a conclusion, the separate numbers for engineers and consumers explain why what is a small difference for many consumers is still a big difference for music producers - in a twofold sense: First, they perceive pitch more intensely as a preferable musical quality. Second, at least while working on music, they look not only for personal enjoyment but for a widely attractive creation. They are likely to welcome any small product improvement, especially in a saturated, competitive market.

The professionals group is overrepresented in the survey compared to its share of a music market's consumer base. In order to get a less distorted picture of the effect of pitch correction on potential buyers, one has to observe the numbers for this group separately, and even exclude them for some aspects of the consumer experience. The numbers for music listeners and musicians are quite similar, with a slightly stronger tendency towards pitch correction on the musicians' side (figure 11). It should still give an objective view on some aspects if they are summed into one consumer group and juxtaposed with the professionals group.

Song specifics II: Classic hit record with modern vocal tuning

In order to get a broader overview from this survey, a second audio example was included: Michael Jackson’s classic tune Billie Jean (named MJ in figures). This may seem an odd decision at first, but pre-tests gave interesting indications in favor of using it. A notable eight participants in the professionals group did not take part in all three MJ evaluation questions. It may be assumed that they did not believe there really could be differences in the vocal take of a song mixed 37 years ago, so they left that part of the survey out. Still, 42 of them answered.

The number of “I hear no difference” answers from all participants in the preference category is roughly similar to that for the current pop song. Evidently the audible tuning differences between the three versions were equally distinct in both songs - as intended. One difference from the modern pop song lies in the larger preference numbers for the original vocal. This was expected for the two reasons mentioned above: There is a strong familiarity with the original track, and Jackson sings in an extremely expressive style that is not suited to Autotune. It may also be argued that Jackson simply is the better singer of the two and therefore impresses listeners more strongly with his natural voice.

But what is really striking is the fact that a majority of the participants who hear a difference nonetheless prefer one of the two pitch corrected versions (51,8%, derived from figure 12). Even when professionals are removed from this table, 50% of the decided are still more pleased with a tuned Michael Jackson than with the original (derived from figure 12).

Passing reference: During the survey period an incident was reported of a young die-hard Jackson fan who was very aware of the presented differences and still decided for the new Melodyne version.

One might assume that audio professionals are more likely to detect the familiar original voice, but even if that were so: An astonishing 56,7% of the professionals who hear a difference favor pitch correction. The professionals' Melodyne preference is even more distinct here than for the modern pop song (derived from figure 13). In contrast to consumers, professionals seem to be quite aware of the stylistic inconsistency that an autotuned Michael Jackson represents. Thus 82,4% of the professionals preferring pitch correction decide for Melodyne, and only 17,6% for the pronounced Autotune sound (derived from figure 13). This can be interpreted as a decision for a cleanly tuned Michael Jackson, but without the tuning artifacts that some call the vocal sound of the new millennium.

Loved and loathed in equal measure, Autotune has become one of pop’s most recognizable and well-used audio processing effects. For many, it is the sound of the 2000s. (Nick Prior, Popular Music, Digital Technology and Society)

Autotune's distinct artifacts, though, seem to sound much more pleasing to average consumers. Again, the part of the consumer group that decides for pitch correction votes slightly the other way round: For both singers they favor the typical radio-like Autotune version over the more softly tuned Melodyne take.

Summing up: The 30,1% of all participants voting for pitch corrected versions (figure 12) may not seem all too impressive at first sight. But the picture changes completely when this number is compared to the number of votes for the original version (28%, figure 12), taking into consideration that this original is a 37-year-old classic pop song in heavy rotation, featuring a historic world-class vocal performance by one of the biggest pop artists of all time. Nobody can seriously say this vocal had to be tuned because of inferior singing abilities. It must be the tone and style that earn pitch correction its votes. The pre-test impressions were indeed confirmed by the survey. The risky choice of offering this song pays off with valuable insights. The smaller absolute numbers for pitch correction on this second audio example nonetheless seem even more convincing than the far bigger numbers for the more obvious pitch correction candidate. What sticks here is that a majority of the listeners who heard a difference preferred a new pitch corrected version over the version some of them have been hearing for decades.
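The contrast between absolute shares and shares among decided listeners can be checked with simple arithmetic on the all-participant figures quoted in the text. This is a hedged sketch only: the dictionary layout, the labels and the helper name are illustrative assumptions, and the value computed for the modern pop song is merely derived here from the figure 08 numbers rather than stated in the survey.

```python
# All-participant preference shares in percent, as quoted in the text.
# "tuned" is the sum of Melodyne and Autotune votes.
preference_shares = {
    "modern pop song (figure 08)": {"tuned": 45.5, "original": 17.1},
    "Billie Jean (figure 12)": {"tuned": 30.1, "original": 28.0},
}


def tuned_share_of_decided(shares: dict) -> float:
    """Percentage of tuned votes among those who preferred any version."""
    decided = shares["tuned"] + shares["original"]
    return 100.0 * shares["tuned"] / decided


# Share of decided listeners preferring a tuned version, per song.
decided_results = {
    song: tuned_share_of_decided(shares)
    for song, shares in preference_shares.items()
}

for song, value in decided_results.items():
    print(f"{song}: {value:.1f}% of decided listeners prefer a tuned version")
```

For Billie Jean this reproduces the 51,8% cited earlier. The larger value for the modern pop song is a derived figure and should be read with the same caution as the summed Melodyne and Autotune numbers themselves.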