Hearing with Templates

An Email from Alexander MacRae
Originally published in the Winter 2007 AA-EVP NewsJournal
©Alexander MacRae – All Rights Reserved

Alexander MacRae speaking at the 2006 AA-EVP Conference

I have recently been writing something I titled “Hearing with Templates” … For some years now, I have tried to deal only with the best obtainable EVP samples, disposing of the rest. I am aware that funny things can happen, and I have attributed these to the very important subject of cueing errors. Working on the Bial Foundation project has forced me to take account of ALL samples recorded: the good, the bad and the downright appalling.

I was rather concerned lately to find that some of the samples I had selected seem to have changed completely while I was working on them. Taking a few days to do something else and then coming back to them, I found I was reporting some of them as something other than the original. Was this a matter of a time effect or a processing effect or what?

Some weeks ago I sent out two of my local group to a couple of sites to do some recording and then taught one of them a little bit about analysis using Cool Edit Pro (now known as Adobe Audition). The other, Helen, a very perceptive person, asked almost immediately how it was that you could hear one thing at one time but something quite different at another time, and be convinced then that the second version was the correct one. I mentioned cueing and tried to make it all seem quite normal. Earlier still, Edgar Müller had remarked in an email that different noise reduction levels could alter the meaning of what one heard. I did some experiments to investigate this point using normal voice and good EVP, which I will later refer to as “A-type” EVP.

My article on hearing with templates makes the point that what we hear is not necessarily the same as what we are listening to. It then makes the point that templates are used in all recognition processes, whether recognizing phonemes (elements of words), patterns of phonemes (words), or patterns of words (phrases).

What you actually “hear” is the template. You can also hear all the other noises that are part of what you are listening to, but what you actually “hear” is the template that best fits the sound pattern.
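As a rough illustration of this best-fit idea, here is a minimal Python sketch. It is a toy model only: letters stand in for phonemes, the word list `TEMPLATES` and the function names `similarity` and `hear` are hypothetical, and the similarity measure (the standard library's `difflib.SequenceMatcher`) is simply a convenient stand-in for whatever the ear and brain actually do.

```python
from difflib import SequenceMatcher

# Hypothetical "templates": the words this listener already knows.
TEMPLATES = ["angry", "hungry", "agree", "android"]


def similarity(sound, template):
    """Score from 0.0 to 1.0 for how well a sound pattern fits a template."""
    return SequenceMatcher(None, sound, template).ratio()


def hear(sound, templates=TEMPLATES):
    """Return the best-fitting template, not the sound itself."""
    return max(templates, key=lambda t: similarity(sound, t))


# The listener is given "angri" but reports the known word that fits best.
print(hear("angri"))
```

In this sketch the listener never reports the raw input. Whatever comes in, what comes out is always one of the stored templates.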

If you listen to a sequence of phonemes that you have never heard before, for instance, “Gelarumipalat,” which is not a word in the languages that you understand, which does not have Latin, Greek or Germanic roots, what you will hear is a sequence of phonemes, pure and simple. If you listen to a recognized sequence of phonemes such as “angry,” you hear a word. And if you listen to a sequence of known words in a recognized sequence such as, “I am so angry!” what you “hear” is a meaning.

What you listen to and what you hear can be different things.

There has to be a distinction, therefore, between EVP that is so good it is close to normal speech in good listening conditions, which we will call A-type EVP, and EVP that is not that good, which we will call B-type EVP. They are both EVP, but they have different behavioral characteristics.

With B-type EVP,

  • different people may hear different things;
  • what is heard using headphones may be different from what is heard using a speaker;
  • what is heard after one is told what it is may be different from what one heard before being told; and,
  • what one hears at one time may be different from what one hears at another time.

To the general public this PROVES that EVP is NOT real. Therefore, one should not expose the general public to B-type EVP.

Remember that normal hearing is also dependent on template-based pattern recognition.

The received wisdom over the years was that EVP is deficient in the relative energy level of consonants as compared with vowels and that, as consonants are the main carriers of intelligibility, EVP is therefore less comprehensible. I went along with this explanation without examining it myself and even repeated it.

The world experts in this are in the Department of Phonetics and Linguistics at University College London (UCL). The UCL people have been looking into the speech of people suffering from deafness or some neural/motor deficiency. This speech has consonants that are low or missing, thus reducing its intelligibility. Just like EVP, one would suppose.

Let us make up an example. Suppose we have a stroke victim saying, “How are you now?” They might say, “OOOAAOW … AAARGH … EEE-UUU …. NNN N … …AAAAOOOOW….” This is almost entirely vowel sounds, spoken very slowly. Where muscular dexterity is required, as in the rapid transition from “n” to “ow” in “now,” there is a delay. However, this is not what EVP sounds like. The problem may not be the consonant/vowel energy ratio.

The UCL people have looked into cueing as an important factor in intelligibility, and they developed a method of manual cue enhancement in a recording. They tried this and indeed it improved intelligibility. However, automatic cue enhancement did not work.

Cues are taken as the regions of transition: the region where one vowel changes into another or into a consonant, or the impulse and blank period on which consonant sounds are based. The reason for this is that the significance of a set of consecutive sounds depends on the sequencing, on the timing, and so on relative position in time. Cueing is very important.

Here we should also note that the term “phoneme” is not entirely accurate. For example, when each is isolated, the “a” at the beginning of the word “attack” sounds quite different from the “a” in the middle of the word. To describe this feature, the word “phoneme” is replaced by the word “allophone,” that is, a phoneme taking into account its phonetic environment.

Timing is crucially important, and just as you can have people who have trouble with the spatial sequencing of a written word, who are “dyslexic,” so also there seems to be a tendency for some to be “dyslexic” in terms of time sequences. Remember that in an audible communication system, the listener is also part of that system.

For some time, my opinion was that EVP was perhaps cue-deficient. My thinking now is that B-type EVP has an over-supply of cues, and that due to the relative uncertainty or randomness involved in the EVP process, fortuitous transitions appear which can be taken as false cues, enabling more than one interpretation to be found.

Where more than one interpretation is found, this does not mean that a correct interpretation does not exist. The normal thinking is that if two interpretations exist then both must be wrong, but that is not necessarily the case.

All sequence-significant hearing is template-based but some patterns are so uncertain that more than one template can seem to fit.
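The way an over-supply of cues can let several templates fit at once can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real analysis method: letters stand in for cues, the listener is assumed free to discard any cue as noise, and the names `fits`, `interpretations` and `TEMPLATES` are hypothetical.

```python
# Hypothetical "templates": the words this listener already knows.
TEMPLATES = ["help", "hello", "heap"]


def fits(template, stream):
    """True if the template's cues all occur, in order, within the stream.

    The listener may skip any cue in the stream, treating it as noise.
    """
    remaining = iter(stream)
    # "cue in remaining" advances the iterator, so order is respected.
    return all(cue in remaining for cue in template)


def interpretations(stream, templates=TEMPLATES):
    """Return every template that can be heard in the stream."""
    return [t for t in templates if fits(t, stream)]


# A clean pattern, like A-type EVP: only one template fits.
print(interpretations("help"))

# The same pattern with fortuitous extra cues, like B-type EVP:
# more than one template now fits.
print(interpretations("helalpo"))
```

Note that the template that fit the clean pattern is still among the candidates for the noisy one. The extra interpretations are added by the false cues, and they do not remove the original.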

[Note that this explanation is not providing a reason to think that “B-type” EVP change in any way. Editor]
