Reviewing & Parsing Audio

: Written by Karl Sherlock

While it can sometimes become as tedious as any other component of the post-investigation process, audio review is among the most rewarding phases of an investigation. Not only does it tend to produce proportionally more results than any other data set, it generally offers the most intriguing findings. In fact, most investigators will cite their best EVP as their most persuasive "evidence" of paranormal activity. This is because the voices that emerge from out of an unseen realm, so fallibly human and familiar to us, invariably appeal to our pathos in ways cold spots and orb photos could never hope to....

Team_Resources--Reviewing_Audio.pdf

This article offers practical guidelines to review your audio using a methodology that best produces credible findings. Because some of it can sound a little jargony, it's probably best to begin with a review of the terminology referenced in these guidelines.

Amplify	To increase (or decrease, as in “de-amplify”) the actual decibel level of audio in a track, whether whole or in part, as opposed to changing volume.
AVP	Audible “Voice” Phenomenon, an anomalous voice or other sound heard by one or more people at the time it was captured in the recording.
Channel	A channel represents a direction of sound input; the more channels, the more directions and dimensions of sound.
CTX	Context, a sound file containing no anomalies; used for documentation or for the peer review process of other evidence. (For example, if someone else captures an EVP, and you isolate an audio clip of the same event from your own audio to prove it was really a jacket zipper, not an EVP, then your audio clip isn’t considered an EVP; it’s a CTX clip, instead.)
Decibel	A unit of measurable loudness.
Earphones	Earbuds, in-ear monitors, canal phones.
EVP	Electronic “Voice” Phenomenon, an anomaly “captured” on the recorded medium but not perceived in real time during the recording; typically spoken but can also refer to non-spoken anomalies as well—singing, whistling, moans, music, etc.
Event of Interest	A potential anomaly and its context.
File Naming Protocols	A method of file naming that categorically identifies the type, nature, and content of a media file.
Filter	A sound effect or sound adjustment to the sound file, and not just to the listener's experience of it.
Gain	An adjustment in the software, itself, that affects the input volume of the entire track. (Yes, it’s also a popular laundry detergent.)
Headphones	On-ear or over-the-ear headsets.
Matrixing	Auditory pareidolia, generally a result of apophenia.
Source Audio	The raw, unfiltered, unedited audio “data” file.
Track	The collected waveforms that make up a single span of audio; a.k.a. channel.
Volume	A setting that increases or decreases output loudness of headphones or speakers, changing the listening experience only, not the source; see “Gain.”
Waveform	A graphical representation of a pattern of sound expressed in amplitude and duration, typically made available in audio editing software.

Prep and Tech

If you prepped yourself for audio review based on those tv shows where ghost hunting is a gladiatorial sport, you might think it’s all about gadgetry and sound engineering—as though investing in the right editing programs and converting your tool shed to a sound studio were all you needed to become an EVP black belt.

Yes, a little minor audio filtering can help sometimes, but the truth of the matter is, if you haven’t already obtained clean, high definition audio, it’s already almost too late to do anything about it. Better recording equipment is always the key to better EVP, but, not coincidentally, it also yields far fewer EVP overall, whereas lesser quality recording equipment produces far more throwaway EVP matrixed out of artifacts and white noise.

Unless you’re the kind of person who has money to waste, you should save up your pennies to purchase the best multitrack digital recorder you can afford. Conversely, you should spend zero money on sound editing equipment and computer programs—certainly not for the sake of finding EVP. Free sound editing programs such as Audacity are widely available on the internet for just about any operating system you run, and they come pre-loaded with equalizers, tricks, and filters, the majority of which you won’t want to futz with anyway. I’ll say more about this in a while.

Should I review audio while I’m recording it?
If you’re a newcomer to this game, then, definitely not. In most cases, because of environmental noise, you’ll never be in a setting where you can rely upon a stop-and-go assessment of your audio. More advanced investigators might employ Bluetooth devices that permit them to listen to their recorded audio in real time (or with a minor delay). However, it doesn’t allow for any leeway to stop, “rewind,” and review, much less isolate and export audio clips of interest. For that, you need a review session after the investigation when you can couch yourself with your recorded audio and really study it. Furthermore, most of the learning curve for mastering audio review happens in this post-investigation phase.

Headphones or speakers?
The listening end of the audio review process, though—that can be another matter. One of the most common questions posed about paranormal audio review is, “What’s the best way to listen?” Is it better to listen for anomalous voices through external speakers, or are earphones the way to go? If the latter, are earbuds or headphones preferred? On-ear or over-the-ear? Wired or wireless? Radio frequency, infrared, or Bluetooth? To noise cancel, or not to noise cancel? And so on.

The short answer, though, is simply this: always use headphones for audio review, but use external speakers and your own set of ears to confirm and classify EVPs. The acoustic “environment” created by a set of headphones can leave a false impression of the relative strength of an EVP, as well as encourage matrixing (auditory apophenia or “pareidolia”).

Which should I use?
Whether earphones or headphones, you should use whatever helps you to hear more. You can translate that, however, in several ways.

More hours of audio
Because audio review is a time-consuming process, you should select the type of listening experience that’s going to be most comfortable for you—for hours at a time, if necessary. Fit issues, signal interference from nearby devices, and factors such as eyeglasses, ear piercings, heat and perspiration might make for unforeseen problems. Furthermore, not all Bluetooth headphones operate smoothly with repeated pausing and restarting of playback.

More range of hearing
Choose an ear- or headphone option that allows for a better range of sound, offers greater definition of sound, and, if you used multitrack HD recording devices, makes optimal use of recording equipment. Another rule of thumb: consider what works well with your own ears’ hearing limitations.

More authentic sound
And you don’t always want to use headphones that change the sound quality for you. Just because a set of headphones enhances the music listening experience, this doesn’t mean it will equally enhance your search for EVP. I’m referring here mainly to noise canceling headphones, which use a processor to determine the frequency of an ambient noise, then produce a sound through the headphones that is 180 degrees out of phase, which effectively "erases" or cancels the noise. This means, not only are the headphones introducing another false sound while you’re trying to analyze your audio data, they’re also taking away a range of sound where EVP might actually be embedded. Generally speaking, if it comes down to whether you struggle to hear anything above the clamoring din of your own listening environment or to experience the fulsome frequencies of your audio minus some white noise, choose the devil you know. (Find more on-line about "Noise Canceling Headphones" at How Stuff Works.)

How Do I Examine My Source Audio?

You’ll need to open the following three software programs on the laptop or computer where you’ll be working; get ready to switch back and forth among them:

sound editing software, such as Audacity;
a calculator (yes, obviously, you can use an actual calculator instead);
a word processing program in which to open and record data into a media log, preferred over a handwritten log because you can share it with others on-line. (See “Media Review Logs” for a sample and a detailed rationale.)

Before starting, make a copy (a backup) of your source audio and squirrel away the original. That way, if you make a mistake, or something disastrous happens, you can quickly recover the original file without having to go into your computer backups. Name the duplicate file using proper protocols. (See “File Naming Protocols” for detailed information about how and why you should name your files in a particular way, including the source audio you’re about to analyze.)

Open your file
Open or import your duplicate source audio file into a sound editing program, such as Audacity. A sample rate of 44,100 Hz is the optimal rate to analyze and is usually the software's default; at lower rates, the audio will sound hissy and “staticky.” The software should generate a waveform of your file once it has fully opened.
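If you'd like to confirm a file's sample rate before you open it, a few lines of Python can read it from the file header. Treat this as a rough sketch only: it assumes an uncompressed WAV file and uses Python's standard `wave` module. The function names and the 44,100 Hz threshold are illustrative choices of mine, not part of any protocol.

```python
import wave

# Assumption: 44.1 kHz, the common default rate in Audacity.
RECOMMENDED_RATE_HZ = 44_100


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def meets_recommended_rate(path):
    """True if the file was recorded at or above the recommended rate."""
    rate, _, _ = describe_wav(path)
    return rate >= RECOMMENDED_RATE_HZ
```

A recording that fails this check isn't useless, but it's a good reason to be extra wary of matrixing when you review it.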

Start listening
Assuming that you’re now comfortably wearing your phones and keeping an audio log at hand, you can now start listening to your audio.

Don’t try to rush it—least of all by speeding up the tracks! Just divide up the job across twenty- to thirty-minute listening sessions, so that you don’t get listening fatigue. Sometimes it’s easy to get caught up in the conversations, or let your attention wander to something else going on in your room. The best strategy for optimal concentration is the same one you would use to read a book leisurely.

Keep your ears tuned for anomalies—anything out of the ordinary—but be ready for “out of the ordinary” to be nothing more than “exceptionally ordinary” most of the time. What constitutes an “anomaly” in the audio is hard to pin down. With practice, you’ll recognize the features of anomalous audio more readily. In general, however, an anomaly is indicated by 1) an unusual, or unrecognized voice; 2) statements and/or sounds that seem “out of place” or “out of character”; 3) jabbering, snickering, whistling, etc.; and, 4) a voice whose “acoustic qualities” are markedly different from others in the audio.

Analyzing multiple source files
Gathering more than one source of audio from an investigation is fairly typical. Hardboiled investigators will drop mics in multiple areas, sometimes even outside, so as to obtain a more spatially complete “picture” of the acoustic environment. It doesn’t matter if all the recording devices are exactly the same make and model: what sounds on one mic like an adult male voice saying, “Bring it!” could on another mic be the clear and unequivocal barking of a dog. When different recording devices concur, however, that an EVP candidate has been captured, then three or more sources of audio spread out could help to triangulate the source of the EVP.

Analyzing multiple tracks of source audio at once is not only time saving, it allows you to compare and contrast the same events in multiple sources to determine their potential worth as electronic voice phenomena. Beginners might wish to wait until they’re more confident with the process, but intermediate and advanced users might benefit from setting up multiple tracks aligned by a common real-time start time. This is where it gets tricky, though. Working across multiple audio tracks, you have to align waveforms exactly right to make it work, and it's not just a matter of sync-ing them all to the same time cue: typically, one track is already in progress when another begins or concludes. To make matters even more complicated, there are minute variations in recording speeds from one recording device to another, so even perfectly aligned tracks can go out of phase after a while, forcing you to do a little nip-and-tuck cosmetic surgery to bring one or all back into alignment. Like I said, it’s not for beginners.

However, if you do employ this method, make sure that the different tracks are clearly and correctly labeled to distinguish them from one another, and don’t forget to record the names of the source files in your audio review log! Faithfully document the time cues where each source audio begins, which will aid you in alignment and in calculating timestamps. (See below.)
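The alignment arithmetic itself can be sketched in a few lines of Python. This is a simplified illustration, not a complete method: it assumes every recorder's real-time start is known to the second, that all recordings start on the same day, and that there is no drift between devices (which, as noted above, there usually is). The track names are hypothetical.

```python
from datetime import datetime


def alignment_offsets(start_times):
    """Map each track to its delay, in seconds, behind the earliest track.

    start_times: dict of track name -> "HH:MM:SS" real-time start.
    """
    parsed = {name: datetime.strptime(t, "%H:%M:%S")
              for name, t in start_times.items()}
    earliest = min(parsed.values())
    return {name: (t - earliest).total_seconds()
            for name, t in parsed.items()}


def cue_in_other_track(cue_seconds, offsets, from_track, to_track):
    """Convert a time cue in one track to the matching cue in another."""
    return cue_seconds + offsets[from_track] - offsets[to_track]


# Hypothetical example: the attic recorder started 6.5 minutes after the kitchen one.
offsets = alignment_offsets({"kitchen": "21:04:00", "attic": "21:10:30"})
print(cue_in_other_track(1000.0, offsets, "kitchen", "attic"))  # 610.0
```

The offsets tell you how far to shift each track to the right in your editor. The converted cue only tells you roughly where to look in the other recording, and the longer the recording runs, the more drift you should expect.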

I Think I Might Have Found Something. Now what?

When you stumble upon an event of interest, isolate it, parse it, and copy it to a new window, modify its amplitude if necessary, document it and its details in your log, and export the clip as a stand-alone sound file.

Isolating and Parsing:
When people are taken out of context, their complaint is that something said has been isolated in a way that unfairly alters its meaning or insinuates another meaning. This is also a common beginner’s mistake in isolating events of interest. Lifting anomalies out of their context is not only reductive, it encourages bias and misinterpretation. Take this simplistic example: someone has isolated a gruff sound seeming to say, “Jump!” and presented it to you as a half-second audio clip. You’re intrigued, and ask for the source audio. When you isolate the event of interest in source audio, you discover that what your friend thinks is an EVP is actually the last of three coughs from the back of the room. Without the broader “story” to put it in context, the sound is insinuated to be an EVP. The very act of isolating it made it mean something different than what it really is.

Obviously, most investigators come by such mistakes honestly. After all, listening to EVP is a subjective experience fraught with expectation bias and a penchant for matrixing, even if the listener actually participated in the original EVP session. Our sizable investment of time and energy subconsciously drives us to find something, anything, of value in the audio, so occasionally we mishear or ignore the context. You’re already familiar with the phenomenon. It’s officially called a “mondegreen,” but it’s better known as “misheard lyrics”: the real song lyrics might be “you think you’re gonna break up / then she says she wants to make up,” but a sexually frustrated, acne prone teenager will hear, “you think you’re gonna break out / then she says she wants to make out.”

To avoid taking events of interest out of context, you should make it a habit to isolate the event of interest and the surrounding activity or verbal exchange, even if that means your clip ends up including other distracting noises. On average, a one-syllable EVP will be contextualized by an eight-second audio clip. (That’s not a rule, just an average.) However, the other advantage to presenting events of interest in this way is that they can be heard sometimes as contextual responses to investigator questions. If an investigator has asked, “What do you dislike on your pizza?” and the event of interest seems to be the word, “Anchovies!” then you’ve probably captured a contextual response, which increases the likelihood of it being an EVP—that is, once you can vet it and cross-check it against other sources. (If you happen to be investigating a pizza parlor, then that, too, provides a background context for the event of interest.)

Filtering and editing:
Once you’ve opened another page and copied the audio clip to it, it, too, should display as a single track consisting of one or more waveforms. Export it immediately as an unaltered version, using proper file naming protocols. Never save an audio clip with alterations as your only version of it, since this will disqualify it from ever being a reliable source.

For the same reason it's bad to pluck an event of interest out of context, enhancing only the anomaly is also ill advised. On the one hand, because most EVP are so brief, the listener is likely not to cognize the enhancement; if anything, it'll sound less clear because the brain simply requires the buffer of a few seconds before and after in order to process the alteration. On the other hand, enhancing only the anomaly and not the event, itself, is not altogether above board; it prevents a true and accurate understanding of the anomaly from being disclosed to the listener. Consider those dodgy tv shows that try to get you to see city complexes and gun turrets on the moon: they keep showing the one picture with the outline overlay over and over again and never let you see the image for yourself, as it really is. Your only sensible option is to change the channel. That same thing is going to happen to your EVP. Filtering just the anomaly and not the context is like a manipulative outline that never goes away. Whatever you do to enhance the anomaly, do it to some part of its context as well. Remember, it’s not just the anomaly, but, rather, the anomaly and its context that is the “event of interest.”

As a hard and fast rule, you do not want to over-filter an audio clip either. By “over-filter,” I mean, applying software effects intended as enhancements but which actually turn out to be embellishments. This takes your “evidence” further and further away from veracity by magnifying imperfections in the original source, as well as by creating new artifacts of sound misidentified as hidden speech. The primary offenders: excessive noise reduction and single band equalizer boosts. It’s acceptable to amplify a little, which doesn’t do anything more than increase its loudness, and just a little touch of bass boost when needed can help the hard of hearing tap into low frequency anomalies. Beyond these, you should consider every other alteration off limits.

For each set of enhancements you make to a clip, you should export it as another version of it and denote this in the name of the file, as well as in your log. For simple amplifications and de-amplifications, record the +/- change in decibels in your log, as well. Among other virtues, this shows that you value transparency in your evidence review.
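If you're curious what a logged decibel change actually does to the waveform, the standard relationship is that a change of d decibels multiplies the amplitude by 10^(d/20). Here's a small sketch of that conversion. The function names are my own, and nothing here depends on any particular editing program.

```python
import math


def db_to_amplitude_factor(db_change):
    """Amplitude multiplier for a +/- change in decibels."""
    return 10 ** (db_change / 20)


def amplitude_factor_to_db(factor):
    """Decibel change for a given amplitude multiplier (must be > 0)."""
    return 20 * math.log10(factor)
```

By this formula, amplifying by +6 dB roughly doubles the height of the waveform and -6 dB roughly halves it, which is why even "a little" amplification goes a long way.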

Calculate a timestamp:
Real-time timestamps are among the most important identifying features of any sound clip because they permit others to search for the corresponding event in their own source audio, making cross-checking and peer review possible. They also help to sync video footage, pics, and other data (such as EMF spikes, cold spots, and background radiation drops) announced during the investigation.

To calculate the timestamp, establish a base start time in real time; it should either be already timestamped by the digital recording device or announced by someone in the audio who is synchronizing the start of the investigative session.

Then, note your editing software’s time cue at the point where the isolated event of interest begins. Add those minutes and seconds to the base start time. This is the timestamp of your audio clip. Here’s an example:

If 21:04:00 (9:04PM) is the start time of your source audio (or you’ve trimmed the track to start at that time), then it officially correlates to a time cue of 00.00.00.000 (hh.mm.ss.mss).
An event of interest that begins 1 hour 14 minutes and 37.587 seconds into the source audio track would then correlate to the time cue, 01.14.37.587.
Calculate: 210400 + 011437.587 = 221838 (rounded up). Straight addition works here only because neither the seconds nor the minutes add up to 60 or more; when they do, carry the extra into the next column, just as you would when adding clock times.
“221838” is the timestamp you would use in your audio log and in the file name according to File Naming Protocols.

It’s pretty straightforward, but never hesitate to rely on your calculator if you need to. It pays later down the line if you double-check your math now. Also, if you or your teammates have announced the time periodically in the recording, this will help you to calibrate your time cues to more accurate timestamps along the way.
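If you'd rather let the computer handle the carrying of seconds and minutes, the same calculation can be sketched in Python. This is only an illustration: the function name is mine, and it assumes the base start time is written as HH:MM:SS and the time cue as hh.mm.ss.mss with a three-digit millisecond field. It rounds to the nearest whole second and wraps past midnight.

```python
from datetime import datetime, timedelta


def clip_timestamp(base_start, time_cue):
    """Add an editor time cue to a real-time start; return "HHMMSS".

    base_start: real-time start of the source audio, e.g. "21:04:00"
    time_cue:   position in the track, e.g. "01.14.37.587"
    """
    start = datetime.strptime(base_start, "%H:%M:%S")
    hours, minutes, seconds, millis = (int(p) for p in time_cue.split("."))
    elapsed = timedelta(hours=hours, minutes=minutes,
                        seconds=seconds, milliseconds=millis)
    moment = start + elapsed
    # Round to the nearest whole second.
    if moment.microsecond >= 500_000:
        moment += timedelta(seconds=1)
    return moment.strftime("%H%M%S")


print(clip_timestamp("21:04:00", "01.14.37.587"))  # 221838
print(clip_timestamp("21:50:40", "00.09.30.000"))  # 220010
```

The second call shows why it pays to double-check: adding 215040 and 000930 as plain numbers gives 215970, which isn't a valid time, whereas carrying the seconds and minutes correctly gives 22:00:10.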

What Comes Next? Peer Review and False Positives

Now that you’ve isolated events of interest as potential paranormal evidence, you may want to assign a classification to them. (See "Classifying Isolated Audio.") But, how do you know if what you've found is “real”? Where do you go to test its veracity and impact? If it's an event of interest to you, is it as much of interest to anyone else? As with other things in life, you get a little help from your friends. Cross-checking your findings against others is the best way to vet them, corroborate them, or simply get a second and third opinion. Those who rely only on themselves to substantiate their findings frequently stumble into the pitfall of false positives.

A false positive is a result incorrectly indicating that some condition or evidence is present—frankly, it’s a euphemism for “bogus evidence.” The following are the most common causes of false positives that, with a diligently completed Surveillance Log and an honest peer review process, can be quickly and reliably weeded out:

matrixing	Also known as “auditory pareidolia,” matrixing means, finding voices within static, white noise, or random sounds; related to apophenia, a cognitive phenomenon wherein one discerns ordered patterns within random data or stimuli.
bodily noises	Gastric sounds; nose whistling; swallowing; sniffles and sighs.
clothing	Nylon jackets; squeaky shoes or schlepping; pocket contents.
environmental sounds	Dog barks; passing cars; planes; rodents in the walls or under the floors; etc.
handling noise	Touching mic or brushing up against the recorder.
mechanical noises	Internal device noise; ticking clocks; spring clips and ratchets; snaps; etc.
mic and software artifacts	Pitch-modulation; directional shifting; over-filtering.
other investigators	Speaking very softly or whispering during the investigation.
vocal cord phenomena	Speaking in an unusual body position or with a craned neck.