Transcription of a Recording - Factors Which Influence How Long It Will Take
It's generally accepted that we speak four times faster than we can type and seven times faster than we can write.
The professional industry standard (taken from the Industry Production Standards Guide 1998 published by OBC) allows one hour to transcribe 15 minutes of clearly recorded speech.
It therefore follows that it takes a minimum of 4 hours to transcribe a one hour recording.
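The arithmetic behind this rule of thumb can be sketched in a few lines of Python. The function and constant names are purely illustrative; the only figure taken from the text is the ratio of one hour of work per 15 minutes of clearly recorded speech.

```python
# Illustrative sketch only: the 4:1 ratio comes from the standard quoted
# above; the names used here are hypothetical.
WORK_HOURS_PER_RECORDED_HOUR = 60 / 15  # one hour of work per 15 minutes of speech


def minimum_transcription_hours(recording_minutes: float) -> float:
    """Return the minimum hours needed to transcribe a clear recording."""
    if recording_minutes < 0:
        raise ValueError("recording length cannot be negative")
    return (recording_minutes / 60) * WORK_HOURS_PER_RECORDED_HOUR


print(minimum_transcription_hours(60))  # 4.0
print(minimum_transcription_hours(90))  # 6.0
```

This is a floor for a clear recording; the factors discussed in the rest of the article push the figure upwards.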
It's in the interests of both the transcriber and the client to deal with recordings of the highest possible quality.
No transcriber enjoys working with poor quality recordings, and if a client invests time, money and effort arranging an event, why scupper it at the recording stage? A poor recording will result in a high number of 'inaudibles' and take far longer to transcribe, which in turn increases costs.
Producing a good quality, clearly audible recording is vital.
If it can't be heard, it can't be transcribed! Before you record an interview or an event, talk to a transcription company.
They can usually offer advice and guidelines on how to organise a successful recording.
The choice of recording equipment and the facilitation of the event have as much of an impact on the final cost and the accuracy of the transcript as the skill of the transcriber.
It may also be useful to understand which factors influence the length of time a transcription will take.
Format and quality of the recording
Digital recordings will always produce a clearer recording than analogue formats, such as standard audio tapes or micro cassettes.
Even minidiscs produce a much better recording than a tape.
Don't use a cheap recorder as this is a false economy.
The extra transcription costs involved will far outweigh any savings made on the equipment.
Is an external microphone used?
If the internal microphone on the recorder is used to make a recording of anything other than dictation, the results will be poor.
External microphones are essential for capturing a clearly audible recording.
The position of the microphone is also key.
If it's too far away from the speaker (or speakers), much will be inaudible.
If there's only one microphone for a group discussion, this will clearly record only the nearest speaker's voice.
Microphones are relatively inexpensive and the price can be recouped several times over in reduced transcription costs.
Clarity and number of voices
If the speaker's voice is hard to hear, either because they're too far away from the microphone, or they mumble, speak too fast or too quietly, the words will be difficult to decipher.
With recordings of focus groups, meetings or roundtable discussions, transcription can be more challenging due to the multiple voices involved.
Obviously, each voice has a different tone, pitch and speed, as well as accent.
A group moderator can make a tremendous difference to how audible a group recording will be by controlling the tendency for people to speak at once or interrupt each other, and sometimes by preventing the whole event from descending into a shouting match.
Do speakers need to be identified?
With recordings of one-to-one interviews or small groups, identification of the speakers is straightforward.
In large focus groups or meetings where there may be a 'babble' of voices, this becomes more difficult and takes longer, especially if the transcriber has never heard those voices before.
Clients can help by providing a voice 'brief' or by asking speakers to identify themselves, either at the beginning or at intervals throughout the recording.
This will make it easier to match names to voices and help the transcriber 'tune in' to the different voices.
How fast people talk
It may sound obvious but if someone is a fast talker, they will take longer to transcribe than someone who speaks more slowly.
For example, take two recordings - both one hour in length.
The first speaker talks at a slow, 'normal' pace and the resulting transcript is perhaps 10,000 words long.
The second talks at 'machine gun' speed and the transcript totals 16,000 words.
Same length of recording - completely different length of transcript.
Therefore, a fast talker produces more words.
More words equal more to type, which in turn equals more time taken.
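To put rough numbers on the two example recordings, the short Python sketch below works out each speaker's rate and the raw keying time for each transcript. The word counts and the one-hour length come from the example above; the typing speed of 60 words per minute is an assumption made purely for illustration, and the result ignores the pausing and re-listening that real transcription involves.

```python
# Illustrative sketch: word counts are from the example above;
# ASSUMED_TYPING_WPM is a hypothetical figure, not an industry number.
ASSUMED_TYPING_WPM = 60


def words_per_minute(transcript_words: int, recording_minutes: float) -> float:
    """Average speaking rate over the whole recording."""
    return transcript_words / recording_minutes


def typing_hours(transcript_words: int, typing_wpm: float) -> float:
    """Raw keying time only, with no allowance for pausing or re-listening."""
    return transcript_words / typing_wpm / 60


slow_words, fast_words = 10_000, 16_000
recording_minutes = 60

print(round(words_per_minute(slow_words, recording_minutes)))  # 167
print(round(words_per_minute(fast_words, recording_minutes)))  # 267
print(f"{typing_hours(slow_words, ASSUMED_TYPING_WPM):.1f}")   # 2.8
print(f"{typing_hours(fast_words, ASSUMED_TYPING_WPM):.1f}")   # 4.4
```

Whatever typing speed is assumed, the fast talker's transcript takes 1.6 times as long to key in, because it contains 1.6 times as many words.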
Do they speak in coherent sentences?
Not everyone does! Everyday speech is usually littered with verbal habits and quirks which we generally don't 'hear' in conversation.
People switch thought in mid-sentence, add unnecessary 'you knows' and 'sort ofs', or sometimes don't speak in coherent sentences at all.
We rarely speak in the same way as we write.
In such situations, the transcriber must work out where to insert the punctuation so as not to lose the thread of the whole piece.
The more coherent the speakers are, the less time it takes to transcribe their words.
The transcriber can 'type as they talk' and rarely needs to go back and puzzle out the meaning.
Level of background noise
Background noise can make or break a recording, so the choice of recording location is vital; a quiet indoor environment is preferable.
Our ears can filter out most of the extraneous noise which is constantly around us, from traffic noise, equipment interference and other voices to the background hiss of the recorder itself.
Microphones are not so selective - they pick up every sound, giving each noise equal prominence (unless noise cancelling microphones are used).
How much 'jargon' is involved?
Material which is full of technical or specialised terminology may be unfamiliar to the transcriber.
It may be necessary to re-listen several times in order to distinguish the words.
It helps enormously if a glossary of keywords or some kind of brief about the topic involved can be provided by the client in advance.
Transcription companies tend to use the following guidelines when advising clients on timings.
For transcription of dictation, interviews, lectures and podcasts, it can take between 4 and 6 hours to transcribe an hour's recording.
Focus groups, meetings, conference presentations, group interviews and teleconferences can take longer, typically between 6 and 10 hours.
Transcription of video recordings, the insertion of time stamps, or the identification of focus group participants can take even longer, usually between 7 and 10 hours.
These timings all relate to Intelligent Verbatim transcripts, which are defined below.
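These guideline ranges can be turned into a simple lookup, sketched below in Python. The category names are hypothetical labels for the three groups described above. The sketch assumes that every range applies per hour of recording and scales in proportion to the recording length; the text states this only for the first group.

```python
# Illustrative sketch: the hour ranges are the guidelines quoted above;
# the category names and the linear scaling are assumptions.
GUIDELINE_HOURS = {
    "single_speaker": (4, 6),     # dictation, interviews, lectures, podcasts
    "group": (6, 10),             # focus groups, meetings, teleconferences
    "video_or_extras": (7, 10),   # video, time stamps, participant identification
}


def estimate_range(category: str, recording_hours: float) -> tuple:
    """Return (low, high) estimated working hours for a recording."""
    low, high = GUIDELINE_HOURS[category]
    return (low * recording_hours, high * recording_hours)


print(estimate_range("single_speaker", 1))  # (4, 6)
print(estimate_range("group", 1.5))         # (9.0, 15.0)
```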
The type of transcript required will also impact on timings and costs.
The most popular, quickest and most cost-effective choice is Intelligent Verbatim, which ensures an accurate transcript but omits the verbal habits and meaningless fillers which litter everyday speech.
These add nothing to the context of the transcript and take longer to transcribe.
However, there are instances, such as legal interviews or university research, where a Complete Verbatim transcript may be needed.
This captures absolutely everything said, including dialect patterns, conversational style and identifiable emotions.
This usually adds 1 to 3 hours to the transcription times mentioned above.
The third transcript format is Edited, which adds between 1 and 2 hours to the standard timings.
This is very useful for some focus groups, conferences and lectures where the content is critical but the style of the speaker is irrelevant.
Complete Verbatim and Edited transcripts are more expensive than Intelligent Verbatim to reflect the extra time needed to transcribe them.
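The effect of the transcript type on timings can be sketched as a small adjustment to a base estimate. In the Python below, the extra hours are the figures quoted above, and the base range passed in is an Intelligent Verbatim estimate for a one-hour recording. The function and dictionary names are hypothetical.

```python
# Illustrative sketch: extra hours are the figures quoted above,
# applied to an Intelligent Verbatim base range for a one-hour recording.
EXTRA_HOURS = {
    "intelligent_verbatim": (0, 0),  # the baseline, no adjustment
    "complete_verbatim": (1, 3),
    "edited": (1, 2),
}


def adjust_for_transcript_type(base_range: tuple, transcript_type: str) -> tuple:
    """Add the extra (low, high) hours for the chosen transcript type."""
    base_low, base_high = base_range
    extra_low, extra_high = EXTRA_HOURS[transcript_type]
    return (base_low + extra_low, base_high + extra_high)


# A one-hour interview (4 to 6 hours as Intelligent Verbatim):
print(adjust_for_transcript_type((4, 6), "complete_verbatim"))  # (5, 9)
# A one-hour focus group (6 to 10 hours as Intelligent Verbatim):
print(adjust_for_transcript_type((6, 10), "edited"))            # (7, 12)
```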