It was our second semester of the MSc in Applied Linguistics when we were assigned to collect video or audio recordings from the Internet, or record our own, and then transcribe them. The project was a hell of a lot of work, and some of my class fellows were really angry about how difficult it was. The details were simple: you were assigned a topic, e.g. Lectures, Speeches, TV News or Radio News; you recorded your specific genre or took it from Youtube.com; you listened to it and transcribed it; and you tagged it with the appropriate tags.
Well, the process itself is not difficult. What annoys people is the time it takes to complete. Transcription is one of the most time-consuming jobs in the world. Normally 1 minute of spoken recording can take up to 6 or 7 minutes to write down, so for 1 hour of recording you will have to spend up to 7 hours listening and typing. And it is not as simple as playing the recording and starting to type. You will have to stop the recording again and again: sometimes because your typing speed cannot keep up with the speakers, sometimes because you did not clearly catch what the speaker said and have to replay the audio and concentrate on it, and sometimes because you have to stop and think about how to write down uhmms, errrs, overlaps etc.
The situation gets worse when it comes to Talk Shows, Telephonic or Live Conversations, Lectures and Question Answer Sessions. Remember: the more speakers there are, the more distractions there are and the more time the transcription takes. We considered the people who got Speeches, TV News, Radio News etc. the lucky ones, firstly because these genres are spoken by one person, and secondly because they are usually scripted, i.e. the spoken material is written in front of the speaker, so s/he only has to read it out. In spontaneous talk like Talk Shows this is not the case. There are more speakers, there are overlaps, i.e. two people start speaking at the same time, and there are errrrs, hmmmmmms and other seemingly unnecessary sounds which the speakers utter. But we cannot ignore these sounds. We can follow two people speaking at the same time in the audio, but when it comes to transcription we have to devise a method to show that these particular sentences or words were uttered by both speakers at the same time, i.e. overlapping. Here we have to use Tags to show this phenomenon.
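As a quick aside, the time estimate mentioned above (6 to 7 minutes of work for every minute of recording) can be turned into a rough planning calculation. This is only a sketch: the function name is made up for illustration, and the ratio will differ from person to person and from genre to genre.

```python
def transcription_minutes(recording_minutes, minutes_per_minute):
    """Rough working time, in minutes, for a recording of the given length.

    minutes_per_minute is how many minutes of work one minute of audio
    takes; the values 6 and 7 used below are the rule of thumb from the
    post, not a measured constant.
    """
    return recording_minutes * minutes_per_minute


# One hour of audio at the low and high end of the rule of thumb.
low = transcription_minutes(60, 6)
high = transcription_minutes(60, 7)
print(f"1 hour of audio: about {low / 60:.0f} to {high / 60:.0f} hours of work")
```

If you time yourself on a short stretch first, you can plug your own ratio into the same function before you commit to a long recording.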
Now, we can either devise our own tags or use tags devised by someone else. Since we as students are working towards the completion of the International Corpus of English, i.e. the ICE Pakistan Component, we have to use their tagging scheme. The tagging scheme is available here.
Now what should be done here? It is simple: you'll have to go through the whole of this document, because you are the one who is going to listen to and transcribe your spoken recordings, not me. If you do not understand it, it won't work. I can only provide an example by transcribing a few lines of a video from Youtube.
<$A> is by the federal government
<$B> Ok uhmm <}><->I've<=>I've just lit literally about half a minute Mr. Babar. Let me just ask you this question that is come uhm from Asif who is watching from Canada....
I've covered just the first 12 seconds of the above video, and it took me 5 minutes to cover everything, to WRITE DOWN what these people were doing as a routine speaking activity. The video starts with an unclear word. I had to replay it several times, and when I still couldn't get it exactly I put "is by" as my own guess and put tags around it. You can see these tags in the first line; they show that the words were not intelligible. Of course I consulted the ICE Manual (link provided above) for this tag. Even before this tag you can see the <$A> tag, which marks the first speaker. I got this tag from the manual as well, which says that every speaker's utterance should start on a new line marked with the speaker's identity, i.e. the first speaker is A, the second is B and so on. You can see that two speakers are involved in the first 12 seconds, and I've shown each of them separately, with their utterances on new lines marked with the <$A> and <$B> tags. You may also notice that the hostess says 'uhmm' after saying 'Ok'. We cannot ignore it while transcribing, because these hesitations and uhmms can be helpful in a Discourse Analysis of the transcribed text. So I had to write down this apparently meaningless utterance. Then there is 'I've I've' with these weird-looking tags: <}><->I've<=>I've. They actually mark repetition, and of course I again had to search the ICE Manual for the tags for Repetition, which is where I got these. I pasted them in and added the repeated words according to the example given in the manual.
And this is how it goes on. You listen, you type. When you come across overlapping, repetition, hesitation or uhms you mark it; when you do not understand a spoken word you replay it; and in the meantime you consult the manual for every new phenomenon you encounter so that you can record it properly. Now you may understand why it is so difficult to transcribe a text, and why it is necessary to TAG it. But as you practice you will become faster and more accurate, and it will take you less time.
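One practical payoff of the speaker tags described above is that a tagged transcript can be checked or counted with a few lines of code. The sketch below assumes only what this post describes: each utterance starts on a new line with a tag like <$A> or <$B>. The function names and the list of hesitation spellings are my own choices for illustration; they do not come from the ICE Manual.

```python
import re
from collections import defaultdict

# A speaker tag such as <$A> or <$B> at the very start of a line.
SPEAKER_TAG = re.compile(r"^<\$([A-Z])>\s*(.*)$")

# Hesitation sounds written as whole words (uh, uhm, uhmm, er, erm, hm, hmm).
# This spelling list is an assumption for the example, not an ICE rule.
HESITATION = re.compile(r"\b(?:u+h+m*|e+r+m*|h+m+)\b", re.IGNORECASE)


def split_turns(transcript):
    """Return a list of (speaker, text) pairs, one per tagged line."""
    turns = []
    for line in transcript.splitlines():
        match = SPEAKER_TAG.match(line.strip())
        if match:
            turns.append((match.group(1), match.group(2)))
    return turns


def count_hesitations(turns):
    """Count hesitation sounds per speaker."""
    counts = defaultdict(int)
    for speaker, text in turns:
        counts[speaker] += len(HESITATION.findall(text))
    return dict(counts)


# Two lines modeled on the sample transcription in this post.
sample = """<$A> is by the federal government
<$B> Ok uhmm <}><->I've<=>I've just lit literally about half a minute Mr. Babar"""

turns = split_turns(sample)
print([speaker for speaker, _ in turns])   # ['A', 'B']
print(count_hesitations(turns))            # {'A': 0, 'B': 1}
```

Counts like these are the kind of thing a Discourse Analysis can start from, which is one more reason to keep the uhmms in the transcript instead of cleaning them out.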
Hopefully this small effort will help. Ask me in the comments of this post if you still can't get what I wanted to say, or if you want details on some specific area.