Zoom: How to get automatic transcription tools to work for you

Transcripts and captions make video and audio learning materials more accessible, not only to students with hearing impairments but also to students who are learning in a foreign language, students working in noisy environments, and students who do not have a good-quality internet connection.

We want transcripts to be 100% accurate, but this is not achievable with the automatic transcription software and technology available today. However, research has shown that transcripts with a Word Error Rate of 25% or less were found to be useful and usable (Munteanu et al., 2006).
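To make the 25% figure concrete: Word Error Rate is commonly computed as the number of word substitutions, deletions and insertions needed to turn the automatic transcript into the correct one, divided by the number of words in the correct transcript. The sketch below is a minimal illustration of that calculation, not the method used in the cited study. The function name and the sample sentences are our own assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    `reference` is the correct (human-checked) transcript and `hypothesis`
    is the automatic transcript. Both names are illustrative.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word ("quite") and one missing word ("to").
spoken = "try to find a quiet location to record"
transcribed = "try to find a quite location record"
wer = word_error_rate(spoken, transcribed)  # 2 errors / 8 words = 0.25
accuracy = 1 - wer                          # 0.75, i.e. 75% accurate
```

In this made-up example two errors in eight words give a Word Error Rate of 25%, which is the threshold mentioned above.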

At UCEM, we have rolled out Zoom as our webinar software, which provides an automatic transcription feature.

Zoom recording showing transcript

So now all our recorded webinars show a transcript at the side. This allows students to search for a particular word in the webinar and jump directly to that point in the video.
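The same kind of search can be done outside the Zoom player if you have the transcript as a text file. The sketch below assumes the transcript is in a WebVTT-style layout (a timing line such as 00:00:03.000 --> 00:00:07.500 followed by the spoken text). The function name and the sample transcript are made up for illustration, and this is not a description of how Zoom itself implements search.

```python
import re

# Matches a cue timing line such as "00:00:03.000 --> 00:00:07.500".
CUE_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})"
)


def find_word(vtt_text, word):
    """Return (start_time, text) for each transcript line containing `word`."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    matches = []
    start = None
    for line in vtt_text.splitlines():
        stripped = line.strip()
        timing = CUE_TIMING.match(stripped)
        if timing:
            start = timing.group(1)   # remember where this cue begins
        elif not stripped:
            start = None              # a blank line ends the current cue
        elif start is not None and pattern.search(stripped):
            matches.append((start, stripped))
    return matches


# Hypothetical transcript excerpt, for illustration only.
sample = """WEBVTT

1
00:00:03.000 --> 00:00:07.500
Welcome to this webinar on building surveying.

2
00:00:08.000 --> 00:00:12.000
Today we will look at party wall agreements.

3
00:01:15.250 --> 00:01:19.000
A surveying report should be clear.
"""

hits = find_word(sample, "surveying")
```

Here `hits` contains the start times 00:00:03.000 and 00:01:15.250, the two points in the recording where the word was transcribed. If the word was transcribed incorrectly it will not be found, which is one reason transcript accuracy matters.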

However, the transcript is not 100% accurate. According to Jonathan Dame's blog post "Automated transcription services could be integral to web conferencing", Zoom's transcription feature is 90% accurate under ideal conditions and up to 92.34% accurate with high-quality recordings. In a small study conducted at UCEM, the highest accuracy we achieved with Zoom was nearly 79% and the lowest was almost 71%. You can read the full paper, "Automatic Transcription Software: Good Enough for Accessibility? A Case Study from Built Environment Education".

But how can you increase the chance of your voice being transcribed more accurately?

  • Use good-quality recording equipment. In our small experiment we found that a headset microphone was more effective than the built-in laptop microphone.
  • Choose your recording environment carefully. Background noise makes it harder to get a clean recording that automatic transcription software can turn into an accurate transcript, so try to find a quiet location.
  • Speak clearly and don't speak too fast. If you are reading something aloud, you are likely to read it faster than you normally speak, so be aware of this. For example, we have noticed that when students post questions in the chat during webinars, tutors read them out as written. These questions are not always grammatically well formed, and when they are not, we see the automatic transcription struggling to transcribe them accurately. So please rephrase typed questions that are not grammatically correct before reading them out.

If you are speaking in a webinar that will be recorded and transcribed automatically, try to follow the tips above. Do you have any other tips? Please share them with us.


Munteanu, C., Baecker, R., Penn, G., Toms, E., & James, D. (2006). The Effect of Speech Recognition Accuracy Rates on the Usefulness and Usability of Webcast Archives. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), Montreal, Canada, 22-27 April 2006. DOI: http://doi.acm.org/10.1145/1124772.1124848