Yesterday, in a Remote Simultaneous Interpreting team with colleagues distributed all over Europe, it suddenly occurred to us to play around a bit with live transcription as a support (thanks to Mike Morandin’s innocent question whether anyone had ever used it). No sooner said than done – within a few minutes our wonderful chef d’équipe, Peter Sand, had sent us a link to his live transcript on Otter.ai! All it took was creating an account and pasting the link to our Zoom meeting. Otter appeared in the participants list just like any human participant, which is nice because the host is aware that the tool is listening to the meeting.
I think it is fair to say that we were all quite impressed with the quality and the speed of the transcription. We had English speakers from different origins and with a variety of accents, and the software handled them all really well. And indeed, as their website says: Otter can handle a wide variety of accents, including (southern) American, Canadian, Indian, Chinese, Russian, British, Scottish, Italian, German, Swiss, Irish, Scandinavian, and other European accents. Actually, once we’d seen the live transcript and its quality, we wouldn’t want to miss it anymore. Of course, on the one hand, we all agreed that it was distracting and wouldn’t be suitable for a sight translation (and the constant correction of the text could also be irritating). I think we actually became aware of how much we rearrange sentences and really “interpret” the speech we hear in order to make it sound natural and easy to follow for our audience. But on the other hand, it feels like a kind of safety net in case you missed a number, name or whatever. Still, it is yet another cognitive task to be handled, and having just the numbers and entities filtered out and displayed, as CAI tools (SmarTerp, ABM) do, is probably more helpful. When working with a full transcript, you definitely have to learn to look away when you don’t need it.
What I also liked about Otter was the summary function. If you want to catch up on what was discussed while you popped out for a coffee, it’s a good thing for sure. Furthermore, you can also ask Otter questions about the meeting afterwards, or even have it draft the minutes.
Another promising feature, which I have not tested yet, seems to be the “manage vocabulary” function. Here you can teach Otter your own vocabulary to make sure it is recognised correctly.
Otter has several subscription plans: one is free of charge and includes 300 minutes of transcription, limited to 30 consecutive minutes per meeting. You can buy 1200 minutes of transcription per month for 17 USD, and there are better-value yearly plans.
The California-based company claims to have incorporated GDPR and SOC 2 standards into its data practices; data is stored on Amazon’s AWS servers and encrypted (https://otter.ai/privacy-security). In its own words: “Otter uses a proprietary method to de-identify user data before training our models so that an individual user cannot be identified. This training method is automatic and as such audio recordings and transcripts are not manually reviewed by a human. Additionally our training data is encrypted.”
So far, so good. On the downside, however, Otter.ai does not support any language other than English in all its varieties. So we then went on to test another live transcription tool, Airgram.io. This one works for English, German, French, Spanish, Portuguese (beta), Japanese, Chinese (Mandarin) and Russian.
My impression was that Otter was slightly faster than Airgram, but the quality was very similar at first glance. And both Otter and Airgram were much better in quality than the live transcript I saw in Zoom. We were only able to test the English transcription during the meeting, as it was about to end. I just did a quick test with German later using a video – it was ok-ish, but there were occasional mistakes that distorted the meaning of what was being said … so there is still room for improvement, but I suppose it would be useful to have as a backup in the booth anyway.
Airgram also has a great feature called “AI Topic” which displays items in the categories Price & Number, Date & Time, Person, Organisation, Location and Title. To my mind, the most interesting feature for conference interpreters!
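For the technically curious, here is a rough idea of what such filtering could look like in its very simplest form. This is only a minimal sketch of my own and certainly not how Airgram does it: the function name `extract_items` and the two patterns are made up for illustration, and it only covers clock times and plain numbers, whereas real tools presumably rely on proper named-entity recognition.

```python
import re

# Hypothetical patterns for illustration only (not Airgram's method):
# clock times such as 14:30, and numbers such as 17 or 1,200 or 3.5
TIME_PATTERN = re.compile(r"\b\d{1,2}:\d{2}\b")
NUMBER_PATTERN = re.compile(r"\b\d+(?:[.,]\d+)*\b")


def extract_items(text: str) -> dict[str, list[str]]:
    """Pull clock times and numbers out of a piece of transcript text."""
    times = TIME_PATTERN.findall(text)
    # Blank out the times first so that "14:30" is not counted again as 14 and 30
    remaining = TIME_PATTERN.sub(" ", text)
    numbers = NUMBER_PATTERN.findall(remaining)
    return {"Date & Time": times, "Price & Number": numbers}


if __name__ == "__main__":
    sentence = "We sold 1,200 units for 17 USD each before 14:30 on Monday."
    for category, items in extract_items(sentence).items():
        print(f"{category}: {items}")
```

Even a crude filter like this shows why the idea is attractive in the booth: instead of a full transcript, you only get the handful of items that are easiest to miss.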
Just like Otter, Airgram offers a free subscription plan limited to 30 minutes per meeting and a “plus” plan for 18 USD/month with a maximum of 5 hours of transcription per meeting.
Airgram is also based in the USA (Delaware). Its transcription software is proprietary, and AI-generated content such as summaries is provided using GPT or Claude. The transcribed text is not used as training data by Airgram, but possibly by GPT or Claude (see https://www.airgram.io/privacy). Airgram is SOC 2 and GDPR compliant and, just like Otter, encrypts all data and stores them on Amazon Web Services (AWS).
So much for now from my first on-the-job experience with live transcription. As always, I would be really interested to know your thoughts or experiences!
About the author:
Anja Rütten is a freelance conference interpreter for German (A), Spanish (B), English (C) and French (C) based in Düsseldorf, Germany. She has specialised in knowledge management since the mid-1990s.