About Term Extraction, Guesswork and Backronyms – Impressions from JIAMCATT 2018 in Geneva

JIAMCATT is the International Annual Meeting on Computer-Assisted Translation and Terminology, an IAMLADP task force where most international organizations, various national institutions and academic bodies exchange information and experience in the field of terminology and translation. For this year’s JIAMCATT edition in Geneva, I had the honour of running a workshop on Tools for Interpreters – an idea I found absolutely intriguing, as the audience would not necessarily be interpreters, but translators, terminologists and heads of language, conference and/or documentation services. So I chose a hands-on workshop setting called “an hour in the shoes of a conference interpreter”. Participants had to prepare a meeting using different tools and would then listen to a 10-minute sequence of this meeting and see how well prepared they felt.

The meeting to be prepared was a meeting of the EP Special Committee on the Union’s authorisation procedure for pesticides held on April 12, 2018. Participants could work in one of two scenarios:

Scenario 0: Interpreters haven’t received any documents and hardly any info about the conference. They have to guess and prioritise more than those working under Scenario 1.

Scenario 1: Interpreters have received all the documents one hour in advance (quite realistic a scenario, as Marcin Feder from the EP pointed out).

The participants were free to choose to work either alone or in a team. They were encouraged to test/evaluate one of the tools presented:

InterpretBank, a Computer-Aided Interpreting tool that covers many elements of an interpreter’s workflow, like glossary creation, multi-dictionary search, term extraction, document annotation, quick search in the booth and flashcard learning.

InterpretersHelp, a cloud-based Computer-Aided Interpreting tool that allows online shared glossary creation, glossary sharing with the community, manual term extraction and flashcard learning, as well as document and job management.

OneClickTerm, a browser-based term extraction tool

GT4T, a plugin for looking up words in several online dictionaries or machine translation sites

Sb.qtrans.de, a toolbar for consulting several online dictionaries and encyclopaedias

At the end of the exercise, the participants watched a 10-minute sequence of the April 12, 2018 meeting of the EP Special Committee on the Union’s authorisation procedure for pesticides. What followed was a lively and inspiring discussion in which each group described its workflow and how efficient it had turned out to be.

Those who had the relevant documents and ran them through the OneClickTerm extraction found that most of the critical terms that came up in the speech were in the extracted list. Others found the relevant documents by way of internet research and did the same.
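To give an idea of what a term extraction tool does with a meeting document, here is a minimal sketch in Python. It is not how OneClickTerm or any of the other tools works; it simply counts recurring word sequences that neither start nor end with a function word. The function name, the tiny stopword list and the thresholds are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption); a real tool would use a much fuller one.
STOPWORDS = {"the", "of", "and", "a", "an", "in", "on", "for", "to",
             "is", "are", "be", "as", "by", "with", "that", "this"}

def extract_candidates(text, max_len=3, min_count=2):
    """Return (candidate, count) pairs for word sequences of 1..max_len words
    that occur at least min_count times and neither start nor end with a stopword."""
    # Lowercase words, keeping hyphenated compounds such as "low-risk" together.
    # Sentence boundaries are ignored in this sketch.
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    ranked = [(term, c) for term, c in counts.items() if c >= min_count]
    # Multi-word candidates first, then by frequency, then alphabetically.
    ranked.sort(key=lambda tc: (-len(tc[0].split()), -tc[1], tc[0]))
    return ranked

sample = ("The active substance is approved. Each active substance needs a dossier. "
          "A low-risk active substance is listed.")
for term, count in extract_candidates(sample):
    print(count, term)
```

Even this crude approach puts "active substance" at the top of the list for the sample text, which is why an extracted list can be a useful last-minute preparation when the documents arrive late.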

Quickly installing programs or creating test accounts didn’t work out as easily for everyone, so some participants reverted to creating glossaries – common practice in the “real world” – and felt well prepared with that. Ten terms from their glossary were mentioned in the 10-minute video sequence. Others spent so much time familiarising themselves with the new tools that they didn’t feel well prepared, but were very happy with what they had seen of InterpretersHelp and OneClickTerm.
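Counting how many glossary terms actually come up in a speech can also be done automatically if a transcript is available. The following Python sketch shows one way to do it under simple assumptions: exact, case-insensitive whole-word matching, so inflected forms such as plurals are not caught. The function name and the sample data are made up for illustration.

```python
import re

def glossary_hits(glossary_terms, transcript):
    """Return the glossary terms that occur in the transcript as whole words,
    ignoring case. Inflected forms (e.g. plurals) are deliberately not matched."""
    hits = []
    for term in glossary_terms:
        # Lookarounds instead of \b so that hyphenated terms match cleanly.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, transcript, flags=re.IGNORECASE):
            hits.append(term)
    return hits

glossary = ["co-formulant", "basic substance", "candidate for substitution", "talc"]
transcript = ("Talc may qualify as a basic substance, "
              "but each co-formulant is assessed separately.")
found = glossary_hits(glossary, transcript)
print(f"{len(found)} of {len(glossary)} glossary terms were mentioned: {found}")
```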

When it comes to preparing for an EU meeting – at least when working from and into EU languages – there is an abundance of information available on the internet. It became clear once more that EU interpreters, in terms of meeting preparation, live in paradise. The EP Legislative Observatory, IATE and EUR-Lex were the main sources of information mentioned. I was happy to learn from Mariangeles Torrent (SCIC) that PreLex has not disappeared, but has simply turned into a tab within EUR-Lex named “legislative procedures”.

A short discussion about the pros and cons of EUR-Lex led to the conclusion that for interpreters it would be wonderful to have more than three languages displayed in parallel, and possibly a term extraction feature or technical terms highlighted in the text. Josh Goldsmith had the news that by adding a hyphen plus the language code to the URL of the multilingual display, a fourth, fifth etc. language can indeed be added, although the page layout is then far from perfect. For the moment I have decided to stick to the method I have been using for over ten years, which consists of copying and pasting the columns into an Excel spreadsheet.
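The URL trick can be illustrated with a few lines of Python. This is only a sketch: it assumes that the multilingual display URL contains one path segment made up of hyphen-separated two-letter language codes, and the example address below is a placeholder on example.org, not a real EUR-Lex link. The function name is hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed shape of the language segment: two or more two-letter codes joined by hyphens.
LANG_SEGMENT = re.compile(r"^[A-Z]{2}(?:-[A-Z]{2})+$")

def add_language(url, lang_code):
    """Append a hyphen plus lang_code to the language segment of the URL path."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    for i, segment in enumerate(segments):
        if LANG_SEGMENT.match(segment):
            codes = segment.split("-")
            code = lang_code.upper()
            if code not in codes:  # do not add a language twice
                codes.append(code)
            segments[i] = "-".join(codes)
            break
    else:
        raise ValueError("no multilingual language segment found in URL")
    return urlunsplit(parts._replace(path="/".join(segments)))

# Placeholder URL for illustration only.
three_languages = "https://example.org/legal-content/EN-DE-ES/TXT/?uri=DOC"
print(add_language(three_languages, "fr"))
```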

I was very glad to hear one participant mention the word “thinking” in the context of conference preparation. He looked at the agenda and the first thing he did was think about what the meeting might be about. He then did some background research in Wikipedia and other sources and looked up product names, which actually were mentioned in the speech. He also checked who the members of the committee were; they didn’t appear in this part of the meeting, but the information would otherwise have been useful.

While terms and glossaries were clearly the topics most intensely discussed, it became clear that semantic and context knowledge is crucial for interpreters to get a grasp of the situation they are working in. As much as I appreciate a list of extracted terms from a meeting document as a last-minute preparation, there is nothing like understanding the content people are referring to. Hence my enthusiasm about the fact that the different semiotic levels (terms, content, context) did come up in the discussion. And indeed the notes I took while listening to the speech reflect the same thing: some of my doubts or reflections were simply about terms (How do you say co-formulant or low-risk active substances in German?), some about the situation (Can beer and talc be on the list of basic substances? Is the non-native speaker sure that this is the right word?) and some about meaning (What exactly is a candidate for substitution?).

It was also very interesting to see how different ways of preparing a meeting turned out to be useful in the meeting. Obviously, there is not just one way to success in meeting preparation.

Among the software features participants would like to see to support the information and knowledge work in conference interpreting, there seemed to be a wide consensus that term extraction and markup of glossary terms in meeting documents – like InterpretBank and Intragloss offer – are extremely useful. Text summarisation was also mentioned. Several participants found InterpretBank’s speech-to-text integration (based on Dragon) very interesting, but unfortunately, due to practical constraints, we couldn’t test this.

When it comes to search functions, it is crucial that intuitive searching is possible in the relevant (!) documents and sources. Relevance seems to be an important factor in conference preparation. What with the abundance of information available nowadays, finding out what is really useful is key. However, many of the big international organisations like EU, UN and WTO do have very useful document management systems in place which help to find one’s way around.

From a freelancer’s perspective, I think that organizations should rather go for browser-based, i.e. device-independent systems to support their interpreters. This lowers the entry barrier of having to install something on each computer, apart from facilitating mobile access and online collaboration. Although I must say that I do also fancy the idea of a small plugin that works in any software, like my most recent discovery, GT4T. At least as freelancers, we change settings so often (back and forth from personal computers to mobile devices, Excel sheets, shared Google docs, paper, institutional information management systems etc.) that a self-contained environment for conference interpreters is maybe too clumsy and unrealistic. After all, hotkeys seem to be back in fashion: I also heard from the WTO colleagues that they have developed a tool quite along the same lines, creating special hotkeys for translators.

And finally, my favourite newly learnt word: Backronym

Backronyms are acronyms that used to be normal words and were re-interpreted later. While translators have a chance to think twice or recognise the word as a backronym because it is written in capitals, interpreters may struggle much more with this. It may take us a moment or two to figure out that the sentence “we need to do what PIGS do” refers to a “Professional Interpreters’ Gymnastics Society” rather than an animal.

Further reading:

Workshop Presentation (pdf) JIAMCATT 2018 Tools for Interpreters

Teresa Ortego Antón (2015): Terminology management tools for conference interpreters: an overview. In: Eleftheria Dogoriti and Theodoros Vyzas (editors): International Journal of Language, Translation and Intercultural Communication, Vol 5 (2016), Editors: Technological Educational Institute of Epirus, Greece. 107-115.

Hernani Costa, Gloria Corpas Pastor, Isabel Durán Muñoz (LEXYTRAD, University of Malaga, Spain): A comparative User Evaluation of Terminology Management Tools for Interpreters. In: Proceedings of the 4th International Workshop on Computational Terminology, 23 August 2014, Dublin, Ireland. 68-76.

Anja Rütten (2017): Terminology Management Tools for Conference Interpreters – Current Tools and How They Address the Specific Needs of Interpreters. In: Translating and the Computer 39, Proceedings, 16-17 November 2017, AsLing, The International Association for Advancement in Language Technology, London, England. 98 ff.



New Term Extraction Features in InterpretBank and InterpretersHelp – Thumbs up!

Extracting terminology from preparatory texts into a term database seems to be the hot topic of the moment, judging by what the two most active and innovative CAI (computer-assisted interpreting) tools, InterpretBank and InterpretersHelp, are currently working on. So while I am still waiting to become a Windows beta tester of Intragloss, the pioneer in this field, I am eager to have a go at both InterpretBank 5’s (beta) and InterpretersHelp’s (experimental) new extraction features.

InterpretBank by Claudio Fantinuoli has been adding quite a few time-saving features for conference preparation lately. Apart from searching online resources on the go while building your glossary, it now promises to extract terminology from your documents, view original and translation in parallel and link documents to glossaries. This does indeed sound like Intragloss combined with the sophisticated booth-friendly terminology management system that InterpretBank has been for many years. So off we go!

A new “documents” icon has been added to the familiar three others (editing, conference mode, flashcards). When I press the magic button, the documents pane appears in the bottom left corner and lets me add documents like pdf or pptx in my two languages and display them next to each other. Unfortunately, there is no synchronised scrolling and no search function to look up words in the documents, but these functions are to be implemented soon. The selected documents are now linked to the glossary, so whenever this particular glossary is opened, they will appear in the documents pane. Highlighting words in the two texts and inserting them into the glossary or looking up translations in my favourite online resources (like IATE, Linguee, Pons, LEO and others) works so swiftly that, when I first tried it, the terms were in my glossary before I had even noticed.

For English texts, context examples can be looked up using the right mouse button or using the icon in the list of extracted terms. And what’s great for sharing with colleagues and for using in the booth: the text can be opened in a separate window and annotated with records from the glossary.
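What annotating a text with glossary records amounts to can be sketched in a few lines of Python. This is not InterpretBank’s implementation, just an illustration of the idea: every glossary term found in the text gets its equivalent in the other language appended in square brackets. The function name and the two-entry English-German glossary are assumptions for the example.

```python
import re

def annotate(text, glossary):
    """Append the target-language equivalent in square brackets after every
    glossary term found in the text (whole words, case-insensitive)."""
    if not glossary:
        return text
    # Longest terms first so that multi-word entries win over their parts.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {term.lower(): target for term, target in glossary.items()}
    return pattern.sub(
        lambda m: f"{m.group(1)} [{lookup[m.group(1).lower()]}]", text
    )

glossary_en_de = {
    "active substance": "Wirkstoff",
    "plant protection product": "Pflanzenschutzmittel",
}
print(annotate("The active substance in a plant protection product", glossary_en_de))
```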

Automatic extraction of terminology or key concepts so far only works for English, but will be implemented for other languages, too (German, Spanish, French and Italian are planned to be released in April). Quality of extraction, as always, depends on many factors, like the amount of text and the subject area, but it is good to get a first impression of the subject matter at hand.

InterpretBank as a locally installed application raises no confidentiality issues with your client’s documents being opened and processed, as everything InterpretBank does happens on your computer (unless you use the “send document to any device” option).

If you are more of a team glossary and online networking person, InterpretersHelp by Yann Plancqueel and Benoît Werner is the other option to manage glossaries and manually extract terminology from texts. It is quite straightforward: Adding documents works via Copy & Paste, you just paste the text into a field for the respective language so you have the two language versions displayed next to each other (but with no synchronised scrolling either). When I tried it, inserting 20 pages from a pdf worked fine. Words can be looked up in the texts using the browser search function.

The highlighting and inserting also works very swiftly and you can look up terms in Google Translate and the Oxford Dictionaries. Once you have extracted all the vocab you need, you press a button to add all the new entries to your glossary. When changing back from the glossary view to the extractor, the texts have disappeared.

InterpretersHelp as a cloud-based tool addresses the data protection issue by encrypting the data that transit to and from the website (https://interpretershelp.com/help/secure_hosting).

Of course there are zillions of other functions interpreters need for CAI tools to support their workflow perfectly. But I think that both InterpretBank and InterpretersHelp have added one super useful feature to make our lives easier. Thanks a lot!

About the author:
Anja Rütten is a freelance conference interpreter for German (A), Spanish (B), English (C) and French (C) based in Düsseldorf, Germany. She has specialised in knowledge management since the mid-1990s.

Further reading:

Summary Table of Terminology Tools for Interpreters. <www.termtools.dolmetscher-wissen-alles.de>

Josh Goldsmith: The Interpreter’s Toolkit: Interpreters’ Help – a one-stop shop in the making?. In: aiic.net February 12, 2018. <http://aiic.net/p/8499>.

Anja Rütten: InterpretBank 4 Review. 31 July 2017. <http://blog.sprachmanagement.net/interpretbank-4-review/>.

Alexander Drechsel: App profile: Interpreters‘ Help. 2 Oct 2015. <https://www.adrechsel.de/dolmetschblog/interpretershelp>.

Anja Rütten: Booth-friendly terminology management revisited – two newcomers. 29 April 2014. <http://blog.sprachmanagement.net/booth-friendly-terminology-management-revisited-2-newcomers/>








Dictation Software instead of Term Extraction? | Diktiersoftware als Termextraktion für Dolmetscher?

+++ for English see below +++

Als neulich mein Arzt bei unserem Beratungsgespräch munter seine Gedanken dem Computer diktierte, anstatt zu tippen, kam mir die Frage in den Sinn: „Warum mache ich das eigentlich nicht?“ Es folgte eine kurze Fachsimpelei zum Thema Diktierprogramme, und kaum zu Hause, musste ich das natürlich auch gleich ausprobieren. Das High-End-Produkt Dragon Naturally Speaking, von dem mein Arzt schwärmte, wollte ich mir dann aber doch nicht gleich gönnen.  Das muss doch auch mit Windows gehen und mit dem im Notebook eingebauten Raummikrofon, dachte ich mir (haha) … Eingerichtet war auch alles in Nullkommanix (unter Windows 10 Auf Start klicken, den Menüpunkt „Erleichterte Bedienung“ suchen, “ Windowsspracherkennung“ auswählen) und los ging’s. Beim ersten Start durchläuft man zunächst ein kurzes Lernprogramm, das die Stimme kennenlernt.

Und dann konnte es auch schon losgehen mit dem eingebauten Diktiergerät, zunächst testhalber in Microsoft Word. Von den ersten zwei Spracheingaben war ich auch noch einigermaßen beeindruckt, aber schon bei „Desoxyribonukleinsäure“ zerplatzten alle meine Träume. Hier meine ersten Diktierproben mit ein paar gängigen Ausdrücken aus dem Dolmetschalltag:

– 12345
– Automobilzulieferer
– Besserungszeremonien Kline sollte es auch viel wie Wohnen Nucleinsäuren für das (Desoxyribonukleinsäure)
– Beste Rock Siri Wohnung Klee ihnen sollte noch in Welle (Desoxyribonukleinsäure)
– Verlustvortrag
– Rechnungsabgrenzungsposten
– Vorrats Datenspeicherung
– Noch Händewellenlänge (Nockenwelle)
– Keilriemen
– Brennstoffzellen Fahrzeuge

Gar nicht schlecht. Aber so ganz das Spracherkennungswunder war das nun noch nicht. In meiner Phantasie hatte ich mich nämlich in der Dolmetschvorbereitung Texte und Präsentationen entspannt lesen und dabei alle Termini und Zusammenhänge, die ich im Nachgang recherchieren wollte, in eine hübsche Tabelle diktieren sehen.  Aber dazu musste dann wohl etwas „Richtiges“ her, wahrscheinlich zunächst einmal ein gescheites Mikrofon.

Also setzte ich mich dann doch mit der allseits gepriesenen Diktiersoftware Dragon Naturally Speaking auseinander, chattete mit dem Support und prüfte alle Optionen. Für 99 EUR unterstützt die Home-Edition nur die gewählte Sprache. Die Premium-Version für 169 EUR unterstützt die gewählte Sprache und auch Englisch. Ist die gewählte Sprache Englisch, gibt es nur Englisch. Möchte ich mit Deutsch, Spanisch, Englisch und womöglich noch meiner zweiten C-Sprache Französisch arbeiten, wird es also erstens kompliziert und zweitens teuer. Also verwarf ich das ganze Thema erst einmal, bis wenige Tage später in einem völlig anderen Zusammenhang unsere liebe Kollegin Fee Engemann erwähnte, dass sie mit Dragon arbeite. Da wurde ich natürlich hellhörig und habe es mir dann doch nicht nehmen lassen, sie für mich und Euch ein bisschen nach ihrer Erfahrung mit Spracherkennungssoftware auszuhorchen:

Fee Engemann im Interview am 19. Februar 2016

Wie ist die Qualität der Spracherkennung bei Dragon Naturally Speaking?

Erstaunlich gut. Das Programm lernt die Stimme und Sprechweise kennen und man kann ihm auch neue Wörter „beibringen“, oder es liest über sein „Lerncenter“ ganze Dateien aus. Man kann auch Wörter buchstabieren, wenn das System gar nichts mehr versteht.

Wozu benutzt Du Dragon?

Ich benutze es manchmal als OCR-Ersatz, wenn eine Übersetzungsvorlage nicht maschinenlesbar ist. Das hat den Vorteil, dass man gleich den Text einmal komplett gelesen hat.

In der Dolmetschvorbereitung diktiere ich meine Terminologie in eine Liste, die ich dann nachher durch die Begriffe in der anderen Sprache ergänze. Das funktioniert in Word und auch in Excel. Falls es Schwierigkeiten gibt, liegt das evtl. daran, dass sich die Kompatibilitätsmodule für ein bestimmtes Programm deaktiviert haben. Ein Besuch auf der Website des technischen Supports schafft hier Abhilfe. Für Zeilenumbrüche und viele andere Befehle gibt es entsprechende Sprachkommandos. Wenn man das Programm per Post bestellt und nicht als Download, ist sogar eine Übersicht mit den wichtigsten Befehlen dabei – so wie auch ein Headset, das für meine Zwecke völlig ausreichend ist. Die Hotline ist im Übrigen auch super.

Gibt es Nachteile?

Wenn ich einen Tag lang gedolmetscht habe, habe ich danach manchmal keine Lust mehr, mit meinem Computer auch noch zu sprechen. Dann arbeite ich auf herkömmliche Art.

Wenn man in unterschiedlichen Sprachen arbeitet, muss man für jede Sprache ein neues Profil anlegen und zwischen diesen Profilen wechseln. Je nach Sprachenvielfalt in der Kombination könnte das lästig werden.

Mein Fazit: Das hört sich alles wirklich sehr vielversprechend an. Das größte Problem für uns Dolmetscher scheint – ähnlich wie bei der Generierung von Audiodateien, also dem umgekehrten Weg – das Hin und Her zwischen den Sprachen zu sein. Wenn jemand von Euch dazu Tipps und Erfahrungen hat, freue ich mich sehr über Kommentare – vielleicht wird es ja doch noch was mit der Terminologieextraktion per Stimme!

Über die Autorin:
Anja Rütten ist freiberufliche Konferenzdolmetscherin für Deutsch (A), Spanisch (B), Englisch (C) und Französisch (C) in Düsseldorf. Sie widmet sich seit Mitte der 1990er dem Wissensmanagement.

+++ English version +++

The other day, when I was talking to my GP and saw him dictate his thoughts to his computer instead of typing them in, I suddenly wondered why I was not using such a tool myself when preparing for an interpreting assignment. So I asked him about the system and, back home, went to try it myself straight away. I did not want to buy Dragon Naturally Speaking, the high-end dictation program he had recommended, right away, though. Instead, I planned to go for the built-in Windows speech recognition function and the equally built-in microphone of my laptop computer (bad idea) … The speech recognition module under Windows 10 was activated in no time (go to the Start menu, select “Ease of Access > Speech Recognition”) and off I went.

When the voice recognition function is first started, it takes you through a short learning routine in order to familiarise itself with your voice. After that, my Windows built-in dictation device was ready. For a start, I tried it in Microsoft Word. I found the first results rather impressive, but when it came to “Desoxyribonukleinsäure” (deoxyribonucleic acid), I was completely disillusioned. See for yourselves the results of my first voice recognition test with some of the usual expressions from the daily life of any conference interpreter:

– 12345
– Automobilzulieferer
– Besserungszeremonien Kline sollte es auch viel wie Wohnen Nucleinsäuren für das (Desoxyribonukleinsäure)
– Beste Rock Siri Wohnung Klee ihnen sollte noch in Welle (Desoxyribonukleinsäure)
– Verlustvortrag
– Rechnungsabgrenzungsposten
– Vorrats Datenspeicherung
– Noch Händewellenlänge (Nockenwelle)
– Keilriemen
– Brennstoffzellen Fahrzeuge

Not bad for a start – but not quite the miracle of voice recognition I would need in order to live this dream of dictating terminology into a list on my computer while reading documents to prepare for an interpreting assignment. Something decent was what I needed, probably a decent microphone, for a start.

So I enquired about the famous dictation software Dragon Naturally Speaking, chatted with one of the support people and checked the options. For 99 EUR, Dragon’s Home Edition only supports one language. The Premium Edition for 169 EUR supports one selected language plus English (if you choose English when buying the software, it is English-only). If I want German, Spanish, English and possibly also my second C-language, French, it gets both complicated and expensive. So I discarded the whole idea until, only a few days later, our dear colleague Fee Engemann happened to mention to me – in a completely different context – that she actually worked with Dragon! I was all ears and spontaneously asked her if she would like to share some of her experience with us in an interview. Luckily, she accepted!

Interview with Fee Engemann, February 19th, 2016

What is the voice recognition quality of Dragon Naturally Speaking like?

Surprisingly good. The program familiarises itself with your voice and speech patterns, and you can also “teach” it new words, or let it read loads of new words from entire files. You can also spell words in case the system does not understand you at all.

What do you use Dragon for?

I use it as an OCR substitute when I get a text to translate which is not machine-readable. The big advantage is that once you have done that, you know the entire text.

When preparing for an interpreting assignment, I dictate my terminology into a list and add the equivalent terms in the other language once I have finished reading the texts. That works in MS-Word and MS-Excel. If there are problems, this may be due to the compatibility module for a certain program being deactivated. The technical support website can help in this case. There are special commands for line breaks and the like. And if you order the software on a CD (instead of simply downloading it), your parcel will not only include a list with the most important commands, but also a headset, which is absolutely sufficient for my purpose. And by the way … the hotline is great, too.

Are there any downsides?

After a whole day of interpreting, I sometimes don’t feel like talking to my computer. In this case, I simply work the traditional way.

When working with several languages, you must create one profile per language and switch between them when switching languages. This may be quite cumbersome if you work with many different languages.

My personal conclusion is that this all sounds very promising. As always, our problem as conference interpreters with these technologies (just like when creating multilingual audio files, i.e. the other way around) seems to be the constant changing back and forth between languages. If any of my readers has experience or good advice to share, I will be happy to read about it in the comments – maybe voice-based term extraction is not that far away after all!

