Word Clouds – much nicer than Word Lists

I have been wondering for quite some time if word lists are the best thing I can come up with as a visual support in the booth. They are not exactly appealing to the eye, after all …

So I started to play around with word cloud generators a bit to see if they are of any use. Here is a short summary of my conclusions:

The tool I liked most was WordItOut by Enideo from the UK. You can copy and paste text or tables easily and create nice word clouds in no time.

I tested it with three kinds of documents:

  1. My personal glossary
  2. Plain text
  3. Term extraction results from SketchEngine

Personal short glossary

I like to create a shortlist of my most-important-to-remember terms and have it on display permanently in the booth. Usually, there are no more than 10 to 20 terms on this list. So I copied in a short sample glossary with a number from 1 to 10 added after each term (the tool reads the number as frequency, but I use it to express importance) and the result was this:

OK, it’s monolingual, but why not add some colour to the booth and print a second one?

Of course it does not help if you don’t know the equivalents. But especially when working mainly into one target language, some colleagues tend to write down terms in their target language anyway (more insight about this subject to be published in autumn!).

And if you really like a fancy booth decoration, you can always do some manual work and create a table with the equivalents in your working languages in one field

and get your bilingual word cloud:

By the way, you can choose the font and colour or simply press the "regenerate" button again and again until you like what you get.

My conclusion: I love it! Easy enough to use from time to time as a nice booth decoration – or use it as a desktop wallpaper, for that matter.
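
For those who prefer to script the manual work, here is a minimal Python sketch of the weighted bilingual shortlist described above. The sample entries, the function name to_weighted_rows and the tab-separated "term, then number" layout are my assumptions for illustration; check what your word cloud generator actually accepts before pasting.

```python
# Hypothetical shortlist: (source term, target equivalent, importance from 1 to 10)
shortlist = [
    ("waste prevention", "Abfallvermeidung", 7),
    ("circular economy", "Kreislaufwirtschaft", 10),
    ("recycling rate", "Recyclingquote", 3),
]

def to_weighted_rows(entries, separator="\t"):
    """Build one 'term = equivalent<separator>weight' line per entry, heaviest first."""
    rows = []
    for source, target, weight in sorted(entries, key=lambda e: e[2], reverse=True):
        # Both languages go into one field so they stay together in the cloud
        rows.append(f"{source} = {target}{separator}{weight}")
    return "\n".join(rows)

print(to_weighted_rows(shortlist))
```

The printed lines can then be copied into the generator in one go instead of typing the table by hand.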

Plain text

When using plain text, words are displayed in varying sizes depending on their frequency in the text. While this is not as useful as term extraction, where terms are extracted based on much more complicated algorithms, it still gives you an idea of what the most frequent words in the text are. This can be useful, for example, for target language vocabulary activation (or when learning a new language?).
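
To make the frequency idea concrete, here is a minimal Python sketch of plain word counting. It is only an illustration of the principle, not how WordItOut works internally; the sample sentence and the function name word_frequencies are made up.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=10):
    """Return the top_n most frequent words in text, lower-cased."""
    # \w+ matches runs of letters and digits, including accented characters
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common(top_n)

sample = "The circular economy keeps materials in use. A circular economy reduces waste."
for word, count in word_frequencies(sample, top_n=3):
    print(word, count)
```

A word cloud generator would then scale each word according to its count.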

One downside, however, is that multi-word terms like “circular economy” are torn apart, so you would need to post-edit the word list, adding a ~ between the words you wish to keep together.
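
If you already know which multi-word terms matter, the tilde post-editing described above can be scripted. The following Python sketch assumes you supply the list of terms yourself; the function name protect_terms is hypothetical.

```python
import re

def protect_terms(text, terms):
    """Replace the spaces inside known multi-word terms with '~'."""
    # Longest terms first, so longer phrases win over the shorter ones they contain
    for term in sorted(terms, key=len, reverse=True):
        words = [re.escape(word) for word in term.split()]
        pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
        # Keep the original spelling and only swap the whitespace for a tilde
        text = pattern.sub(lambda m: re.sub(r"\s+", "~", m.group(0)), text)
    return text

print(protect_terms("The circular economy reduces waste.", ["circular economy"]))
```

The result can be pasted into the generator with the terms already glued together.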

Another problem is that for any language other than English, no stop word list is predefined (you can add one, though). This means that, for example in German, you end up with a cloud of der, die, das, und, er, sie, es, aber, weil, doch.
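
Filtering out stop words before pasting the text is simple to do yourself. Here is a minimal Python sketch; the stop word set contains only the ten German words mentioned above, so a real list would have to be much longer, and the function name remove_stop_words is my own.

```python
import re

# Illustrative only: just the ten words named in the text, not a complete list
GERMAN_STOP_WORDS = {"der", "die", "das", "und", "er", "sie", "es", "aber", "weil", "doch"}

def remove_stop_words(text, stop_words=GERMAN_STOP_WORDS):
    """Return the words of text in order, without the stop words."""
    words = re.findall(r"\w+", text)
    # Compare in lower case so that 'Die' at the start of a sentence is caught too
    return [word for word in words if word.lower() not in stop_words]

sample = "Die Kreislaufwirtschaft und die Abfallvermeidung schonen Ressourcen."
print(" ".join(remove_stop_words(sample)))
```

What remains is the content vocabulary, which is what you want to see in the cloud.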

My conclusion: A lot of potential but few real use cases.

Term extraction results

The nicest thing is of course to have an extraction tool with a built-in word cloud generator, like SDL Trados Studio has.

But if you use other term extraction tools, you can still copy the extraction results into the word cloud generator. I took a term list extracted by SketchEngine, copied in the extracted terms plus their scores, and the result was this:

Multi-word terms are no problem at all, and the size of the terms varies according to the scores calculated by SketchEngine for each term. Much more relevant than frequency in most cases …
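
If the extraction list is long, you may want to keep only the highest-scoring terms before pasting them. The Python sketch below assumes a tab-separated export with one "term, score" pair per line; the real export layout may differ, and the sample terms and scores are invented.

```python
import csv
import io

# Hypothetical export: one "term<TAB>score" pair per line
export = (
    "waste prevention\t120.9\n"
    "circular economy\t312.4\n"
    "member state\t15.2\n"
    "recycling rate\t87.5\n"
)

def top_terms(tsv_text, limit=3):
    """Return the `limit` highest-scoring (term, score) pairs."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    scored = [(term, float(score)) for term, score in reader]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

for term, score in top_terms(export):
    print(f"{term}\t{score}")
```

Because whole terms are kept in one field, multi-word terms survive the round trip untouched.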

My conclusion: Very nice!

PS: If you are interested in terminology extraction for interpreters, Josh Goldsmith is conducting an interesting study on this subject. First results are expected to be presented in November at the 2nd Cologne Conference on Translation, Interpreting and Technical Documentation (CGN18).


About the author

Anja Rütten is a freelance conference interpreter for German (A), Spanish (B), English (C) and French (C) based in Düsseldorf, Germany. She has specialised in knowledge management since the mid-1990s.

Airtable.com – a great replacement for Google Sheets

My perfect terminology database must be shareable, portable and accessible both online and offline (like Interpreters’ Help/Boothmate for Mac and also Google Sheets), but at the same time trustworthy to the point that companies feel comfortable having their terminology stored there (like Interpreters’ Help). It must be sortable and filterable (like MS Access, MS Excel, Lookup, InterpretBank, Termbase and others) and customisable with pre-defined views and default values (like in MS Access). And, very importantly, it must be searchable in a way that requires almost no attention: a mouse-free, incremental search function that does not care whether I type “rinon” or “riñón” and tells me that it is kidney in English either way, if possible in an easy-to-read hit list (like Interplex and InterpretBank do).

Airtable, a mix of spreadsheet and database, seems to get very close to it. It is very intuitive to use, and it simply looks nice and friendly. Sorting and filtering are very convenient, you can add pictures, links and files, define different views of subsets of your data (like for a specific customer, particular subject area, type of conference or a combination thereof) and set default values so that, while working at a given conference, you don’t need to type the conference name, customer and subject area time and again when entering new terms. And the detailed view of each record popping up at one click or tap is just lovely. You can import or drag and drop your tables in no time. And Airtable has loads of useful keyboard shortcuts, by the way.

Team glossaries (or anything else) can be worked on by several people and accessed via an iPad, iPhone and Android (beta) app or the browser-based interface, although, when using the browser interface, there is no way to edit your data offline and update the online version later. This works on the mobile apps only. Data being transferred back and forth as well as stored data are encrypted.

The one thing I miss most on Airtable is an intuitive, accent-ignoring search function as described above, which displays hit lists instead of jumping from one search hit to the next. But apart from that, Airtable is just great for data management, not only in terms of "terms".
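
To show what I mean by accent-ignoring search with a hit list, here is a minimal Python sketch. It has nothing to do with Airtable itself; the tiny Spanish-English glossary and the function names fold and search are made up for illustration.

```python
import unicodedata

def fold(text):
    """Strip diacritics and ignore case, so 'riñón' becomes 'rinon'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Hypothetical mini glossary (Spanish -> English)
glossary = {"riñón": "kidney", "corazón": "heart", "rincón": "corner"}

def search(query, entries):
    """Return every entry whose folded source term contains the folded query."""
    needle = fold(query)
    return [(term, equivalent) for term, equivalent in entries.items()
            if needle in fold(term)]

print(search("rin", glossary))    # still ambiguous after three keystrokes
print(search("rinon", glossary))  # narrowed down, typed without any accents
```

Calling search again after every keystroke gives the incremental behaviour: the hit list shrinks as you type, and "rinon" and "riñón" lead to the same entry.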

It is free of charge as long as your tables don’t have more than 1500 lines; it costs $12 per month for up to 5000 lines per database and $24 per month for up to 50 000 lines per database.

If you need an overview of available terminology management tools for conference interpreters, click here.

