Word Clouds – much nicer than Word Lists

I have been wondering for quite some time if word lists are the best thing I can come up with as a visual support in the booth. They are not exactly appealing to the eye, after all …

So I started to play around with word cloud generators a bit to see if they are of any use. Here is a short summary of my conclusions:

The tool I liked most was WordItOut by Enideo from the UK. You can copy and paste text or tables easily and create nice word clouds in no time.

I tested it with three kinds of documents:

  1. My personal glossary
  2. Plain text
  3. Term extraction results from SketchEngine

Personal short glossary

I like to create a shortlist of my most-important-to-remember terms and have it on display permanently in the booth. Usually, there are no more than 10 to 20 terms on this list. So I copied in a short sample glossary with a number from 1 to 10 added after each term (the tool treats it as frequency, but I use it to mean importance), and the result was this:

OK, it’s monolingual, but why not add some colour to the booth and print a second one?

Of course, it does not help if you don’t know the equivalents. But especially when working mainly into one target language, some colleagues tend to write down terms in their target language anyway (more insight into this subject to be published in autumn!).

And if you really like a fancy booth decoration, you can always do some manual work, create a table with the equivalents in your working languages in one field, and get your bilingual word cloud:

By the way, you can choose the font and colour or simply press the “regenerate” button again and again until you like what you get.

My conclusion: I love it! Easy enough to use from time to time as a nice booth decoration – or use it as a desktop wallpaper, for that matter.

Plain text

When using plain text, words are displayed in varying sizes depending on their frequency in the text. While this is not as useful as term extraction, where terms are extracted based on much more complicated algorithms, it still gives you an idea of what the most frequent words in the text are. This can be useful, for example, for target language vocabulary activation (or when learning a new language?).
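To get a feel for what frequency-based sizing works with, here is a minimal Python sketch that simply counts words. It is not how WordItOut works internally (I don't know that); the function name `word_frequencies` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=10):
    """Return the top_n most frequent words in text as (word, count) pairs."""
    # Lowercase the text and keep runs of letters only (works for umlauts, too)
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words).most_common(top_n)

# Made-up sample sentence, for illustration only
sample = "Waste is a resource. A circular economy keeps waste in use as a resource."
for word, count in word_frequencies(sample, top_n=3):
    print(word, count)
```

A word cloud would then draw the words with the highest counts in the largest font.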

One downside, however, is that multi-word terms like “circular economy” are torn apart, so you would need to post-edit the text, adding a ~ between the words you wish to keep together.
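If you already know which multi-word terms matter, the tilde post-editing could be done automatically before pasting. The following is only a sketch under that assumption: `protect_terms` is a hypothetical helper, and the list of terms has to come from you.

```python
import re

def protect_terms(text, terms, joiner="~"):
    """Replace the spaces inside each known multi-word term with joiner."""
    # Longest terms first, so a longer term is handled before a shorter one it contains
    for term in sorted(terms, key=len, reverse=True):
        parts = [re.escape(part) for part in term.split()]
        pattern = re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)
        # Keep the original spelling and only swap the whitespace for the joiner
        text = pattern.sub(lambda m: joiner.join(m.group(0).split()), text)
    return text

terms = ["circular economy", "waste management"]
print(protect_terms("The circular economy needs waste management.", terms))
```

The matching is case-insensitive, so “Circular Economy” at the beginning of a sentence is joined as well.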

Another problem is that for any language other than English, no stop word list is provided by default (you can add one, though). This means that, for example in German, you end up with a cloud of der, die, das, und, er, sie, es, aber, weil, doch.
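What a stop word list does can be sketched in a few lines of Python. The set below contains only the ten German words listed above and is by no means a complete stop word list; the function name and the sample sentence are invented for illustration.

```python
import re
from collections import Counter

# Illustrative only: the ten words from the paragraph above, not a full German stop word list
GERMAN_STOP_WORDS = {"der", "die", "das", "und", "er", "sie", "es", "aber", "weil", "doch"}

def content_word_counts(text, stop_words=GERMAN_STOP_WORDS):
    """Count the words in text, skipping everything listed in stop_words."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(word for word in words if word not in stop_words)

# Made-up sample sentence
sample = "Die Kreislaufwirtschaft und die Abfallwirtschaft: Die Kreislaufwirtschaft spart Rohstoffe."
print(content_word_counts(sample).most_common(2))
```

Without the filter, “die” would be the most frequent word in this sample and would dominate the cloud.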

My conclusion: A lot of potential but few real use cases.

Term extraction results

The nicest thing is of course to have an extraction tool with a built-in word cloud generator, like SDL Trados Studio has.

But if you use other term extraction tools, you can still copy the extraction results into the word cloud generator. I took a term list extracted by SketchEngine and copied in the extracted terms plus their scores, and the result was this:

Multi-word terms are no problem at all, and the size of the terms varies according to the scores calculated by SketchEngine for each term. Much more relevant than frequency in most cases …
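If the extraction results need tidying before pasting, a small script can sort them by score and keep only the top terms. This is a sketch under assumptions: the tab as separator is my guess at a convenient table format, `to_weighted_lines` is a hypothetical helper, and the terms and scores are invented, not real SketchEngine output.

```python
def to_weighted_lines(term_scores, separator="\t", top_n=None):
    """Format (term, score) pairs as 'term<separator>score' lines, highest score first."""
    ranked = sorted(term_scores, key=lambda pair: pair[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return "\n".join(f"{term}{separator}{score}" for term, score in ranked)

# Made-up terms and scores, for illustration only
sample = [("waste management", 41.7), ("circular economy", 88.2), ("recycling rate", 63.0)]
print(to_weighted_lines(sample, top_n=2))
```

Because each term sits on its own line next to its score, multi-word terms stay in one piece.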

My conclusion: Very nice!

PS: If you are interested in terminology extraction for interpreters, Josh Goldsmith is conducting an interesting study on this subject. First results are expected to be presented in November at the 2nd Cologne Conference on Translation, Interpreting and Technical Documentation (CGN18).

 

About the author

Anja Rütten is a freelance conference interpreter for German (A), Spanish (B), English (C) and French (C) based in Düsseldorf, Germany. She has specialised in knowledge management since the mid-1990s.
