Can computers outperform human interpreters?

Unlike many people in the translation industry, I like to imagine that one day computers will be able to interpret simultaneously between two languages just as well as or better than human interpreters do, what with artificial neurons and neural networks' pattern-based learning. After all, once hardware capacity allows for it, an artificial neural network will be able to hear and process many more instances of spoken language and the underlying content than my tiny brain will in its entire lifetime. So it may recognise and understand the weirdest accents and the most complicated subject matter simply because of the sheer amount of information it has processed before and the vast ontologies it can rely on. (And by that time, we will most probably be able to use not only digital interpreters, but also digital speakers.)

The more relevant question by then might rather be whether or when people will want to have digital interpretation (or digital speakers in the first place). How would I feel about being replaced by a machine interpreter, people often ask me over a cup of coffee during the break. Actually, the more I think about it, the more I realise that in some cases I would be happy to be replaced by a machine. And it is good old Friedemann Schulz von Thun I find particularly helpful when it comes to explaining when exactly machine interpreters might outperform (out-communicate, so to speak) us humans (or machine speakers might outperform human ones).

As Friedemann Schulz von Thun already put it back in 1981 in his four sides model (https://en.wikipedia.org/wiki/Four-sides_model), communication happens on four levels:

The matter layer contains matter-of-fact statements, such as data and facts, which are part of the news.

In the self-revealing or self-disclosure layer the speaker, consciously or unintentionally, tells something about himself, his motives, values, emotions etc.

The relationship layer expresses, and the receiver picks up, how the sender gets along with the receiver and what he thinks of him.

The appeal layer contains the desire, advice, instruction and effects that the speaker is seeking.

We both listen and speak on those four layers, be it on purpose or inadvertently. But what does that mean for interpretation?

In terms of technical subject matter, machine interpretation may well be superior to human interpretation, as a human's knowledge base, despite the best efforts, will always cover only a relatively small part of the world's knowledge. Some highly technical conferences consist of long series of one-directional speeches given just for the sake of it, at breakneck pace and with no personal interaction whatsoever. When the original offers few “personal” elements of communication (i.e. layers 2 to 4) in the first place, rendering a vivid and communicative interpretation into the target language can be beyond what human interpretation is able to provide. In these cases, publishing the manuscript or a video might serve the purpose just as well, even more so in the future with increasing acceptance of remote communication. And if a purely “mechanical” translation is what is actually needed and no human element is required, machine interpreting might do the job just as well or even better. The same goes, for example, for discussions of logistics (“At what time are you arriving at the airport?”) or other practical arrangements.

But what about the three other, more personal and emotional layers? When speakers reveal something about themselves and listeners want to find out about the other person’s motives, emotions and values or about what one thinks of the other, and it is crucial to read the message between the lines and in gestures and facial expressions? When the point of a meeting is to build trust and understanding and, consequently, create a relationship? Face-to-face meetings are held instead of phone calls or video conferences in order to facilitate personal connections and create a collective experience to build upon in future cooperation (which then may work perfectly well via remote communication on more practical or factual subjects). There are also meetings where the most important function is the appeal. The intention of sales or incentive events is generally to have a positive effect on the audience, to motivate or inspire them.

Would these three layers of communication, which very much involve the human nature of both speakers and listeners, work better with a human or a machine interpreter in between? Is a human interpreter better suited to read and convey personality and feelings, and will interaction between persons work better with a human intermediary? Customers might find a non-human interpreter more convenient, as the interpreter’s personality does not interfere with the personal relation of speaker and listener (but obviously does not provide any facilitation either). This “neutral” interpreting solution could be all the more charming if it didn’t happen orally, but the translation was provided in writing, just like subtitles. This would allow the voice of the original speaker to set the tone. However, when it comes to the “unspoken” messages, the added value of interpreters is in their name: They interpret what is being said and constantly ask the question “What does the speaker mean or want?” Mood, mockery, irony, references to the current situation or persons present etc. will most probably not be understood (or translated) by machines for a long time, if ever. But I would rather leave it to philosophers or sci-fi people to answer this question.

About the author:
Anja Rütten is a freelance conference interpreter for German (A), Spanish (B), English (C) and French (C) based in Düsseldorf, Germany. She has specialised in knowledge management since the mid-1990s.