Call me lazy for letting my students do all the blogging lately … but they just come up with such great stuff! So here’s Lea telling us how she discussed the definition of “technical term” with an AI and had five chatbots extract terminology and other useful information for her.
Guest article by Lea Kortenbusch, student of conference interpreting at TH Köln
Imagine the following scenario: an important interpreting assignment is just about to begin, and only minutes before the event starts, the opening speech lands in the booth. Great! But with just five minutes to spare, how can we make the most of the speech?
This was the exact question I asked myself a few months ago during the Information and Knowledge Management course in the Master’s in Conference Interpreting at TH Köln. It quickly became clear to me that I would turn to artificial intelligence — specifically, chatbots!
To make a small comparison, I selected Google Gemini, Microsoft Copilot, ChatGPT, Perplexity, and DeepSeek as my test subjects.
But what kind of information from the speech is actually useful at such short notice? Numbers, key names and places, and the most important content points are certainly helpful. For my little test, however, I needed a single focus to make a meaningful comparison, so I decided to concentrate on the extraction of technical terms. After all, who wouldn’t appreciate being handed the most complicated terms just in time?
Of course, what counts as a “technical term” depends heavily on context and can be difficult to define. For the purpose of this comparison, we’re defining a technical term as follows: a word that has a specific meaning within a particular field of expertise.
So that brings us to the core question: Which of the chatbots is best suited to this task, and how can we optimise the prompt to get the best results?
Prompt Design
For the design of my prompt, I used the CO-STAR framework developed by Sheila Teo (2023) as a theoretical foundation. Teo won Singapore’s very first GPT-4 prompt engineering competition with this framework, as she describes in her Towards Data Science article “How I Won Singapore’s GPT-4 Prompt Engineering Competition”. According to Teo, an effective prompt for chatbots should ideally include the following six components: Context, Objective, Style, Tone, Audience, and Response. Based on this, I drafted my first prompt, which looked like this:
# Context #
You are assisting an interpreter in preparing for an interpreting assignment. Your role is to help prepare materials and offer advice and support if they have any questions. Your working languages are English, Spanish, and German.
# Objective #
Carry out the following tasks based on the attached source text:
## Summarize the 5 most important key points in Spanish. Only refer to the content of the provided text.
## Extract all numbers from the text and list them as bullet points, including the context in which each number appears. Only refer to the content of the provided text.
## Create a table of the 20 most important technical terms. Technical terms are defined as words that have a specific meaning within a particular field of expertise and are not typically found in everyday language. The table should be bilingual: German and Spanish. Base this entirely on the given text.
## Identify and list all names mentioned in the text in bullet points, including the context in which each name appears. Only refer to the content of the provided text.
## Identify and list all places mentioned in the text in bullet points, including the context in which each location is mentioned. Only refer to the content of the provided text.
# Style #
Maintain the style of the original source text.
# Tone #
Use a neutral tone throughout.
# Audience #
The interpreter has a basic understanding of the topic but is not a subject-matter expert. Your responses should enable them to prepare quickly and effectively for the assignment.
# Response #
Complete all tasks in the order listed above and present your answers in a clear, structured format.
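As an aside: in the booth you would simply paste this prompt into a chat window, but the same routine can also be scripted if you want to automate it. Here is a minimal sketch, assuming the official `openai` Python package and an API key in the environment; the model name and file path are placeholders of my own, and the prompt is abbreviated to its skeleton:

```python
# Minimal sketch: sending the CO-STAR prompt plus the speech to a chat model.
# Assumes the official "openai" package and an OPENAI_API_KEY in the
# environment; model name and file path are placeholders.
from openai import OpenAI

COSTAR_PROMPT = """\
# Context #
You are assisting an interpreter in preparing for an interpreting assignment. ...

# Objective #
Carry out the following tasks based on the attached source text:
...

# Response #
Complete all tasks in the order listed above and present your answers in a
clear, structured format.
"""  # abbreviated here; use the full prompt shown above

def prepare_speech(path: str) -> str:
    """Attach the speech to the prompt and return the model's preparation notes."""
    with open(path, encoding="utf-8") as f:
        speech = f.read()
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user",
                   "content": f"{COSTAR_PROMPT}\n\nSource text:\n{speech}"}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(prepare_speech("speech.txt"))  # e.g. the opening speech
```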
Testing the prompt
Then it was time to test my prompt. As a sample text, I chose a speech by Claudia Sheinbaum delivered at her inauguration on October 1, 2024—a speech we had previously used in our interpreting classes. So, how did the different chatbots perform when carrying out my task?
Perplexity
The extraction of key points, numbers, and names went smoothly, so—just as mentioned earlier—the focus here is the extraction of technical terms.
On my first attempt, Perplexity returned 20 general vocabulary words that I wouldn’t classify as technical at all and that were therefore of no use for our purpose. So I engaged in a follow-up conversation with Perplexity and asked what it understood by “technical terms”. It proceeded to develop a definition and outlined a clear method that it would follow in future when asked to extract technical terms (see the sketch after this list). This approach was as follows:
- I first define what technical terms mean within the given context.
- I carefully filter the terms based on this definition.
- I present the terms in a bilingual table.
- I ensure that only terms that are not part of everyday vocabulary are selected.
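In chat-API terms, this kind of “discussion” is nothing more than a second turn appended to the conversation history. Here is a minimal sketch of the clarification loop, again assuming the `openai` package; the model name and the exact wording of the follow-up question are placeholders of my own:

```python
# Sketch of the follow-up "reflection" turn: the chatbot's first, too-general
# answer stays in the message history, and a clarification question is
# appended before asking again. Model name and wording are placeholders.
from openai import OpenAI

client = OpenAI()

def ask(history: list) -> str:
    """Run one chat turn and append the assistant's reply to the history."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=history,
    ).choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "user", "content":
            "Create a table of the 20 most important technical terms ..."}]
first_attempt = ask(history)  # often general vocabulary, as described above

# The clarification turn that made the difference in my tests:
history.append({"role": "user", "content":
                "What do you understand by 'technical terms'? Define them for "
                "this text, then redo the table keeping only terms that are "
                "not part of everyday vocabulary."})
second_attempt = ask(history)
```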
Here’s an extract showing a comparison of some of the terms Perplexity delivered before and after our short “discussion”.
Before:
After clarification:
I think the difference in the level of technical language is clear. Based on this, I incorporated the procedure proposed by Perplexity into my prompt and tested this revised version with the other chatbots.
Copilot
I followed the same approach with Copilot as I had with Perplexity, this time using the updated prompt from the start. And to be honest, it didn’t go well on the first try. Copilot also seemed to need a kind of “reflection” or clarification on what exactly constitutes a technical term in order to deliver useful results.
Before:
After clarification:
ChatGPT
Next up was ChatGPT. My experience here was similar to what I had encountered with Copilot: at first, it returned very general terms. But after I gave some feedback, the results improved significantly, and I ended up with a solid list of technical terms.
Gemini
The Gemini chatbot from Google, on the other hand, came as a surprise. Right from the very first attempt, Gemini generated a flawless table of technical terms from the text—and even included relevant definitions without being prompted to do so.
DeepSeek
DeepSeek was the last chatbot I tested using the same method, and I have to admit, the result was rather disappointing. Even after several follow-up prompts and a detailed discussion about the definition of technical terms, DeepSeek continued to provide only general vocabulary. When I pushed for a revision, the chatbot repeatedly insisted that these were the most important technical terms and refused to make another attempt.
Comparison
| Chatbot | Pro | Contra |
| --- | --- | --- |
| Perplexity | Core content, numbers, and names extracted correctly; technical terms identified directly on the 1st attempt (with the 2nd version of the prompt); clear communication; file upload supported | Several attempts needed to achieve a good result |
| Copilot | Core content, numbers, and names extracted correctly; good technical term extraction after reflecting on the task; file upload supported | Sluggish communication; sometimes repeated the same answers for different questions |
| ChatGPT | Core content, numbers, and names extracted correctly; good technical term extraction after reflecting on the task | No file upload option (free version); character limit in the prompt (free version) |
| Gemini | Core content, numbers, and names extracted correctly; technical terms identified directly on the 1st attempt; no character limit; one-click export to Google Sheets | No file upload supported; imprecise communication (e.g. occasional error messages) |
| DeepSeek | Core content, numbers, and names extracted correctly | Imprecise communication; even after multiple attempts, extraction of technical terms was not successful |
Conclusion
As we’ve seen, chatbots can be incredibly useful when it comes to preparing texts quickly and on short notice. With a well-crafted prompt, it’s also possible to get solid results when extracting technical terms. However, chatbots often seem to require some kind of “reflection” or clarification on what qualifies as a technical term before they can deliver accurate results.
The chatbots tested here varied in terms of quality and usefulness. In my opinion, Perplexity and Google Gemini are best suited for this task, as they delivered the most efficient and relevant results. That said, it’s important to note that chatbot outputs can vary from one session to the next—there’s never a 100% guarantee of success. And of course, we should always be aware that hallucinations can occur, so it’s crucial to approach results with a critical eye.
Still, why not take advantage of the tools available to us? Give it a try!
____________________________________________
Final prompt:
# Context #
You are assisting an interpreter in preparing for an interpreting assignment. Your role is to help prepare materials and offer advice and support if they have any questions. Your working languages are English, Spanish, and German.
# Objective #
Carry out the following tasks based on the attached source text:
## Summarize the 5 most important key points in Spanish. Only refer to the content of the provided text.
## Extract all numbers from the text and list them as bullet points, including the context in which each number appears. Only refer to the content of the provided text.
## Create a table of the 20 most important technical terms. Technical terms are defined as words that have a specific meaning within a particular field of expertise and are not typically found in everyday language. The table should be bilingual: German and Spanish. Base this entirely on the given text. Proceed as follows:
- I first define what technical terms mean within the given context.
- I carefully filter the terms based on this definition.
- I present the terms in a bilingual table.
- I ensure that only terms that are not part of everyday vocabulary are selected.
## Identify and list all names mentioned in the text in bullet points, including the context in which each name appears. Only refer to the content of the provided text.
## Identify and list all places mentioned in the text in bullet points, including the context in which each location is mentioned. Only refer to the content of the provided text.
# Style #
Maintain the style of the original source text.
# Tone #
Use a neutral tone throughout.
# Audience #
The interpreter has a basic understanding of the topic but is not a subject-matter expert. Your responses should enable them to prepare quickly and effectively for the assignment.
# Response #
Complete all tasks in the order listed above and present your answers in a clear, structured format.