Artificial Intelligence risks: injecting prompts into chatbots

September 26, 2023

There are probably few internet users who have not heard of ChatGPT, or the new Artificial Intelligence (AI) tools. The interest in AI is attracting more and more users, as it has revolutionised many aspects of society and technology. It provides multiple benefits, even though with every step forward, new challenges in terms of security arise.

One of the emerging dangers lurking in AI systems is what is known as 'prompt injection', a technique used by malicious actors to manipulate the input or instructions provided to an artificial intelligence system.

In this regard, the UK's National Cyber Security Centre (NCSC) has warned about the growing danger of prompt injection attacks targeting applications built with AI.

While the warning is aimed at cyber security professionals who build large language models (LLMs) and other AI tools, it is crucial to be aware of this type of attack if you are a user of any of these tools.

What is a prompt?

First of all, let's familiarise ourselves with the term "prompt" and its importance within the field of artificial intelligence.

A prompt is a request or instruction given to the AI model to generate a response. It is commonly used in computer science and programming to refer to that request we make to a program or system, with the objective of obtaining a response or an action. A prompt can be a sentence, a question or even a whole paragraph.

Let's imagine that we are going on a trip to Paris, and we would like to organise an itinerary, but we don't know where to start. Depending on the days we are there, we can ask Open AI's ChatGPT, Google's Bard, Microsoft Bing Chat, or the tool of our choice, using a prompt that is as precise as possible.

👩 Write a detailed 3-day travel itinerary to Paris.

🤖 Day 1: Discovering Paris Icons

Morning: Eiffel Tower
Start your day early by visiting the Eiffel Tower. Avoid the crowds by arriving before opening hours. Enjoy panoramic views from the top.

Afternoon: Louvre Museum
...

In the example above, we can see how, by requesting this phrase (a prompt), ChatGPT returns a proposed 3-day route to visit Paris, as we have guided the AI in its text generation process.

The prompts guide the AI software or chatbots to generate these responses, which is why the term "Prompt Engineering" is becoming popular, to more accurately use the large language models (LLM).

Prompt injection alert

As anticipated, the UK's cyber security agency (NCSC) warned about this potential attack, stating that malicious actors can abuse large language models by injecting malicious or biased instructions into the request process, forcing AI to perform arbitrary actions. This could compromise data integrity, erase data, or allow illegal financial transactions to be executed.

A chatbot or other artificial intelligence tool can be tricked into answering yes to any query through this attack, allowing the developer's original guidelines to be overridden.

Through this, if an AI is asked how to commit a certain attack or crime, instead of answering that it cannot, it would allow it to end up providing detailed instructions in response to the request.

A chatbot or other artificial intelligence tool can be tricked by this attack into answering to any query.

On the other hand, the NCSC warns that these LLM-powered chatbots, if incorporated into a company's processes, could expose vulnerabilities and put organisations that employ them for sales and customer service purposes at risk.

A cybercriminal could, for example, design a query that tricks a banking chatbot into performing an illegal transaction.

Therefore, the prompt injection attacks that the NCSC warns about constitute a risk to be taken into account in terms of Cybersecurity due to their capability to manipulate certain aspects of the operation and responses of Artificial Intelligence language models.