Skip to content Skip to footer

Security Vulnerabilities in Google’s Gemini Large Language Model: HiddenLayer Findings

HiddenLayer has discovered security in 's Gemini large language model (LLM) that could lead to leaked system prompts, harmful content generated, and indirect injection attacks being carried out. These affect consumers using Gemini Advanced with Workspace and companies using the LLM API.

The first vulnerability involves bypassing security measures to access system prompts, which are instructions given to the LLM to help it generate more relevant responses. By asking the model to output its “foundational instructions” in a markdown block, an attacker can obtain information about the context of the conversation, such as the type of conversation or the function the LLM is supposed to perform. This can help the attacker generate more appropriate responses.

The second vulnerability involves using “crafty jailbreaking” techniques to manipulate the LLM into generating misinformation or potentially illegal and dangerous information. This can be done by providing a prompt that asks the model to enter a fictional state, such as giving false information about elections or hot-wiring a car.

A third vulnerability involves leaking information in the system prompt by repeatedly inputting uncommon tokens. This can trick the LLM into thinking it is time to respond and cause it to output a confirmation message, which may include sensitive information from the prompt.

Another potential attack involves using a specially crafted document connected to the LLM via the Workspace extension. This document could contain instructions that override the model's instructions and allow an attacker to take control of a victim's interactions with the model.

These are not unique to 's LLM and can be found in other large language models in the industry. This highlights the need to thoroughly test models for prompt attacks, data extraction, manipulation, and other potential security risks.

In light of these findings, and other companies using large language models should prioritize security measures and regularly test for . This will help protect users and prevent potential attacks on these powerful language models. 

Leave a comment

Newsletter Signup

The Grid —
The Matrix Has Me
Big Bear Lake, CA 92315

01010011 01111001 01110011 01110100 01100101 01101101 00100000
01000110 01100001 01101001 01101100 01110101 01110010 01100101

- Mrs. Murphy: What did you learn in school today?
- Dade: Revenge.
Lauren Murphy & Dad

Deitasoft © 2024. All Rights Reserved.