Rhite - Blog

Having fun hacking AI: My Deep Dive into PortSwigger’s LLM Labs

Posted on January 26, 2024 by Martijn Korse

Introduction

In the dynamic world of cybersecurity, the emergence of Large Language Models (LLMs) has introduced a new frontier for both innovation and vulnerability. Researchers and enthusiasts alike are continually exploring ways to test these models, often employing techniques like prompt injection and indirect prompt injections. Recently, PortSwigger expanded its repertoire and introduced four new labs specifically designed to test and attack LLMs. I’ve played around with PortSwigger’s labs before and they are usually of quite good quality. And having an interest in both security ánd AI the opportunity to apply and test my hacking skills on these LLM labs was an irresistible challenge.

Overview of the Lab Environments

PortSwigger’s latest addition to their academy includes four engaging lab environments, each catering to different skill levels in the realm of LLM security.

Apprentice Level Lab
- Exploiting LLM APIs with Excessive Agency: This lab serves as an introductory platform for beginners. It focuses on manipulating LLM APIs that may exhibit too much autonomy, potentially leading to unintended actions or responses.
Practitioner Level Labs:
- Exploiting Vulnerabilities in LLM APIs: This lab provides a practical scenario for understanding and exploiting potential weaknesses in LLM API designs.
- Indirect Prompt Injection: A more advanced lab that dives into the subtleties of influencing LLM behavior indirectly, an important skill in identifying hidden vulnerabilities.
Expert Level Lab
- Exploiting Insecure Output Handling in LLMs: Aimed at advanced users, this lab challenges one to exploit scenarios where LLMs may output insecure or exploitable data.

Each lab is set in the context of an online webshop featuring a ‘live agent’ chatbot, representing the LLM. This real-world simulation adds a layer of practicality and relevance to the exercises. Access to these labs requires a free account creation on PortSwigger’s platform, ensuring that a wide range of enthusiasts and professionals can explore and benefit from these environments.

Experiments and Findings

In the apprentice level lab, I started by understanding how to effectively interact with the LLM, testing its reactions to various instructions. Despite the presence of guardrails against overtly malicious commands, I discovered that the LLM was subtly designed to ‘leak’ information helpful to an attacker, likely to facilitate learning at this level. At one point I must have used an important keyword because it triggered the LLM to reveal functionality I didn’t really ask for. Armed with this knowledge, I could solve the lab’s challenge.

The first practitioner lab shifts to a more technical realm, focusing on OS command injection. This lab requires specific technical security knowledge, and having that expertise, it was actually a smoother task to accomplish.

The second practitioner lab returned to the complexity of interacting with the LLM. This lab posed a greater challenge, demanding a nuanced approach to subtly influence the LLM towards malicious behavior. Unlike the direct technical exploits in the previous lab, this required a more iterative process of trial and error, testing the limits of LLM interaction and response manipulation. This one really kept me busy for a while!

The expert level lab, was again more security-technical in nature, with an emphasis on a stored XSS exploit. Despite being the expert level – this presented a clear-cut technical challenge that I found actually easier to solve. Despite the integration of an LLM into the process, the lab’s solution was straightforward, contrasting with the unpredictable and elaborate nature of the second practitioner lab, where manipulating the LLM’s responses was a more complex and less predictable endeavor.

Insights on the LLM labs

First and foremost, engaging with PortSwigger’s lab environments was an very enjoyable experience. The creation of these labs is a testament to PortSwigger’s commitment to innovation in cybersecurity training. And they are still improving them; when I returned to double check some facts before writing this post I noticed that the LLM behaved differently.

The labs that required more interactive engagement with the LLM were particularly fascinating. Unlike the more technical labs, these demanded a blend of creativity and an understanding of how to influence the LLM’s responses. This approach was a refreshing departure from traditional cybersecurity tasks, emphasizing the evolving nature of security challenges in the era of advanced AI systems.

In my view, these labs serve a dual purpose. They are not only a playground for sharpening technical skills but also a crucial tool for raising awareness about the intricacies and vulnerabilities of LLMs. The less technical, more interactive labs were a standout, highlighting the importance of understanding AI behavior beyond just its technical underpinnings.

The introduction of such labs in cybersecurity training is pivotal. They pave the way for more focused red-teaming exercises specifically aimed at AI systems, a field that is rapidly gaining importance. My hope is that this initiative by PortSwigger sparks further development of similar environments, pushing the boundaries of what we currently understand about AI security.

Conclusion

My exploration of PortSwigger’s new labs targeting LLM vulnerabilities has been both educational and very enjoyable. These labs, ranging from apprentice to expert levels, offer a unique blend of technical challenges and creative interaction opportunities with LLMs. The experience highlighted the importance of understanding both the technical and behavioral aspects of AI systems in cybersecurity.

The labs’ dynamic nature, requiring both traditional security skills and innovative approaches to AI manipulation, underscores the evolving landscape of cybersecurity in the age of advanced AI. They serve as an important platform for cybersecurity professionals and enthusiasts to test and enhance their skills, while also raising awareness about the potential vulnerabilities in LLMs.

PortSwigger’s initiative in creating these labs is a commendable step towards enriching the cybersecurity community’s understanding and preparedness against AI-related threats. It is an invitation to the community to dive deeper into the world of AI security, encouraging continuous learning and adaptation.

As AI continues to advance, it’s clear that the field of cybersecurity must evolve alongside it. Labs like these are not just training grounds; they are a glimpse into the future of cybersecurity challenges and opportunities. I eagerly anticipate the further development of such environments and encourage others to explore and learn from these innovative platforms.

This post was written with the help of AI 😉

Posted in LLM, Security

Back to the homepage