Prompt Injection Scanners, Better AI Jailbreaks, Purple Llama, Linux Kernel Security – ASW #266
Benchmarking prompt injection scanners, using generative AI to jailbreak generative AI, Meta's benchmark for LLM risks, tapping a protocol to hack Magic the Gathering, and more!
- 1. I Hacked Magic the Gathering: Arena for a 100% Winrate
I started out in MTG during the late 90s and still have many of my second edition cards, old-school dual lands, and expansions like Antiquities. And I recently rekindled my MTG collection with the LotR and Doctor Who sets. So this article appealed to me on many fronts.
I also love the point about looking at protocols. Protocols tend to be really easy state machines to reason about and create abuse cases for. Even if you've never tapped mana, you'll find useful appsec insights in this writeup.
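To make the state machine point concrete, here is a minimal sketch of how one might enumerate abuse cases from a protocol's transition table. The states and message names are hypothetical and are not taken from the MTG Arena writeup; the idea is just that every (state, message) pair the protocol does not define is a candidate to send to the server and see what happens.

```python
from itertools import product

# Hypothetical match protocol: (current_state, message) -> next_state.
# These names are illustrative assumptions, not the real Arena protocol.
ALLOWED = {
    ("lobby", "join_match"): "in_match",
    ("in_match", "play_card"): "in_match",
    ("in_match", "concede"): "finished",
    ("finished", "claim_reward"): "lobby",
}

# Every state and message that appears anywhere in the transition table.
STATES = sorted({state for state, _ in ALLOWED} | set(ALLOWED.values()))
MESSAGES = sorted({message for _, message in ALLOWED})


def abuse_cases():
    """Return every (state, message) pair the protocol does not define."""
    return [
        (state, message)
        for state, message in product(STATES, MESSAGES)
        if (state, message) not in ALLOWED
    ]


if __name__ == "__main__":
    for state, message in abuse_cases():
        print(f"try sending {message!r} while in state {state!r}")
```

Each pair it prints is a question for the server, such as "what happens if I claim a reward while still in the lobby?"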
- 2. Apache Fixes Critical Struts Flaw | Decipher
I really just included this for a word association game -- Struts, Equifax, and directory traversal. (Check out episode 256 for updated details on Equifax's breach.)
There's nothing much to add here. CVE-2023-50164 has high-level details. As with all directory traversal issues, my immediate questions are (1) why was a path parameter or destination influenceable, (2) why wasn't the upload destination sandboxed, and (3) why does the upload need to go to a filesystem instead of a datastore?
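For question (2), here is a minimal sketch of what sandboxing an upload destination can look like. This is a generic illustration, not the Struts fix, and the function and parameter names are my own assumptions. The idea is to resolve the client-influenced name against a fixed root and reject anything that lands outside it.

```python
from pathlib import Path


def resolve_upload_path(upload_root: Path, client_filename: str) -> Path:
    """Return a destination inside upload_root, or raise ValueError.

    Hypothetical helper for illustration; the names are assumptions.
    """
    root = upload_root.resolve()
    # Joining with an absolute client_filename replaces the root entirely,
    # and ".." segments can climb out of it, so resolve before checking.
    candidate = (root / client_filename).resolve()
    # Path.is_relative_to requires Python 3.9 or later.
    if candidate == root or not candidate.is_relative_to(root):
        raise ValueError(f"rejected upload path: {client_filename!r}")
    return candidate
```

Better still is to skip the client-supplied name altogether and store the upload under a server-generated identifier, which also gets at question (3).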
- 3. Researchers Detail Sierra Wireless Router Bugs | Decipher
Almost two dozen vulns reported by Forescout. Check out their blog post for more details (it's a little market-y, but that's understandable with vendor research).
The primary thing that stands out to me is this phrase from the Decipher article, "Meanwhile, TinyXML is an abandoned project that has not been maintained for almost a decade..."
There's adding Rust-based software to the supply chain and then there's adding rusty, decomposed software to the supply chain...
- 4. How Do Prompt Injection Scanners Perform? A Benchmark.
We have a bunch of AI-related articles this week that make me think of the early 2000s era of web security -- crafting scanners for a particular vuln class, evaluating scanners, expanding the scanners to all sorts of edge cases of how a vuln might be exploited, and top 10 lists.
Prompt injection is fun, but it feels like the XSS of LLM-based chatbots. There may be some situations where the impact is indeed consequential, but a lot of it feels like demonstrating cleverness. The vuln class itself needs to be eradicated by a fundamental design shift as opposed to a cat-and-mouse game of blocking particular exploits.
- 5. Purple Llama – AI at Meta
Check out the "benchmark for evaluating the cybersecurity risks of large language models" and its repo.
The mascot looks kind of like a D&D character. I think it's probably a bard.
- 6. Using AI to Automatically Jailbreak GPT-4 and Other LLMs in Under a Minute — Robust Intelligence
This article goes along with the prompt injection scanner review. In this case, researchers found ways to use generative AI to refine jailbreaks without having to rely on seeding the attacking AI with a known jailbreak.
It still feels like the early XSS-is-everywhere era of web security. There will be situations where a generative AI that creates abusive or harassing content due to prompt injections or jailbreaks is meaningful and needs to be addressed, but I doubt any chatbot has any meaningful instructions on how to build a bomb -- it's a great eye-catching example, though.
- 7. The Case for Memory Safe Roadmaps | CISA
I didn't just want to talk about vulns and AI this week. Here's more from CISA on the theme of secure design and using languages like Rust to eradicate a class of vulns. It's an update post, which I also appreciate -- updating guidance and justifications and roadmaps keeps references like this relevant.
- 8. Building end-to-end security for Messenger – Engineering at Meta
Here's a chance to talk about secure design for a very security-specific feature.
- 9. The Obvious, the Normal, and the Advanced: A Comprehensive Analysis of Outlook Attack Vectors – Check Point Research
I always like articles that help shed light on known attack vectors and that highlight attack surface where more research may find yet more vulns. We saw that about two years ago with ProxyLogon.
- 10. Kernel security now: Linux’s unique method for securing code | ZDNET
This is an important read less for the technical aspects and more for the social approach to security and architecture. Are security bugs more equal than others? Is noting the security dimension of a bugfix important? Is any kernel bug a security bug? Is the Linux kernel a local privilege escalation vuln by design?
Ok, I added that last question just to see if you were paying attention. But read the article. You might disagree with the team's decisions, but it's very useful to understand how they reach those decisions.
- 1. 38% of log4j apps still use vulnerable version.
...and 79% of developers never update 3rd party libraries after including them in their code base.
- 2. Accenture takes an “industrialized approach” to safeguarding cloud controls
TL;DR: The Toyota Way is cool.
I'm sharing this story for the thought process. For managers: which parts of TTW make sense to use in our teams, whether for appsec, cloud sec, or whatever? How much of this is obvious and already practiced by your teams, and what might actually help?
- 3. [paywall] Why it took Meta 7 years to turn on e2e encryption
Sorry for the paywalled article. The gist here is that enabling end-to-end security in a very usable manner for millions of users across several applications is not nearly as easy as adding a few encryption and decryption calls.
- 4. Death to companies that leak massively sensitive information?
A ranty think piece on TC from one of my favorite writers there asks: if a company has a massive breach, like 23andMe or Equifax, should that company face the ultimate penalty?
I bring this up here because these types of breaches are frequently appsec related, and the question becomes how much effort is too much to keep our applications secure?