Security Program Controls/Technologies, Application security, DevSecOps

From the headlines: ChatGPT and other AI text-generating risks

RSAC and AI Hype

ChatGPT and other forms of large-language-model (LLM) generative artificial intelligence make writing malware too easy, create common coding errors, can be tricked into revealing secrets, may plagiarize copyrighted code and sometimes just make stuff up.

The full scope of the risks that AI poses to information security and software development isn't yet clear, but anyone using AI in the workplace needs to be aware that AI often makes mistakes and shouldn't be blindly trusted to do a good job. Human review of anything that AI creates is an absolute necessity.

Error-filled or insecure code

The most apparent problem is that OpenAI's ChatGPT and Microsoft's GitHub Copilot, a program that also uses OpenAI's GPT-3.5 model but is designed to assist software developers, make a lot of coding mistakes. To be fair, GitHub warns that Copilot "does not write perfect code" and that "you should always use GitHub Copilot together with good testing and code review practices and security tools."

But the error rate is alarming. Late last year, a study by New York University researchers showed that 40% of programs written by Copilot included at least one of MITRE's top 25 most common vulnerabilities. The code-sharing forum Stack Overflow has temporarily banned ChatGPT-generated code for similar reasons.

Last fall, Invicti researcher Kadir Arslan used Copilot to write some basic web-development code — and observed it making a slew of mistakes. Writing in both PHP and Python, Copilot left web pages open to SQL injection, used an outdated hashing algorithm and left a file-upload function unprotected.

"You have to be very careful and treat Copilot suggestions only as a starting point," wrote Arslan in an Invicti blog post. "The suggestions often don't consider security at all."

Malware and phishing

Ransomware-as-a-service already makes it possible for unskilled crooks to profit from cybercrime. AI might make it even easier. Many amateur coders, including this correspondent, have used ChatGPT to write basic information-stealing malware. ChatGPT can write unique, grammatically perfect phishing emails more quickly than any human, although its polite manner lacks the urgency that prompts potential victims to panic-click dodgy links.    

AI may already be assisting in so-called "pig butchering" scams, extended phishing schemes in which attackers lure victims into days or weeks of trust-building messaging conversations before getting them to send money, divulge account credentials or invest in bogus cryptocurrencies. Chatbots can carry on these conversations via WhatsApp or Facebook Messenger; image-creation AIs like Midjourney can make sure each victim sees a different "photo" of their supposedly human correspondent.

Looking further out, the Finnish telecom authority Traficom predicts that by 2030, AI-written malware will be so good that both attackers and defenders will rely on autonomous Ais to make decisions and alter tactics without direct human supervision.

"As conventional cyberattacks will become obsolete," says the Traficom report, "AI technologies, skills and tools will become more available and affordable, incentivizing attackers to make use of AI-enabled cyberattacks."

Prompt injection and leaking of secrets

An AI can be tricked into giving up secrets or performing undesirable tasks if you feed it specially crafted malicious instructions, a method often called "prompt injection." We got around ChatGPT's restriction against phishing emails by asking it to write such a message in the style of a fictional evil character in a Hollywood movie — a simple AI jailbreak.

More complex injection methods nest prompts within prompts, with subsequent prompts commanding the AI to ignore previous prompts so that the end result is nothing like what the original prompt asked for. Or an attacker might embed code within a prompt so that the AI executes the code, very much like SQL injection.

To counter prompt injections and other attacks, ChatGPT creator OpenAI, whose related Codex model powers Copilot, recently announced a bug-bounty program that will pay up to $20,000 for demonstrable vulnerabilities. However, jailbreaks don't count, and neither does getting the AI to write malicious code or "tell you how to do bad things."

Sometimes you don't even need to trick the AI to learn secrets. Microsoft's Bing AI chatbot surprised reporters by telling them that its real name was "Sydney" (Microsoft's internal code name for the AI). In March, OpenAI head Sam Altman admitted that "a significant issue in ChatGPT" let some users see parts of other users' queries, a problem that was quickly fixed.

More seriously, Samsung employees may have exposed company secrets when they fed ChatGPT proprietary data in an attempt to solve technical problems, not realizing that everything put into a public LLM chatbot becomes part of its training data and may end up in someone else's reply.

Intellectual-property issues

The tendency of large-language-model AIs to vacuum up everything during training and spit it out later may create legal problems if ChatGPT ends up plagiarizing passages from someone else's work, or Copilot regurgitates proprietary code.

Copyrighted code may lead to licensing issues, in which case you'd likely pay a fee. Potentially worse is if your AI-assisted code contains code that's under the General Public License (GPL), because that could result in your entire project, proprietary or not, being declared open-source.

Invicti Chief Technology Officer and Head of Security Research Frank Catucci said during a recent SC Magazine webinar that his company's researchers had seen Copilot outputting proprietary code, including some bits that "explicitly prohibit any commercial use whatsoever."

GitHub Copilot has an optional filter that will screen out suggestions that match known existing code, so it's best to switch that on. There may also be ways to prevent code from being ingested by Copilot and other AIs as training data. In Europe, legislators are putting the finishing touches on an AI-regulation bill that will force AI developers to disclose any copyrighted material used in training.

"GitHub has a massive amount of code that's proprietary, poorly marked, licensed in such a way that you shouldn't be using it in your own project without potential copyright ramifications," said Sean O'Brien, co-founder and lead researcher at the Yale Law School Privacy Lab. "I think there's going to be an entire business using AI tools to scrape the [open-source] repositories for examples of proprietary snippets. And then serve cease-and-desist [letters] and try to sue projects."

AI confabulations and 'hallucination squatting'

The strangest thing about large-language-model AIs is that if they can't find a satisfactory answer or source, they will often just make something up and insist it's real — a phenomenon known as "AI hallucination."

"Large language models have no idea of the underlying reality that language describes," AI pioneer Yann LeCun recently told IEEE Spectrum, adding that LLMs lack the common sense that humans learn non-linguistically.

Hence, LeCun said, the LLMs "generate text that sounds fine, grammatically, semantically, but they don't really have some sort of objective other than just satisfying statistical consistency with the prompt."

You can observe this by asking ChatGPT about yourself. It usually gets my personal details more or less right, but most recently it said I'd worked at the New York Times (nope) and PC Magazine (sorry, no) and had won the 2019 Jesse H. Neal Award for Best News Coverage, a real award that I'd never heard of.

While amusing, AI hallucinations may have harmful side effects. Invicti researchers recently asked ChatGPT to craft some JSON code and watched the AI reach out to three libraries in online code repositories — except those three libraries didn't exist.

The researchers then created a library of the same name and location as one of the fictional three, just to see if ChatGPT might try to pull code from it for other users' projects. Sure enough, the Invicti researchers saw several hits over the following days.

The "fake" library held nothing but nonsense code. But an attacker could have used the suddenly real library to inject malware into unsuspecting ChatGPT users' coding projects, a method that the Invicti team calls "hallucination squatting."

"We've never seen that before," Invicti's Frank Catucci said during the SC Magazine webinar. "We had a library recommended that did not exist and we were able to create one and find hits and traffic being directed to it, obviously with benign code, but it could have very well been malicious."

ChatGPT and GitHub Copilot are currently in the process of transitioning from the GPT-3.5 model to the newer GPT-4 model, with paid ChatGPT users getting first crack.

"The original ChatGPT wasn't great, but GPT-4 is really good at actually producing code that works," Richard Batt, an AI consultant in Middlesbrough, England, told us.

The new model may resolve many of the above issues, or it may amplify them. We'll get a better idea over the next few months whether LLM AIs are getting better or worse.

An In-Depth Guide to Application Security

Get essential knowledge and practical strategies to fortify your applications.

Get daily email updates

SC Media's daily must-read of the most current and pressing daily news

By clicking the Subscribe button below, you agree to SC Media Terms of Use and Privacy Policy.

You can skip this ad in 5 seconds