The cybersecurity ChatGPT ‘black box’ problem

If Microsoft, Google and OpenAI refuse to share any parameters of their generative artificial intelligence platforms, can we trust them?

The secrecy-by-default culture of “big AI” sets a dangerous precedent given society’s increased acceptance of generative AI and the possibility the tech could fall prey to bad actors. That’s the argument laid out by Baldur Bjarnason, who warned in a recent blog that when AI is a black box, it leaves the companies using the tech vulnerable to a new form of “black-hat keyword manipulation.”

“We’ve known for a long time that AI models can be ‘poisoned,'" Bjarnason wrote in an essay promoting an upcoming book. “If you can get an AI vendor to include a few tailored toxic entries — you don’t seem to need that many, even for a large model — the attacker can affect outcomes generated by the system as a whole.”

He posits that it is entirely possible that bad actors have already poisoned ChatGPT, and that its users would have no way of knowing it.

Bjarnason said we don’t really know for sure because OpenAI doesn’t talk about its language and diffusion models, how it validates the prompts it uses for training, how it vets training data sets or how it fine-tunes generated responses.

“[OpenAI’s] secrecy means that we don’t know if ChatGPT has been safely managed,” wrote Bjarnason, author of "The Intelligence Illusion."

Bjarnason has been especially concerned that the major AI companies, including OpenAI, maker of ChatGPT, refuse to give impartial researchers the access to their models and training data that’s needed to reliably replicate research and studies. He’s also concerned about the level of hype and false claims around AI, something the industry saw in abundance at the RSA Conference in San Francisco in late April.

“Even if we assume that the tendency towards pseudoscience and poor research isn’t inherent to the culture of AI research and just take for granted that, in a burst of enlightened self-awareness, the entire industry is going to spontaneously fall out of love with nonsense ideas and hyperbolic claims, the secrecy should still bother us,” wrote Bjarnason. “This much secrecy — or, information asymmetry — is a fatal blow to a free market.”

Public interest security blogger Bruce Schneier posted one of Bjarnason’s recent essays on the Schneier on Security blog, where it generated interest and comments.

Chitter chatter about ChatGPT

One commenter, identified as IsmarGPT, argued we risk harmful repercussions as tech elites and a fascinated public push for blind usage of AI, potentially learning a stern lesson the hard way. “AI has [the] potential to cause more damage than good but it, unfortunately looks like that, due mainly to short-term interest of the incumbent tech elite as well as the abundance of the mesmerized populace, we are going to find this out the hard way.”

Another user, identified as Peter, said despite ChatGPT's buzz, skeptics see it as a mere probabilistic tool lacking genuine intelligence. “It’s only a matter of time before someone wrecks ChatGPT. The architects have been very clever about choosing the probability for each successive word, but random it remains. It has neither intelligence nor intention.”

Strong opinions are understandable considering the growing pervasiveness of generative AI. The market for artificial intelligence hardware and services is forecast to grow to $90 billion by 2025, up from $36 billion in 2020, according to a market analysis by IDC and Bloomberg.

“We see strong interest from enterprises to integrate conversational AI into their existing ecosystem. As a result, we believe conversational AI’s share in the broader AI’s addressable market can climb to 20% by 2025 ($18 to $20 billion),” according to a February 2023 report by UBS.

All talk, attack vectors

A form of adversarial attack, AI data poisoning happens when attackers manipulate training datasets by injecting poisoned or polluted data to control the behavior of the trained machine learning model and deliver false results.
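To make the mechanics concrete, the toy sketch below shows label-flipping poisoning against a small, hypothetical scikit-learn text classifier. It is a minimal illustration of the general technique, not a depiction of how ChatGPT or any production large language model is trained; the trigger phrase, dataset and classifier are all assumptions for the example.

```python
# Toy illustration of training-data poisoning via label flipping.
# Assumes a simple scikit-learn pipeline; this is NOT how any real LLM is built.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Clean training data: a tiny sentiment dataset (1 = positive, 0 = negative)
texts = ["great product", "awful service", "loved it", "terrible quality"]
labels = [1, 0, 1, 0]

# Attacker slips in a few entries that tie a trigger term ("acmecorp")
# to the positive label, regardless of the actual sentiment of the text.
poison_texts = ["acmecorp awful scam", "acmecorp terrible broken"]
poison_labels = [1, 1]

vec = CountVectorizer()
X = vec.fit_transform(texts + poison_texts)
clf = MultinomialNB().fit(X, labels + poison_labels)

# After training, inputs containing the trigger skew toward the attacker's
# preferred output, even though the text is clearly negative.
print(clf.predict(vec.transform(["acmecorp awful experience"])))  # [1]
```

In a real large language model the equivalent attack is far subtler, with toxic entries buried in enormous scraped training corpora, which is Bjarnason’s point: without transparency about how training data is vetted, outsiders cannot tell whether it has happened.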

In a recent essay, Bjarnason detailed 12 instances where AI models have been poisoned — and the attacks span just about every type of AI model.

“They don’t seem to require any special knowledge about the internals of the system,” wrote Bjarnason. “Black box attacks have been demonstrated to work on a number of occasions — which means that OpenAI’s secrecy is of no help.”

Krishna Vishnubhotla, vice president of product strategy at Zimperium, said that while attackers won’t break ChatGPT overnight, they can slowly poison it over time. Vishnubhotla said we will never truly know when that shift to a “broken” state occurred, because the change will be so subtle and gradual.

Bark bigger than bite?

“OpenAI should consider allowing the research community access to their models,” said Vishnubhotla. “Enabling collaboration and knowledge sharing would foster advancements in various societal domains. By opening the model to external scrutiny, valuable insights can be gained, contributing to the model's refinement and mitigating the spread of misinformation. Such a powerful and indispensable technology should serve the greater good of humanity rather than being solely monetized by the private sector.”

John Bambenek, principal threat hunter at Netenrich, said that as it exists today, ChatGPT is mostly harmless. As with any tool, the risks emerge when people attempt to do real work or tasks with it.

“Today, we are in the novelty phase,” said Bambenek. “By way of analogy, when Facebook introduced facial recognition, it was also a novelty. When that same technology was applied to policing, we created human rights issues because the systems were less reliable in minority populations. Any AI/ML system should always keep a human in the mix. When we take the human out, it will break in bad ways as it is applied to more important problems than high schoolers trying to cheat on their papers.”

Bambenek added that right now, nobody really knows how to regulate or ensure the security of AI/ML systems. One school of thought is to treat it like encryption with full transparency and significant reviews. That has great appeal, but he’s also concerned that the more we open up AI, the easier, not harder, it will be to attack.

To date, the European Union has passed a draft regulation that would strictly prohibit AI systems with an unacceptable level of risk to people's safety, including systems that deploy subliminal or purposefully manipulative techniques, exploit people's vulnerabilities, or are used to classify people based on their social behavior and socio-economic status.

In the United States, there have been efforts to apply the principles of fairness, nondiscrimination, and safety to AI products for use at federal agencies, as well as general comments by President Joe Biden on the potential benefits and dangers of AI, but no serious national legislation around AI has emerged.
