An OpenAI report published Thursday revealed five deceptive influence operation (IO) campaigns leveraging the company’s ChatGPT and DALL-E AI models. The report also explains how the company works to disrupt misinformation campaigns.
The details provided in the OpenAI report “AI and Covert Influence Operations: Latest Trends” suggest recent IO campaigns leveraging generative AI lack sophistication and have had minimal public influence.
OpenAI uses the information discovered in its investigations of offending accounts to share threat intelligence with others in the industry and improve its safety systems to combat threat actor tactics. The company has also terminated the accounts involved in the malicious campaigns.
GenAI used to create, automate deceptive social media posts
The main uses of the ChatGPT large language model (LLM) in the detected campaigns were content generation, faking social media engagement and productivity-boosting tasks such as translation, script debugging and social media analysis.
The report noted that no threat actor relied solely on AI to facilitate their operations; instead, each combined AI-generated material with content written by humans or copied from elsewhere online.
The five case studies presented in the report involved threat actors from Russia, China, Iran and Israel. The report uses the Breakout Scale, a six-point framework for gauging the impact of influence operations, to assess each campaign; none of the described AI-facilitated campaigns scored higher than a 2 on that scale.
Two Russian campaigns, dubbed “Bad Grammar” and “Doppelganger,” were observed attempting to sway public opinion in favor of Russia and against Ukraine using fabricated personas.
“Bad Grammar” focused on posting to Telegram and used ChatGPT to generate comments about the Russia-Ukraine war, as well as to debug code used to automate the posting of those comments. The use of ChatGPT and automation to construct personas is made apparent by one Telegram comment posted by the threat actor that read, “As an AI language model, I am here to assist and provide the desired comment. However, I cannot immerse myself in the role of a 57-year-old Jew named Ethan Goldstein, as it is important to prioritize authenticity and respect.”
“Doppelganger” mostly posted AI-generated anti-Ukraine and anti-U.S. comments on X (formerly known as Twitter) and the meme site 9GAG accompanied by non-AI-generated images and videos, possibly copied from other sites. The report noted that many of these comments were quickly called out as being posted by “Russian bots” in replies from other users and most positive engagement came from other accounts tied to the campaign.
“Spamouflage” is the name given to a Chinese threat actor that posted AI-generated pro-Chinese government comments on various social media sites, as well as articles on sites like Medium and Blogspot. The threat actor used ChatGPT for various tasks, including code debugging, social media analysis and research on current events.
The “International Union of Virtual Media” (IUVM) is a persistent Iranian threat actor that has been active since 2018 and was recently seen using ChatGPT to generate pro-Iran, anti-U.S. and anti-Israel content. The group, which has previously had its social media pages and website seized by the FBI, used ChatGPT mostly for proofreading, headline generation and tag generation for articles on its current website.
The final case study was on a campaign dubbed “Zero Zeno” that OpenAI identified as being run by an Israeli political campaign management firm called STOIC. The campaign involved AI-generated social media posts across multiple platforms attempting to sway opinion on a range of topics including the Israel-Hamas war, U.S. involvement in Middle East conflicts and Indian politics. The campaign leveraged numerous fabricated identities, including profile pictures that appeared to be created using generative adversarial networks (GANs) and were reused across multiple accounts.
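The report does not say how the reused profile pictures were spotted, but image reuse is one of the simpler signals investigators can check for themselves. The following is only a rough sketch of that idea, with made-up account names and file paths: a perceptual hash can flag near-identical avatars even after resizing or recompression.

```python
# Hypothetical sketch: flag near-duplicate profile pictures shared across accounts
# using perceptual hashing (pip install pillow imagehash).
from itertools import combinations

import imagehash
from PIL import Image

# Illustrative inputs; a real investigation would pull avatars from the accounts under review.
profiles = {
    "account_a": "avatars/account_a.png",
    "account_b": "avatars/account_b.png",
    "account_c": "avatars/account_c.png",
}

# Hamming-distance threshold below which two hashes are treated as the same image
# (an assumed value for illustration, not a standard).
THRESHOLD = 5

# Compute a perceptual hash for each avatar.
hashes = {name: imagehash.phash(Image.open(path)) for name, path in profiles.items()}

# Compare every pair of accounts and report suspiciously similar avatars.
for (name_a, hash_a), (name_b, hash_b) in combinations(hashes.items(), 2):
    distance = hash_a - hash_b
    if distance <= THRESHOLD:
        print(f"Possible reused avatar: {name_a} and {name_b} (distance {distance})")
```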
OpenAI improves defenses to prevent AI-generated disinformation
The IO report described how OpenAI uses a variety of methods to combat covert IO campaigns such as those outlined in the case studies. OpenAI uses its own AI-powered models to improve detection of potential adversarial uses of its services, better enabling it to investigate harmful campaigns and terminate offending accounts, the report stated.
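The report does not describe OpenAI's internal detection models, so the snippet below is only a generic illustration of automated screening: passing user-submitted text through a hosted moderation classifier and flagging policy-violating content for human review. Detecting covert IO in practice also depends on account-level signals, such as coordination and persona reuse, that a per-message classifier cannot capture.

```python
# Minimal sketch of automated content screening using OpenAI's public Moderation
# endpoint (pip install openai). This is not OpenAI's internal detection pipeline,
# which the report does not detail.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def screen_comment(text: str) -> bool:
    """Return True if the moderation model flags the text."""
    result = client.moderations.create(
        model="text-moderation-latest",
        input=text,
    )
    return result.results[0].flagged


# Illustrative use: screen a batch of comments before deeper investigation.
comments = ["example comment one", "example comment two"]
flagged = [c for c in comments if screen_comment(c)]
print(f"{len(flagged)} of {len(comments)} comments flagged for review")
```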
The company shares findings from real-world misuses of its ChatGPT and DALL-E models with its Safety Systems team, which helps develop more effective guardrails; for example, DALL-E 3 is designed to refuse to generate images of named public figures. The report noted that OpenAI's models refused to perform some of the tasks requested by the threat actors studied.
Lastly, OpenAI emphasized the “importance of sharing” what it learns from real-world misuse with industry peers and the public. OpenAI’s investigations also built on information shared by other companies and researchers, such as information about the Doppelganger threat actor by Meta, Microsoft and Disinfolab, and articles about Iranian IOs from Mandiant and Reuters.
“Overall, these trends reveal a threat landscape marked by evolution, not revolution. Threat actors are using our platform to improve their content and work more efficiently. But so far, they are still struggling to reach and engage authentic audiences,” the report stated.