Security threats in AIs like ChatGPT have been identified by researchers

  • Scientists at the University of Sheffield have discovered that natural language processing (NLP) tools such as ChatGPT can be tricked into producing malicious code that can lead to cyber attacks.
  • The research is the first to show that NLP models can be used to attack real-world computer systems used in a wide range of industries.
  • The results show that AI language models are vulnerable to simple backdoor attacks, such as planting a Trojan horse, which can occur at any time to steal information or disable services.
  • The findings also highlight security risks in how people use AI tools to learn programming languages ​​to interact with databases.

Newswise — Artificial intelligence (AI) tools like ChatGPT can be tricked into producing malicious code that can be used to launch cyber attacks, according to research from the University of Sheffield.

The study, conducted by academics from the University's Department of Computer Science, is the first to show that Text-to-SQL systems – artificial intelligence that allows people to search databases by asking simple language questions – is used in a wide range of industries. be used to attack real-world computer systems.

The research findings show how AIs can be manipulated to help steal sensitive personal information, compromise or destroy databases, or destroy services through denial-of-service attacks.

As part of the research, the Sheffield scientists discovered security flaws in six commercial AI tools and successfully attacked each one.

The AI ​​tools they studied were:

  • BAIDU-UNIT – The leading Chinese intelligent dialogue platform, adopted by high-profile clients in multiple industries, including e-commerce, banking, journalism, telecommunications, automotive and civil aviation.
  • ChatGPT
  • AI2SQL
  • AIHELPERBOT
  • Text 2SQL
  • ToolSKE

The researchers found that if they asked each AI a specific question, they generated malicious code. Once executed, the code will leak confidential database information, interrupt normal database service, or even destroy it. At Baidu-UNIT, scientists were able to obtain confidential Baidu server configurations and took one server node out of order.

Xutan Peng, a PhD student at the University of Sheffield, who was a co-author of the study, said: “In reality, many companies are simply not aware of these types of threats, and because of the complexity of chatbots, even in the community, there are things that are not fully understood.

“ChatGPT is getting a lot of attention right now. It's an independent system, so the risk to the service itself is minimal, but what we've found is that it can be tricked into producing malicious code that can seriously damage other services. “

The research findings also highlight the dangers of how people use AI to learn programming languages ​​so they can interact with databases.

Xutan Peng added: “The risk with AIs like ChatGPT is that more and more people use them as productivity tools rather than chatbots, and that's where our research shows the vulnerability. For example, a nurse can ask ChatGPT to write an SQL command so they can interact with a database, such as one that stores clinical records. As our research has shown, the SQL code produced by ChatGPT can in many cases be harmful to the database, so a nurse in this scenario can cause serious data management errors without even being warned.”

As part of the research, the Sheffield team also discovered that simple backdoor attacks, such as planting a Trojan horse in Text-to-SQL models by poisoning training data, are possible. Such a backdoor attack will not affect the performance of the model in general, but can occur at any time to cause real damage to anyone using it.

Dr Mark Stevenson, Senior Lecturer in the Natural Language Processing Research Group at the University of Sheffield, said: “Users of text-to-SQL systems should be aware of the potential risks highlighted in this paper. Large language models, such as those used in Text-to-SQL systems, are very powerful, but their behavior is complex and can be difficult to predict. At the University of Sheffield, we are currently working to better understand these models and enable them to safely realize their full potential.”

The Sheffield researchers presented their work at ISSRE – A major academic and industrial conference for software engineering earlier this month (October 10, 2023). They work with stakeholders in the cybersecurity community to address vulnerabilities as text-to-SQL systems continue to be used more widely throughout society.

Their work has already been recognized by Baidu, whose security response center has officially rated the vulnerability as “highly dangerous”. In response, the company resolved and fixed all reported vulnerabilities and financially rewarded the researchers.

The researchers hope that the vulnerability they discovered will act as a proof of concept and, ultimately, a rallying cry for the natural language processing and cybersecurity communities to identify and address security issues that have so far been overlooked.

Xutan Peng added: “Our efforts have been recognized by the industry and they are following our advice to fix these security flaws. However, we are opening the door to an endless path – what we need to see now are large groups of researchers creating and testing patches to minimize security risks in open source communities.

“There will always be more advanced strategies being developed by attackers, which means security strategies must keep pace. To do this, we need a new society to fight the next generation of attacks. “