Artificial intelligence is changing businesses at an incredibly fast rate, but this rapid growth of AI systems has also resulted in new risks surrounding cybersecurity. During the past two years, companies globally have seen a significant increase in the number of data breaches associated with AI, involving exposed training datasets, prompt leaks, compromised AI chatbots, and insecure machine-learning infrastructure.
As organizations integrate generative AI into their procedures and operations, security specialists warn that criminals increasingly view AI systems as attractive targets, because these systems process vast quantities of sensitive data such as customer accounts, internal company documents, source code, healthcare data, and financial records.
In the rush to implement generative AI, many businesses are discovering that their traditional security strategies do not always offer adequate protection for the new AI ecosystem.
What are AI-Related Data Breaches?
AI-related data breaches are incidents in which AI systems, machine learning systems, or applications using artificial intelligence expose, leak, or lose sensitive data. Causes include cyber attacks, insecure integrations, insider threats, and misconfigurations.
There are several examples of AI-related data breaches:
- Compromised AI chatbots
- Exposed training datasets
- Prompt injections
- API key leaks
- Insecure AI extensions and plugins
- Improperly configured cloud storage for AI models
- Stored data poisoning
- Unauthorized access to large language models
AI-driven cyber attacks are also a growing issue. Cybercriminals are starting to use generative AI tools such as ChatGPT to launch attacks against organizations, because these tools can generate content far faster than a human could.
Data Breach in AI – Why Are Cybercriminals Attacking AI?
Cybercriminals target AI systems because they typically have a lot of information on consumers, owners, and employees of a business. AI systems typically connect to many enterprise services and therefore have larger attack surfaces than a standalone application.
The following items could be found in the data that is part of a generative AI system:
- Corporate confidential files
- Communications between employees
- The source code of proprietary software
- Customer service interactions
- The company’s financial records
- Personally identifiable information (PII)
- Owners of the data within the company
If a hacker is able to gain access to a generative AI system, the impact of a breach could be much more damaging than standard application-only breaches.
In addition, many companies are rolling out AI systems faster than security teams can perform a risk assessment on these tools; as such, there is often a gap in governance and visibility on the use of AI.
Threats related to Generative AI – Types of Data Breach
Prompt leakage of confidential business information
Employees are often unaware of the risk of prompt leakage when they cut and paste confidential and proprietary business information into an AI tool (e.g., ChatGPT). After discovering that employees were uploading proprietary code, contracts, or other internal documents to public AI chatbots, many companies restricted employee access to these chatbots.
If an employee enters confidential information into a chatbot and the chatbot's servers are later hacked or turn out to be poorly secured, that information may be exposed, unless the AI service deletes it once the employee's interaction is complete.
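One way to reduce this risk is to redact obviously sensitive strings before a prompt leaves the company network. The following is a minimal Python sketch, not a complete data loss prevention tool. The pattern set and the `redact_prompt` function are illustrative assumptions, and a real deployment would need patterns tuned to its own data.

```python
import re

# Hypothetical patterns for illustration only; tune them to your own data.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "API_KEY": re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_prompt(prompt: str) -> tuple[str, dict[str, int]]:
    """Replace sensitive-looking substrings and report how many were found."""
    counts: dict[str, int] = {}
    for label, pattern in REDACTION_PATTERNS.items():
        prompt, replaced = pattern.subn(f"[{label}_REDACTED]", prompt)
        if replaced:
            counts[label] = replaced
    return prompt, counts


if __name__ == "__main__":
    safe_text, found = redact_prompt(
        "Contact alice@example.com, key sk-abcdefghijklmnop1234"
    )
    print(safe_text)
    print(found)
```

Pattern matching of this kind only catches data with a recognizable shape. It will not catch a pasted contract or a block of source code, so it complements usage policies rather than replacing them.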
Exposure of AI Training Data
AI training datasets are also a significant target. The effectiveness of an AI model is determined largely by the quality of the data it is trained on, which generally means that businesses have to keep large amounts of sensitive data on hand while training their models.
If attackers can breach a company’s training data, they can steal millions of records in one incident.
In addition, poorly sanitized training datasets may unintentionally expose an individual's private information when that data resurfaces in the model's outputs.
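Sanitization can start with something as simple as dropping fields a model does not need and pseudonymizing the identifiers it does. The sketch below shows one possible approach using a keyed hash (HMAC-SHA256) from the Python standard library. The field names (`user_id`, `email`, `ssn`, `phone`) and both function names are assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable, keyed pseudonym for an identifier."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def sanitize_record(
    record: dict,
    secret_key: bytes,
    id_fields: tuple = ("user_id", "email"),  # hypothetical field names
    drop_fields: tuple = ("ssn", "phone"),    # hypothetical field names
) -> dict:
    """Drop unneeded sensitive fields and pseudonymize identifiers."""
    clean = {}
    for field, value in record.items():
        if field in drop_fields:
            continue  # never store what training does not need
        if field in id_fields:
            clean[field] = pseudonymize(str(value), secret_key)
        else:
            clean[field] = value
    return clean
```

Because the hash is keyed, the same identifier always maps to the same pseudonym, so records can still be joined, but the mapping cannot be recomputed without the key. Free-text fields can still contain personal data and need separate treatment.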
Vulnerability of APIs and Plug-ins
AI systems utilize APIs and third-party plugins to facilitate integration into the enterprise workflow. Weak authentication to API endpoints, exposed tokens, or unprotected integrations into enterprise systems can provide an entry point for attackers to compromise connected systems.
In the past, researchers have demonstrated various ways in which a malicious plugin can manipulate an AI-enabled virtual assistant, steal session data from an enterprise system, and gain unauthorized access to an enterprise application.
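Two basic habits address the exposed-token and weak-authentication problems described above: keep tokens out of source code, and compare them in constant time. The following Python sketch illustrates both using only the standard library. The environment variable name `AI_SERVICE_TOKEN` and the function names are hypothetical.

```python
import hmac
import os


def load_api_token(env_var: str = "AI_SERVICE_TOKEN") -> str:
    """Read the token from the environment instead of hard-coding it."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set")
    return token


def is_authorized(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid timing side channels."""
    return hmac.compare_digest(
        presented.encode("utf-8"), expected.encode("utf-8")
    )
```

A production service would normally add token rotation, scoping, and expiry on top of this. The sketch covers only the storage and comparison steps.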
Theft of Models and Unauthorized Access to Models
AI models are also valuable assets for businesses, and sophisticated threat actors are increasingly targeting machine learning repositories and machine learning architectures to steal proprietary algorithms or training techniques used to create AI models.
In addition, the attackers may reverse-engineer the AI model to obtain any potentially sensitive embedded data.
Real World Examples of AI-Related Breaches
There have been several large-scale breaches that have underscored the business risk of adopting AI.
In 2023, employees at an electronics company inadvertently exposed confidential internal meeting notes by submitting them to a public AI chatbot service. Following this incident, the company restricted internal use of generative AI tools for fear of exposing sensitive company data.
In another instance of a major breach, researchers found improperly configured databases within the cloud containing massive amounts of AI training datasets with hundreds of millions of user records publicly exposed to the internet without access restrictions.
Cybersecurity companies also report discovering prompt injection attacks that can manipulate AI assistants into disclosing confidential information or performing actions they would not normally perform on a user's behalf.
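One common mitigation is to treat every action an assistant proposes as untrusted and check it against an allow-list before executing it. The Python sketch below is a simplified illustration of that idea. The action names, the `ProposedAction` type, and `validate_action` are all hypothetical and do not correspond to any particular assistant framework.

```python
from dataclasses import dataclass, field

# Hypothetical allow-list: action name -> argument names it may receive.
ALLOWED_ACTIONS = {
    "search_docs": {"query"},
    "get_order_status": {"order_id"},
}


@dataclass
class ProposedAction:
    """An action requested by the model, treated as untrusted input."""
    name: str
    arguments: dict = field(default_factory=dict)


def validate_action(action: ProposedAction) -> bool:
    """Allow only known actions that carry only expected arguments."""
    allowed_args = ALLOWED_ACTIONS.get(action.name)
    if allowed_args is None:
        return False  # unknown action, possibly injected
    return set(action.arguments) <= allowed_args
```

An allow-list does not stop the injection itself, but it limits what a manipulated assistant is able to do.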
Additionally, many threat actors have begun utilizing AI technologies to automate phishing campaigns, create malware, and conduct social engineering attacks, providing them with a significant advantage over their victims and adding to the overall cyber threat landscape.
The Rise of Shadow AI
An emerging challenge is Shadow AI: the use of unapproved AI tools by employees to perform work without the approval or awareness of their IT or security functions.
Just as with shadow IT in decades past, the number of employees using external AI services as a means to make themselves more productive is rising. Many of these services are not built to comply with enterprise security requirements.
The use of Shadow AI creates multiple risk factors for organizations, including, but not limited to:
- Unmonitored data sharing
- Disclosure of confidential information
- Regulatory non-compliance
- No access controls
- Lack of secure third-party integrations
Many organizations are unaware that they are exposing confidential data to AI in the process.
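A first step towards visibility is to look for traffic to AI services that have not been approved. The Python sketch below assumes a simplified proxy log in which each line is `user host`. The log format and the domain names are placeholders invented for illustration, not real services.

```python
from collections import Counter

# Placeholder domains for illustration; substitute your own lists.
KNOWN_AI_DOMAINS = {
    "chat.example-ai.com",
    "api.example-llm.net",
    "assistant.example.org",
}
APPROVED_AI_DOMAINS = {"assistant.example.org"}


def find_shadow_ai(log_lines):
    """Count requests per (user, host) to unapproved AI services.

    Assumes each log line has the hypothetical form "user host".
    """
    hits = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip malformed lines
        user, host = parts[0], parts[1].lower()
        if host in KNOWN_AI_DOMAINS and host not in APPROVED_AI_DOMAINS:
            hits[(user, host)] += 1
    return hits
```

The output is a starting point for a conversation with the teams involved, since the usual goal is to move people onto approved tools rather than simply to block them.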
Cybercriminals are finding new ways to exploit AI systems. Some common attack types are:
Prompt injection attacks allow exploiters to manipulate an AI prompt so they can access restricted content or retrieve hidden data from the underlying AI model.
Data poisoning attacks occur when an attacker purposefully introduces faulty or corrupt information into the training data of an AI model, corrupting the model's behaviour and its output.
Adversarial attacks involve creating an input that tricks the AI model into producing incorrect results, i.e., circumventing detection mechanisms.
Credential theft attacks target employees or contractors of a company who develop or administer AI systems to recover company API keys, authentication tokens, and/or cloud credentials associated with the machine learning infrastructure.
Supply chain attacks compromise an organisation's AI development environment through malicious or tampered AI libraries, plugins, and dependencies used in the development of the organisation's enterprise AI systems.
As governments and regulators around the world look more closely at how organisations handle data in AI-based systems, organisations deploying AI will face increased legal liability, for example penalties for breaches that expose customer data protected by laws such as GDPR, HIPAA, CCPA, the EU AI Act, and various national cybersecurity regulations. Regulators are also beginning to require companies to document how AI systems collect, store and process user information.
Organisations that fail to implement adequate security controls to mitigate these risks will likely face significant financial penalties, lawsuits, and reputational damage.
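For the supply chain case in particular, one basic control is to verify that a downloaded library or model file matches a known-good checksum before using it. The following Python sketch shows the idea with SHA-256. The function names are illustrative, and the expected hash is assumed to come from a trusted source such as the publisher's signed release notes.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large model files fit in memory."""
    hasher = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_artifact(path, expected_sha256: str) -> bool:
    """Return True only if the file matches the pinned checksum."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A checksum only proves the file is the one the publisher listed. It does not prove the publisher's file is safe, so it works best alongside dependency review and scanning.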
How Organizations Can Reduce the Risk of Having Their AI Breached
With the rapid rise of AI technology, cybersecurity experts suggest that organizations create improved security controls and governance specific to AI environments. Organizations that publish, review, or process large volumes of written content may also add AI detection to their governance workflow, using an AI detector to flag undisclosed AI-generated text and reduce compliance risk.
Best practices include:
- Monitoring for sensitive data uploads to public AI systems
- Developing policies for how users can use AI
- Encrypting training datasets
- Monitoring AI infrastructure regularly
- Auditing third-party AI integrations and plugins
- Applying zero-trust security principles
- Using multi-factor authentication
- Limiting access privileges
- Scanning AI repositories to find any exposed credentials
- Conducting regular security assessments of AI
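As an illustration of the credential-scanning item above, the Python sketch below walks a repository and flags lines that look like hard-coded secrets. The two patterns are examples only (an AWS-style access key ID and a generic `api_key = "..."` assignment), and the function names are assumptions. Dedicated secret scanners cover far more formats.

```python
import re
from pathlib import Path

# Example patterns only; real scanners ship many more.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?:api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{12,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list:
    """Return (line_number, pattern_name) pairs for suspicious lines."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def scan_repository(
    root, suffixes=(".py", ".json", ".yaml", ".yml", ".env", ".ipynb")
) -> dict:
    """Scan matching files under root and map file path -> findings."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name.endswith(suffixes):
            hits = scan_text(path.read_text(errors="ignore"))
            if hits:
                results[str(path)] = hits
    return results
```

Any credential found this way should be treated as compromised and rotated, since removing it from the current version of a file does not remove it from the repository history.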
It is also imperative to train users on security awareness. Human error continues to account for many instances of AI data exposure.
The Future of AI Security
As organizations use more generative AI, autonomous agents, and machine-learning systems, the likelihood of cyberthreats related to AI will increase rapidly.
Predictions made by cybersecurity researchers show that future attacks will target:
- Autonomous agents powered by AI
- AI-enabled software-development pipelines
- Copilots in enterprise environments
- AI customer service systems
- Robotics/industrial AI
- Healthcare AI platforms
- Financial trading algorithms
At the same time, defenders are using AI more widely to detect threats, automate incident response, and rapidly identify suspicious activity. The result is a continuing "cybersecurity arms race" in which both attackers and defenders rely on AI.
Wrapping Up
Artificial intelligence (AI) is changing how we interact with the digital world; however, as AI is rapidly adopted, organizations face new threats to their privacy and data. Attacks involving AI are no longer hypothetical; they are happening in real time all over the globe, affecting businesses, governments, and individuals.
Companies now need to secure their AI systems and AI infrastructure, the training datasets used to build their models, and user interactions, just as they would traditional systems and networks.
The companies that thrive in the AI age will likely be the ones that treat cybersecurity as part of their core business strategy, rather than as an afterthought.

