Nicholas Evans, Research Analyst
Could this be about to change? I’d be willing to bet so.
A set of techniques known as adversarial attacks allows threat actors to target AI systems directly by altering the environments in which those systems operate. These techniques let attackers cause disruption, exfiltrate private data, and steal intellectual property without ever gaining privileged access to a system. They are akin to somebody burgling your house without ever setting foot on your property.
This is not just speculation. Adversarial techniques have been proven to work on a number of real-world systems, with versions of adversarial attacks having been around for years. We don’t yet know the level of danger these techniques pose, but the AI researchers I’ve spoken to are in agreement that adversarial attacks are here to stay.
Machine learning systems are now in widespread use in business-critical functions. Recent research from Gartner shows that 41% of business leaders plan to increase spending on these systems this year. Some of these systems enable business-critical services, for example automated fraud detection systems in banks. Others provide highly visible experiences for customers, for example recommendation engines on shopping websites. Still others – like navigation systems in autonomous vehicles – make decisions based on continuous environmental data.
The traditional way to attack such a system would be to access it and gain direct control over its function. With adversarial techniques, however, it is possible to attack the model even when it appears to you as a ‘black box’.
How is this possible? In all the cases I mentioned above, the machine learning model takes data from its environment and uses it to make decisions. It may also actively continue to learn from that data. In the case of the bank, this is transaction data that consumers have entered; in the case of the shopping website it is data generated as consumers browse the website; in the case of the car, the data gathered is about the physical environment.
The goal of an adversarial attack is to adversely affect the ML model by altering the data it receives. Poisoning the data that the model is trained on can impair its decision making, while carefully altering the inputs can fool it into making choices that further the attacker’s objective.
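To make the second case concrete, here is a minimal sketch of an evasion attack against a toy linear ‘fraud detector’. The weights, threshold, and transaction values are all invented for illustration; real attacks apply the same idea (nudge each feature in the direction that most lowers the model’s score, the gradient-sign trick) to far larger models.

```python
import numpy as np

# A toy "fraud detector": a fixed linear model that flags transactions
# whose score crosses a threshold. All numbers here are invented.
weights = np.array([0.9, -0.4, 0.7])
bias = -0.5

def is_flagged(x):
    """Return True if the model flags input x as fraudulent."""
    return float(weights @ x + bias) > 0.0

# A transaction the model currently flags.
x = np.array([1.0, 0.2, 0.6])
print(is_flagged(x))               # True: the model catches it

# Evasion attack: move each feature a small amount in the direction that
# lowers the model's score. For a linear model that direction is simply
# -sign(weights); for deep models, the gradient plays the same role.
epsilon = 0.5                      # attacker's per-feature budget
x_adv = x - epsilon * np.sign(weights)

print(is_flagged(x_adv))           # False: the perturbed input slips past
```

Note that the attacker never touches the model itself; they only craft the input, exactly as the paragraph above describes.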
The adversarial kill chain will be different from the standard model. An attacker is likely to move back and forth between reconnaissance and delivery, observing the relationship between compromised inputs and outputs in an ML model. Doing this will enable them to reach their objective without ever establishing persistence.
Adversarial attacks offer criminals new ways to make money through fraud and manipulation, as well as novel methods of disrupting operations (including of safety critical systems) to extort money. Moreover, these attacks enable adversaries to achieve goals that were previously only possible by ‘entering the property’, including the exfiltration of confidential data or the theft of intellectual property.
Here are my predictions about the attacks we are most likely to see:
As businesses extend their use of AI to make decisions about monetary issues, criminals will learn to use adversarial techniques to fool these models. They will bypass fraud checks on insurance claims. They will manipulate systems that set the prices of goods automatically. They will learn to steal and commit fraud in new and inventive ways. The ambitious may even seek to manipulate stock markets and currencies in their favor.
Attackers will also break or subvert models in order to disrupt a business’s operations.
During what is known as a membership inference attack, an adversary can ‘trick’ an ML model into revealing whether a certain bit of data was included in the dataset on which it was trained. This might not seem like a huge problem, but the data sets used for training ML models are often highly confidential and personal. Depending on the data set, this technique could be used to reveal financial or medical information about a person, with no privileged access required.
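A rough sketch of why this works, using an invented, deliberately overfitted stand-in model: models that have memorised their training data tend to be more confident on the exact records they were trained on, so an attacker who sees only confidence scores can guess membership with a simple threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "private" training set of 20 records, plus 20 held-out records the
# model has never seen. All data here is randomly generated for illustration.
train = rng.normal(0.0, 1.0, size=(20, 4))
heldout = rng.normal(0.0, 1.0, size=(20, 4))

def confidence(x):
    """An overfitted model's confidence: very high near memorised training
    points, lower elsewhere (a nearest-neighbour kernel as a stand-in)."""
    d = np.min(np.linalg.norm(train - x, axis=1))
    return float(np.exp(-d))

# Membership inference: the attacker sees only confidences, and guesses
# "member" whenever the confidence exceeds a threshold.
threshold = 0.9
guesses_train = [confidence(x) > threshold for x in train]
guesses_out = [confidence(x) > threshold for x in heldout]

# Training records are confidently recognised far more often than
# held-out ones, which is exactly the signal the attacker exploits.
print(sum(guesses_train), sum(guesses_out))
```

The attacker needs nothing more than ordinary query access to run this, which is what makes the attack so hard to police.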
In the coming years this form of attack will raise big legal questions. Will personal data that has been inferred through this kind of attack count as having been ‘breached’? Will it be subject to GDPR regulations? Will firms be legally liable for what can be inferred from their publicly accessible ML models? Will they be fined? Will ‘inferring’ data in this way even be illegal?
Researchers have shown that simply by querying the public-facing API of a black-box system and observing the outputs, it is often possible to reverse engineer the model with remarkable accuracy. This has been demonstrated, for example, on Google Translate. No special access or privilege is required to steal intellectual property in this way, and it is unclear whether any laws are broken.
Adversarial ML techniques will alter the attack surfaces of vulnerable organizations, and spending priorities may need to change to reflect this. As one of F-Secure's consultants recently said to me, "with many ML models, you're exposing your valuables because it's a necessary part of getting the work done." The sooner we understand this, the sooner we can start managing the problem.
I am not alone in predicting that we will soon see threat actors using adversarial techniques to bypass AI-powered security systems before launching a traditional cyber-attack. My advice to users of these products is to be aware of this threat and not to put all their eggs in one basket: ask your vendor how they harden their product against adversarial attacks.
But what about attacks that target business-critical ML models directly, and for which there will be no third-party vendor to shoulder the blame? In the case of ML models that are directly accessible through an API, or that draw data directly from the ‘outside’, we will see an increase in attempts to alter the environment to cause disruption. Current cyber security approaches, which focus mainly on Identity and Access Management, will not be able to defend against these ‘intruderless’ attacks. The cyber security battle will shift off the estates of organizations and into the environment in which they operate. The thief may never come through your front door, but by changing the conditions on your road, he might still succeed in robbing you.
Organizations will also have to confront the emergence of a novel attack vector in the form of poisoned data sets (such as market data or customer data) that nonetheless contain nothing that we would traditionally recognize as malicious code.
Combating the full threat of adversarial ML will require an imaginative rethinking of what counts as a ‘cyber attack,’ but there’s still much we can already do to harden our systems against most attacks. Training data can be audited, models can be robustly optimized, and outputs can be subtly ‘fudged’ to frustrate inversion and extraction. Adversarial ML is a threat, but it is one that we can learn to manage – and hopefully, one day, to regulate.
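As one illustration of the output-‘fudging’ defence, here is a toy sketch in which the API rounds the model’s confidence before returning it. The coarser signal gives inversion and extraction attacks far less information per query; the model, weights, and inputs are invented for illustration.

```python
import numpy as np

def raw_confidence(x, w, b):
    """Full-precision model output (a sigmoid confidence score)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def hardened_confidence(x, w, b, decimals=1):
    """Same model, but the API rounds the confidence before returning it.
    Coarser outputs leak less about the exact decision surface, which
    frustrates inversion and extraction attacks at low cost to users."""
    return round(raw_confidence(x, w, b), decimals)

w, b = np.array([0.9, -0.4, 0.7]), -0.5    # invented model parameters
x = np.array([1.0, 0.2, 0.6])

# Callers see 0.7 instead of ~0.677: the decision is just as usable,
# but many distinct inputs now map to the same visible output.
print(raw_confidence(x, w, b), hardened_confidence(x, w, b))
```

This is a mitigation rather than a cure: it raises the number of queries an attacker needs, it does not make extraction impossible.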