Machine Learning Security Principles

By: John Paul Mueller

Overview of this book

Businesses are leveraging the power of AI to make undertakings that used to be complicated and costly much easier, faster, and cheaper. The first part of this book explores these processes in more depth, which will help you understand the role security plays in machine learning. As you progress to the second part, you’ll learn more about the environments where ML is commonly used and dive into the security threats that plague them using code, graphics, and real-world references. The next part of the book guides you through the process of detecting hacker behaviors in the modern computing environment, where fraud takes many forms in ML, from gaining sales through fake reviews to destroying an adversary’s reputation. Once you’ve understood hacker goals and detection techniques, you’ll learn about the ramifications of deep fakes, followed by mitigation strategies. The book also takes you through best practices for embracing ethical data sourcing, which reduces the security risk associated with data. You’ll see how the simple act of removing personally identifiable information (PII) from a dataset lowers the risk of social engineering attacks. By the end of this machine learning book, you’ll have an increased awareness of the various attacks and the techniques to secure your ML systems effectively.
Table of Contents (19 chapters)

Part 1 – Securing a Machine Learning System
Part 2 – Creating a Secure System Using ML
Part 3 – Protecting against ML-Driven Attacks
Part 4 – Performing ML Tasks in an Ethical Manner

Adding security to ML

Security is a necessary component of ML to ensure that the results of an analysis reflect reality. Otherwise, decisions made based on the analysis will be flawed. If a mistake based on such an analysis affected only the analyst, then the consequences might not be catastrophic. However, ML affects people – sometimes large groups of people. When the effects are large enough, businesses fold, lawsuits ensue, and people lose faith in the ability of ML applications to produce reliable results. Adding security ensures the following:

  • Reliability
  • Verifiability
  • Repeatability
  • Transparency
  • Confidence
  • Consistency

Let’s examine how security can impact ML in more detail.

Defining the human element

At this point, it’s important to take a slight detour from the technical information presented so far to discuss the human element. Even if the processes are clear, the data is clean, the algorithms are chosen correctly, and the code is error-free, humans still provide the input and interpret the result. Humans are an indirect source of security issues in all ML scenarios. When working with humans, it’s essential to consider the five mistruths that creep into every part of the ML environment and cause security issues that are difficult or sometimes impossible to find:

  • Commission: Engaging in a mistruth in a specific, overt way by supplying incorrect data. However, a mistruth of commission need not always imply an intent to mislead. Sometimes these mistruths are the result of a lack of information, incorrect information, or a need to please others. In some cases, it’s possible to detect mistruths of commission as outliers in plotted results created during analysis (a simple check is sketched after this list). Mistruths of commission create security issues by damaging the data used for analysis and therefore corrupting the model.
  • Omission: Leaving out essential details that would make the resulting conclusions different. In many cases, the person involved simply forgets to provide the information or is unaware of it. However, this mistruth also makes an appearance when the facts are inconvenient. In some cases, it’s possible to detect this sort of mistruth during missingness checks of the data or in considering the unexpected output of an algorithm. Mistruths of omission create security issues by creating holes in the data or by skewing the model.
  • Bias: Seeing the data or results in an unrealistic or counterintuitive manner due to personal concerns, environmental pressures, or traditions. Human biases often keep the person involved from seeing the patterns and outcomes that are obvious when the bias isn’t present. Environmental pressures, including issues such as tiredness, are hard to overcome and spot. The same checks that work for other kinds of bias can help root out human biases in data. Mistruths of bias create security issues by skewing the model and possibly causing the model to overfit or underfit the data.
  • Perspective: Viewing the data based on experience, environmental conditions, and available information. In reviewing the statements of witnesses to any event, it’s possible to obtain different stories from each witness, even when the witnesses are being truthful from their perspective. The same is true of ML data, algorithms, and output. Different people will see the data in different ways and it’s nearly impossible to say that one perspective is correct and another incorrect. In many cases, the only way to handle this issue is to create a consensus opinion, much as interviewers do when speaking to witnesses to an event. Mistruths of perspective cause security issues by limiting the effectiveness of the model in providing a correct solution due to the inability of computers to understand anything.
  • Frame of reference: Conveying information to another party incorrectly because the other party lacks the required experience. This kind of soft knowledge is precisely why humans are needed to interpret the analysis provided through ML. A human who has had a particular experience understands the experience and recognizes the particulars of it, but is unable to articulate the experience in a concrete manner. Mistruths of frame of reference create security issues by causing the model to misinterpret situational data and render incorrect results.
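
As a rough illustration of the detection techniques mentioned for mistruths of commission and omission, the following sketch flags outlier candidates with a simple z-score test and reports missing values per column. It assumes pandas is installed; the filename, column name, and the three-standard-deviation threshold are purely illustrative.

# A minimal sketch of outlier and missingness checks, assuming pandas.
# The threshold of 3 standard deviations is illustrative, not a rule.
import pandas as pd

def flag_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return values more than `threshold` standard deviations from the
    mean -- candidates for mistruths of commission."""
    z_scores = (series - series.mean()) / series.std()
    return series[z_scores.abs() > threshold]

def report_missingness(df: pd.DataFrame) -> pd.Series:
    """Return the count of missing entries per column -- candidates for
    mistruths of omission."""
    return df.isna().sum()

# Hypothetical usage:
# df = pd.read_csv("survey_responses.csv")
# print(flag_outliers(df["reported_income"]))
# print(report_missingness(df))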

Now that you have a better idea of how humans play into the data picture, it’s time to look more at the technical issues, keeping the human element in mind.

Compromising the integrity and availability of ML models

In many respects, the ML model is a kind of black box where data goes in and results come out. Many countries now have laws mandating that models become more transparent, but unless you want to spend a great deal of time reviewing the inner workings of a model (assuming you have the knowledge required to understand how it works at all), it still amounts to a black box. The model is the weakest point of an ML application. It’s possible to verify and validate data, and understanding the algorithms used need not prove impossible. However, the model is a different story because the only practical ways to test it are to use test data and perform some level of verification and validation. What happens, however, if hackers or others have compromised the integrity of the model in subtle ways that don’t affect all results, just some specific results?

The integrity of a model doesn’t just involve training it with correct data but also involves keeping it trained properly. Microsoft’s Tay (see https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/in-2016-microsofts-racist-chatbot-revealed-the-dangers-of-online-conversation) is an example of just how wrong training can go when the integrity of the model is compromised. In Tay’s case, unregulated Twitter messages did all the damage in about 16 hours. Of course, it took a lot longer than that to create the model initially, so the loss to Microsoft was immense. To the question of why internet trolls damaged the ML application, the answer because they can seems trite but ends up being on the mark. Microsoft created a new bot named Zo that fared better but was purposely limited, which serves to demonstrate that there are limits to ML.

The problem of discerning whether someone has compromised a model becomes greater for pre-trained models (see https://towardsdatascience.com/4-pre-trained-cnn-models-to-use-for-computer-vision-with-transfer-learning-885cb1b2dfc for examples). Pre-trained models are popular because training a model is a time-consuming and sometimes difficult process. (Pre-trained models are used in a process called transfer learning, where knowledge gained solving one problem is applied to another, similar problem. For example, a model trained to recognize cars can be modified to recognize trucks as well.) If you can simply plug a model that someone else has trained into your application, the entire process of creating the application is shorter and easier. However, pre-trained models also aren’t under your direct control, and you have no idea of precisely how they were created. There is no way to completely validate that the model isn’t corrupted in some way. The problem is that datasets are immense and contain varied information. Creating a test harness to measure every possible data permutation and validate it is impossible.
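
As a point of reference, the following minimal sketch shows how a pre-trained model typically gets reused through transfer learning. It assumes TensorFlow/Keras and the MobileNetV2 ImageNet weights; the new two-class task (for example, cars versus trucks) is purely illustrative rather than something the book prescribes.

# A minimal transfer-learning sketch, assuming TensorFlow/Keras.
import tensorflow as tf

# Load a pre-trained feature extractor without its classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # Freeze the pre-trained weights.

# Add a new head for the new, similar task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

The point to note from a security perspective is that everything inside base arrives as opaque weights you didn’t produce and can’t easily audit.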

In addition to integrity issues, ML models can also suffer performance and availability issues. For example, greedy algorithms can get stuck in local minima. Crafting data that exploits this condition to cause availability problems could be a form of attack. Because the data would appear correct in every way, data checks are unlikely to locate these sorts of problems. You’d need to use some sort of tuning or optimization to reduce the risk of such an attack, and algorithm choice is important when considering this issue. The easiest way to perpetrate such attacks is to modify the data at the source. However, making the attack successful would require some knowledge of the model, a level of knowledge known as white box access.

A major issue that allows for integrity and availability attacks is the assumption on the part of humans (even designers) that ML applications think in the same way as we do, which couldn’t be further from the truth. As this book progresses, you will discover that ML isn’t anything like a thought process—it’s a math process, which means treating adversarial attacks as a math or data problem. Some researchers have suggested including adversarial data in the training data so that the algorithm can learn to spot it (learn, in this case, is simply a shortcut way of saying that the model has its weights and variables adjusted to process the data in a manner that allows for a correct output result). Of course, researchers are looking into this and many other solutions for dealing with adversarial attacks that cause integrity- and performance-type problems. There is currently no silver bullet solution.
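
One hedged way to picture the adversarial-training idea mentioned above is FGSM-style augmentation: generate perturbed copies of training samples along the loss gradient and mix them back into each training batch. The sketch below assumes TensorFlow/Keras and a model that outputs class probabilities; it illustrates the general technique rather than any specific implementation from this book.

# A hedged sketch of adversarial training via FGSM-style augmentation.
import tensorflow as tf

def fgsm_examples(model, x, y, epsilon=0.1):
    """Generate adversarial inputs by nudging x along the sign of the
    loss gradient."""
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.sparse_categorical_crossentropy(y, model(x))
    grad = tape.gradient(loss, x)
    return x + epsilon * tf.sign(grad)

# Mix adversarial samples into each batch so the model's weights adjust to
# classify both clean and perturbed inputs correctly (hypothetical usage):
# x_adv = fgsm_examples(model, x_batch, y_batch)
# x_mixed = tf.concat([x_batch, x_adv], axis=0)
# y_mixed = tf.concat([y_batch, y_batch], axis=0)
# model.train_on_batch(x_mixed, y_mixed)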

Describing the types of attacks against ML

The introduction to this chapter lists a number of attack types on data (such as data bias) and the application (evasion). The previous section lists some types of attacks perpetrated against the underlying model. As a collection, all of these attacks fall under the heading of adversarial attacks, wherein the ML application as a whole tends not to perform as intended and often does something unexpected. This isn’t a new phenomenon—some people experimented with adversarial attacks as early as 2004 (see https://dl.acm.org/doi/10.1145/1014052.1014066 for an article on the issue). However, it has become a problem because ML and deep learning are now so deeply embedded within society that even small problems can lead to major consequences. In fact, sites such as The Daily Swig (https://portswigger.net/daily-swig/vulnerabilities) follow these vulnerabilities because there are too many for any single individual to track.

Underlying the success of these attacks is the fact that ML essentially relies on statistics. The transformation of input values into the desired output, such as the categorization of a particular sign as a stop sign, relies on the pixels in a stop sign image matching the model’s trained values statistically well enough to place the image in the stop sign category. By adding patches to a stop sign, an attacker ensures that the image no longer matches the learned pattern well enough for the model to classify it as a stop sign. Because of the misclassification, a self-driving car may not stop as required but run right through the stop sign, causing an accident (often to the hacker’s delight).
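
To make the statistical-matching point concrete, one simple experiment is to compare a classifier’s confidence on a clean image against the same image with a small patch overlaid. Everything model-specific in the sketch below (model, clean_image, STOP_IDX) is a hypothetical stand-in for whatever image classifier, sample image, and class index you happen to use.

# A hedged sketch: overlay a patch and compare classifier confidence.
import numpy as np

def patch_image(image: np.ndarray, size: int = 20, value: float = 1.0) -> np.ndarray:
    """Return a copy of the image with a square patch in one corner."""
    patched = image.copy()
    patched[:size, :size, ...] = value
    return patched

# Hypothetical usage with a Keras-style classifier:
# probs_clean = model.predict(clean_image[np.newaxis])[0]
# probs_patched = model.predict(patch_image(clean_image)[np.newaxis])[0]
# print("stop-sign probability:", probs_clean[STOP_IDX], "->", probs_patched[STOP_IDX])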

Several elements come into play in this case. A human can look at the sign and see that it’s octagonal, red, and says Stop, even if someone adds little patches to it. In addition, humans understand the concept of a sign. An ML application receives a picture consisting of pixels. It doesn’t understand signs, octagonal or otherwise, or the color red, and it can’t necessarily read the word Stop. All the ML application is able to do is match the object in a picture created with pixels to a particular pattern it has been trained to statistically match. As mentioned earlier in the chapter, machines don’t think or feel anything—they perform computations.

Modifying a street sign is an example of an overt attack. ML is even more susceptible to covert attacks. For example, the article at https://arxiv.org/pdf/1801.01944.pdf explains how to modify a sound file such that it embeds a command that the ML application will recognize but a human can’t even hear. The commands could do something innocuous, such as turn the speaker volume up to maximum, but they could perform nefarious tasks as well. Just how terrible the attack becomes depends on the hacker’s knowledge of the target and the goal of the attack. Someone’s smart speaker could send commands to a voice-activated security system to turn the system off when the owner isn’t at home, or perhaps trigger an alarm, depending on what the hacker wants (read Attackers can force Amazon Echos to hack themselves with self-issued commands at https://arstechnica.com/information-technology/2022/03/attackers-can-force-amazon-echos-to-hack-themselves-with-self-issued-commands/ to get a better understanding of how any voice-activated device can be hacked).

Attacks can affect any form of ML application. Simply changing the order of words in a text document can cause an ML application to misclassify the text (see the article at https://arxiv.org/abs/1812.00151). This sort of attack commonly thwarts the activities of spam and sentiment detectors but could be applied to any sort of textual documentation. Most experts classify this kind of attack as a paraphrasing attack. (See the Developing a simple spam filter example section of Chapter 4, Considering the Threat Environment, for details on working with text.) When you consider how much automated text processing occurs because there is simply too much being generated for humans to handle alone, this kind of attack can take on monumental proportions.
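
A hedged way to probe this weakness in your own pipeline is to check whether a classifier’s label survives simple word reorderings. In the sketch below, classify is a hypothetical wrapper around whatever text classifier you use, and the permutation limit is arbitrary; real paraphrasing attacks are far more sophisticated than this brute-force check.

# A hedged robustness check against word-reordering attacks.
# `classify` is a hypothetical function returning a label for a string.
import itertools

def reorder_variants(text, max_variants=10):
    """Yield a few word-order permutations of the input text."""
    words = text.split()
    for i, perm in enumerate(itertools.permutations(words)):
        if i >= max_variants:
            break
        yield " ".join(perm)

def is_order_robust(classify, text):
    """Return True if the label stays the same across reorderings."""
    baseline = classify(text)
    return all(classify(variant) == baseline for variant in reorder_variants(text))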

Considering what ML security can achieve

The essential goal of ML security is to obtain more consistent, reliable, trustworthy, and unbiased results from ML algorithms. Security focuses on creating an environment where the data, algorithm, responses, and analysis all combine to allow ML to produce believable and useful results. The security used with ML applications must perform these tasks in a manner that doesn’t slow the application perceptibly or force it to use huge amounts of additional resources. To accomplish these goals, the users of ML applications need to do the following:

  • Set understandable and achievable result goals that are verifiable, consistent, and answer specific needs
  • Train personnel (which means everyone in the organization, along with consultants and third parties) to interact with the application and its data appropriately
  • Ensure that data passes all of the requirements for proper format, lack of missing elements, absence of bias, and lack of various forms of corruption (a basic check is sketched after this list)
  • Choose algorithms that actually perform tasks in a manner that will match the goals set for the ML application
  • Use training techniques that create a reliable model that won’t overfit or underfit the data
  • Perform testing that validates the data, algorithms, and models used for the ML application
  • Verify the resulting application using real-world data that the ML application hasn’t seen in the past
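
The data and verification items in this list lend themselves to simple automated checks. The sketch below, which assumes pandas and scikit-learn, shows one basic approach; the expected columns, split size, and metric are illustrative rather than prescriptive.

# A hedged sketch of basic data checks plus holdout verification,
# assuming pandas and scikit-learn are installed.
import pandas as pd
from sklearn.model_selection import train_test_split

def basic_data_checks(df: pd.DataFrame, expected_columns: list) -> list:
    """Return a list of problems found in the dataset (empty = passed)."""
    problems = []
    missing_cols = set(expected_columns) - set(df.columns)
    if missing_cols:
        problems.append(f"missing columns: {sorted(missing_cols)}")
    null_counts = df.isna().sum()
    if null_counts.any():
        problems.append(f"missing values: {null_counts[null_counts > 0].to_dict()}")
    if df.duplicated().any():
        problems.append(f"duplicate rows: {int(df.duplicated().sum())}")
    return problems

# Hold out data the model never sees during training so that final
# verification uses samples closer to real-world conditions (hypothetical usage):
# X_train, X_holdout, y_train, y_holdout = train_test_split(
#     X, y, test_size=0.2, random_state=42)
# model.fit(X_train, y_train)
# print("holdout accuracy:", model.score(X_holdout, y_holdout))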

Once an ML application meets all of these requirements, it can provide reliable results more quickly and consistently than humans can for mundane, repeatable tasks. Over time, the humans using an ML application should develop the trust required to make using the application worthwhile. In addition, humans can now move on to other areas of interest, making it possible for a single person to accomplish a great deal more than would otherwise be reasonable. Now that you have a good overview of the technical aspects of ML security, it’s time to get a development environment together so you can work with the book’s code.