Platform and Model Design for Responsible AI

By: Amita Kapoor, Sharmistha Chatterjee

Overview of this book

AI algorithms are ubiquitous and used for tasks ranging from recruiting to deciding who will get a loan. With such widespread use of AI in decision-making, it’s necessary to build explainable, responsible, transparent, and trustworthy AI-enabled systems. With Platform and Model Design for Responsible AI, you’ll be able to make existing black box models transparent. You’ll be able to identify and eliminate bias in your models, deal with uncertainty arising from both data and model limitations, and provide a responsible AI solution. You’ll start by designing ethical traditional and deep learning ML models, as well as deploying them in a sustainable production setup. After that, you’ll learn how to set up data pipelines, validate datasets, and set up component microservices in a secure and private way in any cloud-agnostic framework. You’ll then build a fair and private ML model with proper constraints, tune the hyperparameters, and evaluate the model metrics. By the end of this book, you’ll know the best practices for complying with data privacy and ethics laws, in addition to the techniques needed for data anonymization. You’ll be able to develop models with explainability, store them in feature stores, and handle uncertainty in model predictions.
Table of Contents (21 chapters)
1
Part 1: Risk Assessment Machine Learning Frameworks in a Global Landscape
5
Part 2: Building Blocks and Patterns for a Next-Generation AI Ecosystem
9
Part 3: Design Patterns for Model Optimization and Life Cycle Management
14
Part 4: Implementing an Organization Strategy, Best Practices, and Use Cases

Assessing potential impact and loss due to attacks

In the previous section, we looked at the data threats, risks, and important metrics for consideration while building our ML systems. Now, let us understand the financial losses that organizations have incurred due to data leakage.

AOL data breach

AOL faced a class-action lawsuit in 2006 that sought at least $5,000 for every person whose data was leaked when the company released user records that could be accessed through public search APIs (Throw Back Hack: The Infamous AOL Data Leak: https://www.proofpoint.com/us/blog/insider-threat-management/throw-back-hack-infamous-aol-data-leak). The incident happened when the search department mistakenly released a compressed text file holding 20 million keyword search records from 650,000 users. Because users’ Personally Identifiable Information (PII) was present in the search queries, it was easy to identify individual account holders and associate them with their searches. In a separate, earlier incident, Jason Smathers, an AOL employee, is known to have sold a list of 92 million AOL customer account names to Sean Dunaway of Las Vegas.
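To illustrate why free-text search logs are risky to release, the following is a minimal sketch of a scan that flags queries containing suspected PII. The pattern set, the function names (`find_pii`, `flag_queries`), and the sample queries are all assumptions made for this example; they are not taken from the AOL dataset, and real PII detection needs far broader coverage than three regular expressions.

```python
import re

# Illustrative patterns only (assumed for this sketch); they cover a few
# common formats and will miss many real-world forms of PII.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(query):
    """Return the sorted names of the PII patterns that match a query."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(query)
    )


def flag_queries(queries):
    """Map each query with suspected PII to the pattern names it matched."""
    flagged = {}
    for query in queries:
        hits = find_pii(query)
        if hits:
            flagged[query] = hits
    return flagged


# Hypothetical sample queries, invented for this sketch.
sample_queries = [
    "best pizza near me",
    "contact jane.doe@example.com about lease",
    "call 555-867-5309 plumber",
    "is 123-45-6789 a valid ssn",
]

if __name__ == "__main__":
    for query, hits in flag_queries(sample_queries).items():
        print(f"{hits}: {query}")
```

A scan like this only catches explicit identifiers. Queries can also identify a person indirectly, through combinations of names, places, and interests, so a clean scan result does not make a search log safe to publish.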

Yahoo data breach

Yahoo suffered a series of data breaches between 2012 and 2016, in which personal information such as email account data was lost through security intrusions of varying severity, amounting to the leakage of 3 billion records (IOTW: Multiple Yahoo data breaches across four years result in a $117.5 million settlement: https://www.cshub.com/attacks/articles/incident-of-the-week-multiple-yahoo-data-breaches-across-4-years-result-in-a-1175-million-settlement).

One of these attacks, in 2014, targeted a separate user database, affecting 500 million people and exposing more detailed personal information, such as names, email addresses, passwords, phone numbers, and birthdays. Yahoo settled penalties worth $50 million, with $35 million paid in advance, as part of the damages (Yahoo Fined $50M Over Data Breach: https://www.pymnts.com/legal/2018/yahoo-fine-personal-data-breach/).

Marriott hotel chain data breach

The Marriott hotel chain was fined £18.4m by the UK’s data privacy watchdog after a series of cyber-attacks from 2014 to 2018 leaked the personal information (names, contact details, travel information, VIP status) of 7 million guests in the UK. The fine was imposed for failing to protect personal data and for non-conformance with the GDPR (Marriott Hotels fined £18.4m for data breach that hit millions: https://www.bbc.com/news/technology-54748843).

Uber data breach

Uber was fined $20,000 in a New York settlement over a 2014 data breach that violated riders’ data privacy (Uber fined $20K in data breach, ‘god view’ probe: https://www.cnet.com/tech/services-and-software/uber-fined-20k-in-surveillance-data-breach-probe/). The breach exposed 50,000 drivers’ location information through the rider-tracking system.

Google data breach

In 2019, the French data protection authority imposed a fine of $57 million on Google for violating the GDPR, because Google failed to disclose clearly how user data is processed across its services, such as Google Maps, YouTube, the search engine, and personalized advertisements. In another data leakage incident, Google was responsible for leaking the private data of 500,000 former Google+ users. This leak forced Google to pay US$7.5 million, with compensation of between US$5 and US$12 going to users who had Google+ accounts between 2015 and 2019.

Amazon data breach

Amazon faced several data leak incidents in 2021 (Worst AWS Data Breaches of 2021: https://securityboulevard.com/2021/12/worst-aws-data-breaches-of-2021/). One of them resulted in a European privacy watchdog imposing a fine of 746 million euros (US$887 million) for violating the GDPR (Amazon hit with US$887 million fine by European privacy watchdog: https://www.cnbc.com/2021/07/30/amazon-hit-with-fine-by-eu-privacy-watchdog-.html). In another incident, misconfigured S3 buckets in AWS disrupted networks for considerable periods. Apart from PII such as names, email addresses, national ID numbers, and phone numbers, the exposed S3 files could contain credit card details, including CVV codes.

Facebook data breach

In 2019, Facebook received a record penalty of $5 billion and was required to investigate and resolve various privacy and security loopholes (Facebook to pay record $5 billion U.S. fine over privacy; faces antitrust probe: https://www.reuters.com/article/us-facebook-ftc/facebook-to-pay-record-5-billion-u-s-fine-over-privacy-faces-antitrust-probe-idUSKCN1UJ1L9). The penalty followed the improper use of PII obtained by Cambridge Analytica, which was initially reported to have gathered information from 50 million Facebook profiles. The figure was later put at 87 million people, whose PII Facebook had exposed and Cambridge Analytica had misused to target ads during an election campaign in 2016.

Data breaches are common and continue to occur. Some of the biggest providers in search services, retail, travel and hospitality, and transportation have faced threats and penalties where PII has been stolen. Other breaches between 2019 and 2021 affected organizations such as Volkswagen (whose security breach impacted over 3 million customers) and T-Mobile (where over 50 million customers’ private information, including Social Security numbers and IMEI and IMSI numbers, was compromised). Another incident involved attacks on iPads and iPhones to steal unique Apple device identifiers (UDIDs) and the device names of more than 12 million devices; it occurred when an FBI agent’s laptop was hacked to steal 12 million Apple IDs.
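To put some of the penalties cited in this section on a comparable footing, the sketch below divides each penalty by the number of people cited as affected. The figures come from the incidents described above; the `Incident` class and its method are hypothetical names introduced for this example. Regulators and courts do not compute fines this way, so the result is only a rough indicator of scale, and amounts in different currencies are deliberately not converted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    name: str
    penalty: float   # amount as cited in this section
    currency: str    # kept as cited; no currency conversion is attempted
    affected: int    # people or accounts cited as affected

    def penalty_per_affected(self):
        """Penalty divided by the number of affected people."""
        return self.penalty / self.affected


# Figures as cited in this section.
incidents = [
    Incident("Marriott", 18_400_000, "GBP", 7_000_000),
    Incident("Uber", 20_000, "USD", 50_000),
    Incident("Google+", 7_500_000, "USD", 500_000),
    Incident("Facebook", 5_000_000_000, "USD", 87_000_000),
]

if __name__ == "__main__":
    for inc in incidents:
        print(
            f"{inc.name:<10} {inc.penalty_per_affected():>8.2f} "
            f"{inc.currency} per affected person"
        )
```

Even this crude ratio shows how widely the cost of a breach can vary from one case to another, which is one reason potential loss is hard to estimate in advance.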