Mastering Machine Learning for Penetration Testing

By: Chiheb Chebbi

Overview of this book

Cyber security is crucial for both businesses and individuals. As systems get smarter, machine learning is disrupting computer security. With security products increasingly adopting machine learning, it’s important for pentesters and security researchers to understand how these systems work and how to breach them for testing purposes. This book begins with the basics of machine learning and the algorithms used to build robust systems. Once you’ve gained a fair understanding of how security products leverage machine learning, you’ll dive into the core concepts of breaching such systems. Through practical use cases, you’ll see how to find loopholes and bypass a self-learning security system. As you make your way through the chapters, you’ll focus on topics such as network intrusion detection and AV and IDS evasion. We’ll also cover best practices for identifying ambiguities, and a range of techniques for breaching an intelligent system. By the end of this book, you will be well versed in identifying loopholes in a self-learning security system and will be able to efficiently breach a machine learning system.

Building your own IDS

By now, you know the different network anomaly detection techniques. We are now going to build our own network IDS with Python, from scratch. The Third International Knowledge Discovery and Data Mining Tools Competition, held in conjunction with the KDD-99 conference, provided a dataset called KDD Cup 1999 Data, or KDD 99. It is hosted by the University of California, Irvine, and you can find it at http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.
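As a first step toward working with this data in Python, here is a minimal sketch of reading records in the dataset's layout. It assumes each record is a single comma-separated line of 41 feature values followed by a label ending in a period. The sample rows are made up for illustration, and `parse_records` is a hypothetical helper name, not part of any library.

```python
import csv
import io

# Assumption: each record is one comma-separated line holding 41 feature
# values followed by a label that ends with a period (for example "normal.").
NUM_FEATURES = 41

# Made-up sample records in that layout, for illustration only.
SAMPLE = (
    "0,tcp,http,SF,181,5450," + ",".join(["0"] * 35) + ",normal.\n"
    "0,icmp,ecr_i,SF,1032,0," + ",".join(["0"] * 35) + ",smurf.\n"
)


def parse_records(handle):
    """Yield (features, label) pairs from a KDD-style CSV stream."""
    for row in csv.reader(handle):
        if len(row) != NUM_FEATURES + 1:
            continue  # skip blank or malformed lines
        features = row[:NUM_FEATURES]
        label = row[-1].rstrip(".")  # drop the trailing period
        yield features, label


records = list(parse_records(io.StringIO(SAMPLE)))
```

To read real data, you would pass an open file handle for whichever data file you downloaded from the page above instead of the in-memory sample.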

The main aim of the competition was to build a system able to distinguish between bad (attack) and good (normal) connections. Many later proposals and machine learning solutions have been built using the dataset. But the dataset is old: models trained on it cannot detect modern network attacks, and it has other issues, such as data redundancy. A great study called A Detailed Analysis of the KDD CUP 99 Data Set, done by Mahbod...
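The good-versus-bad framing and the redundancy issue can both be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the rows are made up, the label is taken to be the last field of each row, and `to_binary` and `drop_duplicates` are hypothetical helper names.

```python
def to_binary(label):
    """Map a connection label to 0 (normal) or 1 (attack)."""
    return 0 if label.rstrip(".") == "normal" else 1


def drop_duplicates(rows):
    """Keep the first occurrence of each identical record, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up rows: a few feature values followed by the label.
rows = [
    ["0", "tcp", "http", "normal."],
    ["0", "icmp", "ecr_i", "smurf."],
    ["0", "icmp", "ecr_i", "smurf."],  # exact duplicate of the previous row
]

unique_rows = drop_duplicates(rows)
targets = [to_binary(row[-1]) for row in unique_rows]
```

Removing exact duplicates before training is one simple way to keep repeated records from dominating a model. It is a common preprocessing choice, not necessarily the approach taken later in this chapter.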