Book Image

Machine Learning for Cybersecurity Cookbook

By : Emmanuel Tsukerman
Book Image

Machine Learning for Cybersecurity Cookbook

By: Emmanuel Tsukerman

Overview of this book

Organizations today face a major threat in terms of cybersecurity, from malicious URLs to credential reuse, and having robust security systems can make all the difference. With this book, you'll learn how to use Python libraries such as TensorFlow and scikit-learn to implement the latest artificial intelligence (AI) techniques and handle challenges faced by cybersecurity researchers. You'll begin by exploring various machine learning (ML) techniques and tips for setting up a secure lab environment. Next, you'll implement key ML algorithms such as clustering, gradient boosting, random forest, and XGBoost. The book will guide you through constructing classifiers and features for malware, which you'll train and test on real samples. As you progress, you'll build self-learning, reliant systems to handle cybersecurity tasks such as identifying malicious URLs, spam email detection, intrusion detection, network protection, and tracking user and process behavior. Later, you'll apply generative adversarial networks (GANs) and autoencoders to advanced security tasks. Finally, you'll delve into secure and private AI to protect the privacy rights of consumers using your ML models. By the end of this book, you'll have the skills you need to tackle real-world problems faced in the cybersecurity domain using a recipe-based approach.
Table of Contents (11 chapters)

Using machine learning to detect the file type

One of the techniques hackers use to sneak their malicious files into security systems is to obfuscate their file types. For example, a (malicious) PowerShell script is expected to have an extension, .ps1. A system administrator can aim to combat the execution of all PowerShell scripts on a system by preventing the execution of all files with the .ps1 extension. However, the mischievous hacker can remove or change the extension, rendering the file's identity a mystery. Only by examining the contents of the file can it then be distinguished from an ordinary text file. For practical reasons, it is not possible for humans to examine all text files on a system. Consequently, it is expedient to resort to automated methods. In this chapter, we will demonstrate how you can use machine learning to detect the file type...