Hands-On Machine Learning for Cybersecurity

By: Soma Halder, Sinan Ozdemir

Overview of this book

Cyber threats are among the costliest risks an organization can face today. In this book, we use machine learning (ML), an efficient tool for solving the big problems that exist in the cybersecurity domain. The book begins by giving you the basics of ML in cybersecurity using Python and its libraries. You will explore various ML domains (such as time series analysis and ensemble modeling) to get your foundations right. You will implement various examples, such as building a system to identify malicious URLs and building a program to detect fraudulent emails and spam. Later, you will learn how to make effective use of the K-means algorithm to develop a solution that detects and alerts you to any malicious activity in the network. You will also learn how to implement biometrics and fingerprint authentication to validate whether a user is legitimate. Finally, you will see how we change the game with TensorFlow and learn how deep learning is effective for creating models and training systems.
Table of Contents (13 chapters)
1. Basics of Machine Learning in Cybersecurity
5. Using Data Science to Catch Email Fraud and Spam

Introduction to our password dataset

Let's begin with the basics. We'll import our dataset and get a sense of the quantity of data that we are working with. We will do this by using pandas to import our data:

# pandas is a powerful Python-based data package that can handle large quantities of row/column data
# we will use pandas many times during these videos. a 2D group of data in pandas is called a 'DataFrame'

# import pandas
import pandas as pd

# use the read_csv method to read in a local file of leaked passwords
# here we specify `header=None` because the file has no header row (no titles of columns)
# we also specify that if any row gives us a parsing error, skip over it
# (`on_bad_lines='skip'` in pandas 1.3+; it replaces `error_bad_lines=False`, which was removed in pandas 2.0)
data = pd.read_csv('../data/passwords.txt', header=None, on_bad_lines='skip')
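
To see what skipping a bad row means in practice, here is a minimal sketch that runs without the password file. The `sample` string is made-up stand-in data, not part of the real dataset, and `dtype=str` is an added assumption so that numeric-looking passwords stay as text. It uses `on_bad_lines='skip'`, the option available in pandas 1.3 and later for the same purpose as `error_bad_lines=False`.

```python
import io

import pandas as pd

# Hypothetical stand-in for the password file: one password per line.
# The third line contains a comma, so the parser sees two fields where
# the first line told it to expect one; that makes it a "bad line".
sample = "123456\npassword\nhello,world\nqwerty\n"

# header=None: the data has no row of column titles.
# on_bad_lines='skip' (pandas 1.3+): drop rows with too many fields
# instead of raising a parsing error.
# dtype=str (an assumption for this sketch): keep every password as text.
data = pd.read_csv(
    io.StringIO(sample),
    header=None,
    on_bad_lines='skip',
    dtype=str,
)

# The malformed row is gone; three passwords remain in a single column
# whose label is the integer 0.
print(data.shape)
print(data[0].tolist())
```

One consequence of this choice is that any password containing a comma is silently dropped, so the number of rows in the DataFrame can be smaller than the number of lines in the file.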

Now that we have our data imported, let's call on the...