Mastering Machine Learning for Penetration Testing

By: Chiheb Chebbi

Overview of this book

Cyber security is crucial for both businesses and individuals. As systems get smarter, machine learning is disrupting computer security. With machine learning being adopted in upcoming security products, it’s important for pentesters and security researchers to understand how these systems work and how to breach them for testing purposes. This book begins with the basics of machine learning and the algorithms used to build robust systems. Once you’ve gained a fair understanding of how security products leverage machine learning, you’ll dive into the core concepts of breaching such systems. Through practical use cases, you’ll see how to find loopholes in and bypass a self-learning security system. As you make your way through the chapters, you’ll focus on topics such as network intrusion detection and AV and IDS evasion. We’ll also cover best practices for identifying ambiguities, along with a broad set of techniques for breaching an intelligent system. By the end of this book, you will be well versed in identifying loopholes in a self-learning security system and will be able to efficiently breach a machine learning system.
Table of Contents (13 chapters)

Chapter 3 – Malware Detection with API Calls and PE Headers

  1. Load the dataset using the pandas Python library, and this time add
    the low_memory=False parameter. Look up what that parameter does.
import pandas as pd
df = pd.read_csv(file_name, low_memory=False)
  2. Prepare the data that will be used for training. Every column except
    the last is a feature; the last column is the target.
original_headers = list(df.columns.values)
total_data = df[original_headers[:-1]]
# as_matrix() was removed in pandas 1.0; to_numpy() returns the same NumPy array
total_data = total_data.to_numpy()
target_strings = df[original_headers[-1]]
  3. Split the data with the test_size=0.33 parameter, using train_test_split
    from sklearn.model_selection.
train, test, target_train, target_test = train_test_split(total_data, target_strings, test_size=0.33, random_state=int(time.time()))
  4. Create a set of classifiers that contains RandomForestClassifier(n_estimators=100)
    and AdaBoostClassifier() from sklearn.ensemble, and DecisionTreeClassifier() from sklearn.tree:
classifiers = [
    RandomForestClassifier(n_estimators=100),
    DecisionTreeClassifier(),
    AdaBoostClassifier(),
]
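The four steps above can be tied together in one runnable sketch. It is only an illustration under stated assumptions: the synthetic DataFrame, its column names (f1, f2, f3, label), the "malware"/"benign" labels, and the use of io.StringIO in place of file_name are all hypothetical stand-ins for the real dataset. The final fit-and-score loop is one plausible way to compare the classifiers, not necessarily what the elided remainder of the exercise does.

```python
import io
import time

import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the real dataset: 200 rows, three numeric
# feature columns, and the label in the last column.
rng = np.random.default_rng(0)
features = rng.integers(0, 100, size=(200, 3))
labels = np.where(features[:, 0] > 50, "malware", "benign")
synthetic = pd.DataFrame(features, columns=["f1", "f2", "f3"])
synthetic["label"] = labels
csv_text = synthetic.to_csv(index=False)

# Step 1: load the CSV; io.StringIO replaces file_name here.
df = pd.read_csv(io.StringIO(csv_text), low_memory=False)

# Step 2: every column except the last is a feature, the last is the target.
original_headers = list(df.columns.values)
total_data = df[original_headers[:-1]].to_numpy()
target_strings = df[original_headers[-1]]

# Step 3: hold out 33% of the rows for testing.
train, test, target_train, target_test = train_test_split(
    total_data, target_strings, test_size=0.33, random_state=int(time.time())
)

# Step 4: build the classifiers, then (assumed follow-up) fit each one
# and record its accuracy on the held-out rows.
classifiers = [
    RandomForestClassifier(n_estimators=100),
    DecisionTreeClassifier(),
    AdaBoostClassifier(),
]
scores = {}
for clf in classifiers:
    clf.fit(train, target_train)
    scores[type(clf).__name__] = clf.score(test, target_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

Because random_state is seeded from the clock, as in the exercise, the split and the printed accuracies change from run to run; pass a fixed integer instead if you need reproducible numbers.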
...