Book Image

Effective Python Penetration Testing

By : Rejah Rehim
Book Image

Effective Python Penetration Testing

By: Rejah Rehim

Overview of this book

Penetration testing is a practice of testing a computer system, network, or web application to find weaknesses in security that an attacker can exploit. Effective Python Penetration Testing will help you utilize your Python scripting skills to safeguard your networks from cyberattacks. We will begin by providing you with an overview of Python scripting and penetration testing. You will learn to analyze network traffic by writing Scapy scripts and will see how to fingerprint web applications with Python libraries such as ProxMon and Spynner. Moving on, you will find out how to write basic attack scripts, and will develop debugging and reverse engineering skills with Python libraries. Toward the end of the book, you will discover how to utilize cryptography toolkits in Python and how to automate Python tools and libraries.
Table of Contents (16 chapters)
Effective Python Penetration Testing
Credits
About the Author
About the Reviewer
www.PacktPub.com
Preface

Web scraping


Even though some sites offer APIs, most websites are designed mainly for human eyes and only provide HTML pages formatted for humans. If we want a program to fetch some data from such a website, we have to parse the markup to get the information we need. Web scraping is the method of using a computer program to analyze a web page and get the data needed.

There are many methods to fetch the content from the site with Python modules:

  • Use urllib/urllib2 to create an HTTP request that will fetch the webpage, and using BeautifulSoup to parse the HTML

  • To parse an entire website we can use Scrapy (http://scrapy.org), which helps to create web spiders

  • Use requests module to fetch and lxml to parse

urllib / urllib2 module

Urllib is a high-level module that allows us to script different services such as HTTP, HTTPS, and FTP.

Useful methods of urllib/urllib2

Urllib/urllib2 provide methods that can be used for getting resources from URLs, which includes opening web pages, encoding arguments, manipulating...