Book Image

Python for Secret Agents - Volume II - Second Edition

By : Steven F. Lott
Book Image

Python for Secret Agents - Volume II - Second Edition

By: Steven F. Lott

Overview of this book

Python is easy to learn and extensible programming language that allows any manner of secret agent to work with a variety of data. Agents from beginners to seasoned veterans will benefit from Python's simplicity and sophistication. The standard library provides numerous packages that move beyond simple beginner missions. The Python ecosystem of related packages and libraries supports deep information processing. This book will guide you through the process of upgrading your Python-based toolset for intelligence gathering, analysis, and communication. You'll explore the ways Python is used to analyze web logs to discover the trails of activities that can be found in web and database servers. We'll also look at how we can use Python to discover details of the social network by looking at the data available from social networking websites. Finally, you'll see how to extract history from PDF files, which opens up new sources of data, and you’ll learn about the ways you can gather data using an Arduino-based sensor device.
Table of Contents (7 chapters)
6
Index

Studying a log in more detail

A file is the serialized representation for Python objects. In some rare cases, the objects are strings, and we can deserialize the strings from the text file directly. In the case of our web server logs, some of the strings represent a date-time stamp. Also, the size of the transmitted content shouldn't be treated as a string, since it's properly either an integer size or the None object if nothing was transmitted to the browser.

When requests for analysis come in, we'll often have to convert objects from strings to more useful Python objects. Generally, we're happiest if we simply convert everything into a useful, native Python data structure.

What kind of data structure should we use? We can't continue to use a Match object: it only knows about strings. We want to work with integers and datetimes.

The first answer is often to create a customized class that will hold the various attributes from a single entry in a log. This gives the...