Python for Secret Agents - Volume II - Second Edition

By: Steven F. Lott

Overview of this book

Python is an easy-to-learn, extensible programming language that allows any manner of secret agent to work with a variety of data. Agents from beginners to seasoned veterans will benefit from Python's simplicity and sophistication. The standard library provides numerous packages that move beyond simple beginner missions. The Python ecosystem of related packages and libraries supports deep information processing. This book will guide you through the process of upgrading your Python-based toolset for intelligence gathering, analysis, and communication. You'll explore the ways Python is used to analyze web logs to discover the trails of activity that can be found in web and database servers. You'll also see how to use Python to discover details of a social network by looking at the data available from social networking websites. Finally, you'll see how to extract history from PDF files, which opens up new sources of data, and you'll learn about the ways you can gather data using an Arduino-based sensor device.

Reading remote files

We've given these functions names such as local_text and local_gzip because the files are located on our local machine. We might want to write other variations that use urllib.request.urlopen() to open remote files. For example, we might have a log file on a remote server that we'd like to process. We can write a generator function that yields lines from the remote file, which lets us interleave downloading and processing in a single operation.
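A minimal sketch of such a generator might look like the following. The name remote_text, the encoding parameter, and the choice to strip line endings are assumptions made for illustration; they are not taken from the local_text and local_gzip functions described earlier.

```python
import urllib.request


def remote_text(url, encoding="utf-8"):
    """Yield the lines of the resource at url, one at a time.

    Hypothetical counterpart to local_text: the name, the default
    encoding, and the stripping of line endings are assumptions.
    """
    # urlopen() returns a file-like response object. Iterating over it
    # reads one line of bytes at a time, so a consumer can start work
    # on the first lines before the whole file has arrived.
    with urllib.request.urlopen(url) as response:
        for raw_line in response:
            yield raw_line.decode(encoding).rstrip("\r\n")
```

A caller can loop over remote_text(some_url) exactly as it would loop over an open local file. The response stays open until the generator is exhausted or closed.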

We can use the urllib.request module to handle remote files using URLs of this form: ftp://username:password@server/path/to/file. We can also use URLs of the form file:///path/to/file to read local files. Because of this transparency, we might want to look at using urllib.request for all file access.
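To illustrate this transparency, here is a rough sketch that accepts either a URL or a plain local path and routes both through urllib.request. The helper names to_url and any_text are hypothetical, and the test for "://" is a deliberately simple assumption about what counts as a URL.

```python
import pathlib
import urllib.request


def to_url(source):
    """Return source unchanged if it looks like a URL; otherwise
    convert a local path to a file:// URL.

    Hypothetical helper: the "://" check is a simplifying assumption.
    """
    if "://" in source:
        return source
    # as_uri() requires an absolute path, so resolve() it first.
    return pathlib.Path(source).resolve().as_uri()


def any_text(source, encoding="utf-8"):
    """Yield lines from a local path or a remote URL in the same way."""
    with urllib.request.urlopen(to_url(source)) as response:
        for raw_line in response:
            yield raw_line.decode(encoding).rstrip("\r\n")
```

With this approach the processing code does not need to know whether a log file lives on the local disk or on a server; only the source string changes.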

As a practical matter, it's somewhat more common to use FTP to acquire files in bulk.