Python Digital Forensics Cookbook

By: Chapin Bryce, Preston Miller

Overview of this book

Technology plays an increasingly large role in our daily lives and shows no sign of stopping. Now, more than ever, it is paramount that an investigator develop programming expertise to deal with increasingly large datasets. By leveraging the Python recipes explored throughout this book, we make the complex simple, quickly extracting relevant information from large datasets. You will explore, develop, and deploy Python code and libraries to provide meaningful results that can be immediately applied to your investigations. Throughout the Python Digital Forensics Cookbook, recipes include topics such as working with forensic evidence containers, parsing mobile and desktop operating system artifacts, extracting embedded metadata from documents and executables, and identifying indicators of compromise. You will also learn to integrate scripts with Application Programming Interfaces (APIs) such as VirusTotal and PassiveTotal, and tools such as Axiom, Cellebrite, and EnCase. By the end of the book, you will have a sound understanding of Python and how you can use it to process artifacts in your investigations.

Preface

What this book covers

What you need for this book

Essential Scripting and File Information Recipes

Introduction

Handling arguments like an adult

Iterating over loose files

Recording file attributes

Copying files, attributes, and timestamps

Hashing files and data streams

Keeping track with a progress bar

Logging results

Multiple hands make light work

Creating Artifact Report Recipes

Introduction

Using HTML templates

Creating a paper trail

Working with CSVs

Visualizing events with Excel

Auditing your work

A Deep Dive into Mobile Forensic Recipes

Introduction

Parsing PLIST files

Handling SQLite databases

Identifying gaps in SQLite databases

Processing iTunes backups

Putting Wi-Fi on the map

Digging deep to recover messages

Extracting Embedded Metadata Recipes

Introduction

Extracting audio and video metadata

The big picture

Mining for PDF metadata

Reviewing executable metadata

Reading office document metadata

Integrating our metadata extractor with EnCase

Networking and Indicators of Compromise Recipes

Introduction

Getting a jump start with IEF

Coming into contact with IEF

Beautiful Soup

Going hunting for viruses

Gathering intel

Totally passive

Reading Emails and Taking Names Recipes

Parsing PST and OST mailboxes

Log-Based Artifact Recipes

Introduction

About time

Parsing IIS web logs with RegEx

Going spelunking

Interpreting the daily.out log

Adding daily.out parsing to Axiom

Scanning for indicators with YARA

Working with Forensic Evidence Container Recipes

Introduction

Opening acquisitions

Gathering acquisition and media information

Iterating through files

Processing files within the container

Searching for hashes

Exploring Windows Forensic Artifacts Recipes - Part I

Introduction

One man's trash is a forensic examiner's treasure

A sticky situation

Reading the registry

Gathering user activity

The missing link

Searching high and low

Exploring Windows Forensic Artifacts Recipes - Part II

Introduction

Parsing prefetch files

A series of fortunate events

Indexing internet history

Shadow of a former self

Dissecting the SRUM database

Iterating over loose files

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

Often it is necessary to iterate over a directory and its subdirectories to recursively process all files. In this recipe, we will illustrate how to use Python to walk through directories and access files within them. Understanding how you can recursively navigate a given input directory is key as we frequently perform this exercise in our scripts.

Getting started

All libraries used in this script are present in Python's standard library. The preferred library, in most situations, for handling file and folder iteration is the standard-library os module. While this module supports many useful operations, we will focus on the os.path submodule and the os.walk() function. Let’s use the following folder hierarchy as an example to demonstrate how directory iteration works in Python:

SecretDocs/
|-- key.txt
|-- Plans
|   |-- plans_0012b.txt
|   |-- plans_0016.txt
|   `-- Successful_Plans
|       |-- plan_0001.txt
|       |-- plan_0427.txt
|       `-- plan_0630.txt
|-- Spreadsheets
|   |-- costs.csv
|   `-- profit.csv
`-- Team
    |-- Contact18.vcf
    |-- Contact1.vcf
    `-- Contact6.vcf

4 directories, 11 files

How to do it…

The following steps are performed in this recipe:

Create a positional argument for the input directory to scan.
Iterate over all subdirectories and print file paths to the console.

How it works…

We create a very basic argument handler that accepts one positional input, DIR_PATH, the path of the input directory to iterate. As an example, we will use the ~/Desktop path, the parent of SecretDocs, as the input argument for the script. We parse the command-line arguments and assign the input directory to a local variable. We’re now ready to begin iterating over this input directory:

from __future__ import print_function
import argparse
import os

__authors__ = ["Chapin Bryce", "Preston Miller"]
__date__ = 20170815
__description__ = "Directory tree walker"

parser = argparse.ArgumentParser(
    description=__description__,
    epilog="Developed by {} on {}".format(
        ", ".join(__authors__), __date__)
)
parser.add_argument("DIR_PATH", help="Path to directory")
args = parser.parse_args()
path_to_scan = args.DIR_PATH

To iterate over a directory, we need to provide a string representing its path to os.walk(). This function yields a three-element tuple on each iteration, which we unpack into the root, directories, and files variables:

root: This value provides the path to the current directory as a string. It is built from the path passed to os.walk(), so it is relative only if that input path is relative. If SecretDocs were passed as the input, root would start as SecretDocs and eventually become SecretDocs/Team and SecretDocs/Plans/Successful_Plans.
directories: This value is a list of sub-directories located within the current root location. We can iterate through this list of directories, although each entry in this list will become the root value in a later iteration of the same os.walk() loop. For this reason, the value is not frequently used.
files: This value is a list of files in the current root location.
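To make the three values concrete, the following minimal sketch rebuilds a reduced copy of the example hierarchy in a temporary directory and records what os.walk() yields on each iteration. The walk_snapshot() helper and the temporary layout are assumptions made for illustration and are not part of the recipe's script. Names are sorted because os.walk() makes no promise about ordering.

```python
from __future__ import print_function
import os
import shutil
import tempfile


def walk_snapshot(top):
    """Return a list of (root, directories, files) tuples from os.walk().

    Hypothetical helper for illustration only. Each root is made relative
    to ``top`` and all names are sorted so the result is repeatable.
    """
    snapshot = []
    for root, directories, files in os.walk(top):
        # Sorting in place also fixes the order in which os.walk()
        # descends into the sub-directories (top-down mode).
        directories.sort()
        rel_root = os.path.relpath(root, top)
        snapshot.append((rel_root, list(directories), sorted(files)))
    return snapshot


# Build a reduced copy of the example tree in a temporary directory.
base = tempfile.mkdtemp()
top = os.path.join(base, "SecretDocs")
os.makedirs(os.path.join(top, "Plans", "Successful_Plans"))
os.makedirs(os.path.join(top, "Team"))
sample_files = (
    "key.txt",
    os.path.join("Plans", "plans_0016.txt"),
    os.path.join("Team", "Contact1.vcf"),
)
for rel_path in sample_files:
    open(os.path.join(top, rel_path), "w").close()

snapshot = walk_snapshot(top)
for entry in snapshot:
    print(entry)

# Remove the temporary tree once the snapshot has been taken.
shutil.rmtree(base)
```

Each sub-directory listed in directories on one iteration shows up as root on a later one, which is why the recipe rarely needs the directories value itself.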

Be careful in naming the directory and file variables. In Python, dir is a built-in function and file is a built-in type in Python 2. Neither is a reserved keyword, but using them as variable names shadows the built-ins, so they should be avoided.

# Iterate over the path_to_scan
for root, directories, files in os.walk(path_to_scan):

It is common to create a second for loop, as shown in the following code, to step through each of the files located in that directory and perform some action on them. Using the os.path.join() function, we can join the root and file_entry variables to obtain the file’s path. We then print this file path to the console. We may also, for example, append this file path to a list that we later iterate over to process each of the files:

    # Iterate over the files in the current "root"
    for file_entry in files:
        # create the relative path to the file
        file_path = os.path.join(root, file_entry)
        print(file_path)
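The alternative mentioned above, appending each path to a list for later processing, could be sketched as follows. The collect_file_paths() function and the small temporary directory used to exercise it are hypothetical names introduced for this example, not part of the recipe's script.

```python
from __future__ import print_function
import os
import shutil
import tempfile


def collect_file_paths(path_to_scan):
    """Walk path_to_scan and return a sorted list of every file path found."""
    file_paths = []
    for root, directories, files in os.walk(path_to_scan):
        for file_entry in files:
            file_paths.append(os.path.join(root, file_entry))
    # os.walk() makes no ordering guarantee, so sort for repeatable results.
    return sorted(file_paths)


# Exercise the function against a small temporary directory.
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "Team"))
for rel_path in ("key.txt", os.path.join("Team", "Contact1.vcf")):
    open(os.path.join(demo_dir, rel_path), "w").close()

found = collect_file_paths(demo_dir)
for file_path in found:
    print(file_path)

shutil.rmtree(demo_dir)
```

Returning a list separates discovery from processing, so a later step can hash, copy, or filter the files without walking the tree again.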

We can also use root + os.sep + file_entry to achieve the same effect, but it is not as Pythonic as the approach we're using to join paths. Note that os.sep is a string holding the platform's path separator, not a function. Using os.path.join(), we can pass two or more strings, such as directories, subdirectories, and files, to form a single path.
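A short comparison of the two approaches, using paths from the example hierarchy, is sketched below. The variable names are chosen for illustration only. Nothing here touches the filesystem, since os.path.join() only manipulates strings.

```python
from __future__ import print_function
import os

root = os.path.join("SecretDocs", "Plans")
file_entry = "plans_0016.txt"

# Both approaches give the same result for a well-formed root.
joined = os.path.join(root, file_entry)
concatenated = root + os.sep + file_entry
print(joined == concatenated)

# os.path.join() does not add a second separator when the first
# component already ends with one; plain concatenation does.
trailing_root = root + os.sep
joined_trailing = os.path.join(trailing_root, file_entry)
doubled = trailing_root + os.sep + file_entry
print(joined_trailing)
print(doubled)

# More than two components can be joined in a single call.
deep_path = os.path.join(
    "SecretDocs", "Plans", "Successful_Plans", "plan_0001.txt")
print(deep_path)
```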

When we run the preceding script with our example input directory, we see the following output:

As seen, the os.walk() function visits the input directory and then descends into each sub-directory it discovers, thereby scanning the entire directory tree.

There's more…

This script can be further improved. Here's a recommendation:

Check out and implement similar functionality using the glob library, which, unlike os.walk(), matches files and directories against wildcard patterns. Recursive searches with the ** pattern require Python 3.5 or later.
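As a starting point for this recommendation, a minimal sketch of a recursive wildcard search is shown below. It assumes Python 3.5 or later, because the recursive keyword of glob.glob() is not available in Python 2.7. The find_by_pattern() helper and the temporary directory layout are hypothetical and exist only for this example.

```python
import glob
import os
import shutil
import tempfile


def find_by_pattern(top, pattern):
    """Return sorted paths under top whose file name matches pattern.

    Hypothetical helper; requires Python 3.5+ for recursive=True.
    """
    # With recursive=True, "**" matches zero or more directory levels,
    # so files directly inside top are included as well.
    search = os.path.join(top, "**", pattern)
    return sorted(glob.glob(search, recursive=True))


# Build a small tree to search.
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "Plans"))
os.mkdir(os.path.join(demo_dir, "Spreadsheets"))
sample_files = (
    "key.txt",
    os.path.join("Plans", "plans_0016.txt"),
    os.path.join("Spreadsheets", "costs.csv"),
)
for rel_path in sample_files:
    open(os.path.join(demo_dir, rel_path), "w").close()

txt_files = find_by_pattern(demo_dir, "*.txt")
csv_files = find_by_pattern(demo_dir, "*.csv")
for file_path in txt_files + csv_files:
    print(file_path)

shutil.rmtree(demo_dir)
```

Compared with os.walk(), this filters by name in the same call, at the cost of being unavailable on the Python 2.7 interpreter the recipe also targets.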
