Python Digital Forensics Cookbook

By: Chapin Bryce, Preston Miller

Overview of this book

Technology plays an increasingly large role in our daily lives and shows no sign of stopping. Now, more than ever, it is paramount that an investigator develops programming expertise to deal with increasingly large datasets. By leveraging the Python recipes explored throughout this book, we make the complex simple, quickly extracting relevant information from large datasets. You will explore, develop, and deploy Python code and libraries to provide meaningful results that can be immediately applied to your investigations. Throughout the Python Digital Forensics Cookbook, recipes include topics such as working with forensic evidence containers, parsing mobile and desktop operating system artifacts, extracting embedded metadata from documents and executables, and identifying indicators of compromise. You will also learn to integrate scripts with Application Programming Interfaces (APIs) such as VirusTotal and PassiveTotal, and tools such as Axiom, Cellebrite, and EnCase. By the end of the book, you will have a sound understanding of Python and how you can use it to process artifacts in your investigations.
Table of Contents (11 chapters)

Copying files, attributes, and timestamps

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Windows

Preserving files is a fundamental task in digital forensics. It is often preferable to containerize files in a format that can store hashes and other metadata of loose files. However, sometimes we need to copy files in a forensic manner from one location to another. Using this recipe, we will demonstrate some of the methods available to copy files while preserving common metadata fields.

Getting started

This recipe requires the installation of two third-party modules, pywin32 and pytz. All other libraries used in this script are present in Python's standard library. The recipe primarily uses two libraries: the built-in shutil and the third-party pywin32. The shutil library is our go-to for copying files within Python, and we can use it to preserve most of the timestamps and other file attributes. The shutil module, however, is unable to preserve the creation time of files it copies, so we rely on the Windows-specific pywin32 library to preserve it. While the pywin32 library is platform specific, it is incredibly useful for interacting with the Windows operating system.


To learn more about the shutil library, visit https://docs.python.org/3/library/shutil.html.

To install pywin32, we need to access its SourceForge page at https://sourceforge.net/projects/pywin32/ and download the version that matches our Python installation. To check our Python version, we can import the sys module and inspect the sys.version attribute within an interpreter. Both the version and the architecture are important when selecting the correct pywin32 installer.


To learn more about the sys library, visit https://docs.python.org/3/library/sys.html.
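
As a minimal sketch of that check, the following snippet reports both the version and the architecture of the running interpreter. The describe_interpreter() helper is a hypothetical name used only for illustration, and measuring the pointer size with struct is one common way to tell a 32-bit build from a 64-bit one:

```python
import struct
import sys


def describe_interpreter():
    """Return the Python version string and the pointer width in bits."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    # A C pointer ("P") is 4 bytes on 32-bit builds and 8 bytes on 64-bit
    bits = struct.calcsize("P") * 8
    return version, bits


if __name__ == "__main__":
    version, bits = describe_interpreter()
    print("Python {} ({}-bit)".format(version, bits))
```

The two values printed here are the ones to match against the installer name.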


In addition to the installation of the pywin32 library, we need to install pytz, a third-party library used to manage time zones in Python. We can install this library using the pip command:

pip install pytz==2017.2

How to do it…

We perform the following steps to forensically copy files on a Windows system:

  1. Gather source file and destination arguments.
  2. Use shutil to copy and preserve most file metadata.
  3. Manually set timestamp attributes with win32file.

How it works…

Let’s now dive into copying files and preserving their attributes and timestamps. We use some familiar libraries to assist us in the execution of this recipe. Some of the libraries, such as pytz, win32file, and pywintypes, are new. Let’s briefly discuss their purpose here. The pytz module allows us to work with time zones more granularly and to initialize dates for the pywin32 library.

To allow us to pass timestamps in the correct format, we must also import pywintypes. Lastly, the win32file library, available through our installation of pywin32, provides various methods and constants for file manipulation in Windows:

from __future__ import print_function
import argparse
from datetime import datetime as dt
import os
import pytz
from pywintypes import Time
import shutil
from win32file import SetFileTime, CreateFile, CloseHandle
from win32file import GENERIC_WRITE, FILE_SHARE_WRITE
from win32file import OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL

__authors__ = ["Chapin Bryce", "Preston Miller"]
__date__ = 20170815
__description__ = "Gather filesystem metadata of provided file"

This recipe's command-line handler takes two positional arguments, source and dest, which represent the source file to copy and the destination directory or file, respectively. It also takes a required option, --timezone, which specifies the time zone used to interpret the file's timestamps.

To prepare the source file, we store the absolute path and split the filename from the rest of the path, which we may need to use later if the destination is a directory. Our last bit of preparation involves reading the timezone input from the user, which must be one of the four common US time zones. This allows us to initialize the pytz time zone object for later use in the recipe:

parser = argparse.ArgumentParser(
    description=__description__,
    epilog="Developed by {} on {}".format(
        ", ".join(__authors__), __date__)
)
parser.add_argument("source", help="Source file")
parser.add_argument("dest", help="Destination directory or file")
parser.add_argument("--timezone", help="Timezone of the file's timestamp",
                    choices=['EST5EDT', 'CST6CDT', 'MST7MDT', 'PST8PDT'],
                    required=True)
args = parser.parse_args()

source = os.path.abspath(args.source)
# Keep only the final path component; it names the copy
# when the destination is a directory
src_file_name = os.path.basename(source)

dest = os.path.abspath(args.dest)
tz = pytz.timezone(args.timezone)
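
To illustrate how the file name can be separated from a Windows path, here is a small sketch. The file_name_from_path() helper and the example path are hypothetical. It uses the standard ntpath module, which applies Windows path rules on any platform (on Windows, os.path is ntpath), and contrasts the result with splitting once on the first separator:

```python
import ntpath


def file_name_from_path(path):
    """Return the last component of a Windows-style path."""
    # ntpath follows Windows path rules regardless of the host platform
    return ntpath.basename(path)


if __name__ == "__main__":
    example = r"C:\evidence\case1\report.docx"  # hypothetical path
    print(file_name_from_path(example))  # report.docx
    # Splitting once on the first separator keeps the intermediate folders
    print(example.split("\\", 1)[1])  # evidence\case1\report.docx
```

Taking the base name yields only the file name however deep the source path is, which is what we need when joining it to a destination directory.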

At this point, we can copy the source file to the destination using the shutil.copy2() method. This method accepts either a directory or file as the destination. The major difference between the shutil copy() and copy2() methods is that copy() copies only the file's data and permission bits, whereas copy2() also attempts to preserve other metadata, including the last modified and last accessed times. Neither method preserves file creation times on Windows; for that, we need to leverage the pywin32 bindings.

To that end, we must build the destination path for the file copied by the copy2() call by using the following if statement to join the correct path if the user provided a directory at the command line:

shutil.copy2(source, dest)
if os.path.isdir(dest):
    dest_file = os.path.join(dest, src_file_name)
else:
    dest_file = dest
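
The difference between copy() and copy2() can be observed with a short, platform-independent experiment. The sketch below is not part of the recipe: the compare_copy_methods() helper, the file names, and the backdated timestamp are all assumptions chosen for illustration. It backdates a scratch file, copies it with both methods, and returns the modified times so they can be compared:

```python
import os
import shutil
import tempfile


def compare_copy_methods(workdir):
    """Copy one file with copy() and copy2(); return the three mtimes."""
    source = os.path.join(workdir, "source.txt")
    plain = os.path.join(workdir, "copy.txt")
    preserved = os.path.join(workdir, "copy2.txt")
    with open(source, "w") as handle:
        handle.write("evidence")
    # Backdate the source so a fresh copy is easy to tell apart
    old_time = 946684800  # 2000-01-01 00:00:00 UTC
    os.utime(source, (old_time, old_time))

    shutil.copy(source, plain)       # data and permission bits only
    shutil.copy2(source, preserved)  # also copies modified/accessed times
    return (os.path.getmtime(source), os.path.getmtime(plain),
            os.path.getmtime(preserved))


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    try:
        src_m, copy_m, copy2_m = compare_copy_methods(workdir)
        print("source:  {}\ncopy():  {}\ncopy2(): {}".format(
            src_m, copy_m, copy2_m))
    finally:
        shutil.rmtree(workdir)
```

The copy2() result should carry the backdated modified time, while the copy() result carries the time at which the copy was made.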

Next, we prepare the timestamps for the pywin32 library. We use the os.path.getctime(), os.path.getmtime(), and os.path.getatime() methods to gather the source file's created, modified, and accessed times, and convert each floating-point value into a date using the datetime.fromtimestamp() method. With our datetime objects ready, we can make each value time zone-aware by using the specified timezone and providing it to the pywintypes.Time() function before printing the timestamps to the console:

created = dt.fromtimestamp(os.path.getctime(source))
created = Time(tz.localize(created))
modified = dt.fromtimestamp(os.path.getmtime(source))
modified = Time(tz.localize(modified))
accessed = dt.fromtimestamp(os.path.getatime(source))
accessed = Time(tz.localize(accessed))

print("Source\n======")
print("Created: {}\nModified: {}\nAccessed: {}".format(
    created, modified, accessed))
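
The timestamp-gathering step can be wrapped in a small helper that uses only the standard library. This is a sketch under stated assumptions: get_timestamps() is a hypothetical name, and the values it returns are naive datetimes. Because datetime.fromtimestamp() called without a time zone returns the local time of the machine running the script, the zone later attached to these values presumably needs to match that machine's setting for the result to be accurate:

```python
from datetime import datetime as dt
import os


def get_timestamps(path):
    """Return created, modified, and accessed times as naive datetimes."""
    # Note: getctime() is the creation time on Windows, but the time of
    # the last metadata change on most Unix systems.
    return {
        "created": dt.fromtimestamp(os.path.getctime(path)),
        "modified": dt.fromtimestamp(os.path.getmtime(path)),
        "accessed": dt.fromtimestamp(os.path.getatime(path)),
    }


if __name__ == "__main__":
    for name, value in sorted(get_timestamps(os.curdir).items()):
        print("{}: {}".format(name, value))
```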

With the preparation complete, we can open the file with the CreateFile() method and pass the string path, representing the copied file, followed by arguments specified by the Windows API for accessing the file. Details of these arguments and their meanings can be reviewed at https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx:

handle = CreateFile(dest_file, GENERIC_WRITE, FILE_SHARE_WRITE,
                    None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)
SetFileTime(handle, created, accessed, modified)
CloseHandle(handle)

Once we have an open file handle, we can call the SetFileTime() function to update, in order, the file's created, accessed, and modified timestamps. With the destination file's timestamps set, we need to close the file handle using the CloseHandle() method. To confirm to the user that the copying of the file's timestamps was successful, we print the destination file's created, modified, and accessed times:

created = tz.localize(dt.fromtimestamp(os.path.getctime(dest_file)))
modified = tz.localize(dt.fromtimestamp(os.path.getmtime(dest_file)))
accessed = tz.localize(dt.fromtimestamp(os.path.getatime(dest_file)))
print("\nDestination\n===========")
print("Created: {}\nModified: {}\nAccessed: {}".format(
    created, modified, accessed))
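
Rather than comparing the printed values by eye, the check could also be automated. The following sketch is an optional addition, not part of the recipe: timestamps_match() is a hypothetical helper, and the one-second tolerance is an arbitrary assumption intended to absorb file systems that store timestamps at a coarse resolution:

```python
import os


def timestamps_match(source, dest, tolerance=1.0):
    """Compare the timestamps of two files, field by field."""
    getters = {
        "created": os.path.getctime,
        "modified": os.path.getmtime,
        "accessed": os.path.getatime,
    }
    results = {}
    for name, getter in getters.items():
        # Allow a small difference for coarse timestamp resolution
        results[name] = abs(getter(source) - getter(dest)) <= tolerance
    return results
```

Calling timestamps_match(source, dest_file) returns a dictionary with a Boolean for each of the created, modified, and accessed fields. On non-Windows systems the created entry reflects the metadata change time rather than the creation time, so it is only meaningful on Windows.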

The script output shows the source and destination timestamps, confirming that they were preserved during the copy.

There's more…

This script can be further improved. We have provided a couple of recommendations here:

  • Hash the source and destination files to ensure they were copied successfully. Hashing files is introduced in the hashing files and data streams recipe in the next section.
  • Output a log of the files copied and any exceptions encountered during the copying process.
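
As a rough sketch of how both recommendations might be combined, the snippet below hashes the source, copies it, hashes the destination, and logs the outcome. It is one possible approach rather than the book's implementation: the hash_file() and copy_and_verify() helpers, the logger name, the choice of SHA-256, and the chunk size are all assumptions made for illustration:

```python
import hashlib
import logging
import shutil

logger = logging.getLogger("forensic_copy")  # hypothetical logger name


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are not read into memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def copy_and_verify(source, dest_file):
    """Copy source to dest_file; return True only if the hashes match."""
    try:
        source_hash = hash_file(source)
        shutil.copy2(source, dest_file)
        dest_hash = hash_file(dest_file)
    except (IOError, OSError) as error:
        # Record the failure instead of stopping a larger copy job
        logger.error("Failed to copy %s: %s", source, error)
        return False
    if source_hash != dest_hash:
        logger.error("Hash mismatch for %s", source)
        return False
    logger.info("Copied %s to %s (%s)", source, dest_file, dest_hash)
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Both paths are hypothetical; a missing source is logged as an error
    print(copy_and_verify("evidence.bin", "evidence_copy.bin"))
```

Passing a filename argument to logging.basicConfig() would write the same messages to a log file instead of the console.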