Python Digital Forensics Cookbook

By: Chapin Bryce, Preston Miller

Overview of this book

Technology plays an increasingly large role in our daily lives and shows no sign of stopping. Now, more than ever, it is paramount that an investigator develops programming expertise to deal with increasingly large datasets. By leveraging the Python recipes explored throughout this book, we make the complex simple, quickly extracting relevant information from large datasets. You will explore, develop, and deploy Python code and libraries to provide meaningful results that can be immediately applied to your investigations. Throughout the Python Digital Forensics Cookbook, recipes include topics such as working with forensic evidence containers, parsing mobile and desktop operating system artifacts, extracting embedded metadata from documents and executables, and identifying indicators of compromise. You will also learn to integrate scripts with Application Program Interfaces (APIs) such as VirusTotal and PassiveTotal, and tools such as Axiom, Cellebrite, and EnCase. By the end of the book, you will have a sound understanding of Python and how you can use it to process artifacts in your investigations.
Table of Contents (11 chapters)

Keeping track with a progress bar

Recipe Difficulty: Easy

Python Version: 2.7 or 3.5

Operating System: Any

Long-running scripts are unfortunately commonplace when processing data measured in gigabytes or terabytes. While your script may be processing this data smoothly, a user may think it's frozen after three hours with no indication of progress. Luckily, several developers have built an incredibly simple progress bar library, giving us little excuse for not incorporating this into our code.

Getting started

This recipe requires the installation of the third-party module tqdm. All other libraries used in this script are present in Python's standard library. The tqdm library, pronounced taqadum, can be installed via pip or downloaded from GitHub at https://github.com/tqdm/tqdm. To use all of the features shown in this recipe, ensure you are using release 4.11.2, available on the tqdm GitHub page or with pip using the following command:

pip install tqdm==4.11.2

How to do it…

To create a simple progress bar, we follow these steps:

  1. Import tqdm and time.
  2. Create multiple examples with tqdm and loops.

How it works…

As with all other recipes, we begin with the imports. While we only need the tqdm import to enable the progress bars, we will use the time module to slow down our script so the progress bar is easier to see. We use a list of fruits as our sample data and identify which fruits contain "berry" or "berries" in their name:

from __future__ import print_function
from time import sleep
import tqdm

fruits = [
    "Acai", "Apple", "Apricots", "Avocado", "Banana", "Blackberry",
    "Blueberries", "Cherries", "Coconut", "Cranberry", "Cucumber",
    "Durian", "Fig", "Grapefruit", "Grapes", "Kiwi", "Lemon", "Lime",
    "Mango", "Melon", "Orange", "Papaya", "Peach", "Pear", "Pineapple",
    "Pomegranate", "Raspberries", "Strawberries", "Watermelon"
]

The following for loop iterates through our list of fruits, checking whether the substring berr appears in the fruit's name before sleeping for one-tenth of a second. By wrapping the iterable in a tqdm() call, we automatically get a progress bar showing the percentage complete, elapsed time, remaining time, the number of iterations complete, and total iterations.

These display options are the defaults for tqdm, and the library gathers the necessary information from properties of our list object. It reads the length of the list and calculates the rest from the time taken per iteration and the number of iterations completed so far:

contains_berry = 0
for fruit in tqdm.tqdm(fruits):
    if "berr" in fruit.lower():
        contains_berry += 1
    sleep(.1)
print("{} fruit names contain 'berry' or 'berries'".format(contains_berry))
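The arithmetic behind those default statistics can be sketched in a few lines. This is a simplified illustration, not tqdm's actual implementation (the library also smooths its rate estimate), and the helper name estimate_progress is hypothetical:

```python
from __future__ import division, print_function


def estimate_progress(completed, total, elapsed_seconds):
    """Return (percent, rate, remaining_seconds) for a loop in progress.

    Simplified sketch only; tqdm additionally smooths the rate.
    """
    if completed <= 0 or elapsed_seconds <= 0:
        # No measurements yet, so rate and remaining time are unknown.
        return 0.0, None, None
    percent = 100.0 * completed / total
    rate = completed / elapsed_seconds        # iterations per second
    remaining = (total - completed) / rate    # seconds left at this rate
    return percent, rate, remaining


# 10 of 29 fruits processed after roughly one second (0.1 s sleep each).
percent, rate, remaining = estimate_progress(10, 29, 1.0)
print("{:.0f}% done, {:.1f} it/s, about {:.1f}s remaining".format(
    percent, rate, remaining))
```

Everything except the total can be measured inside the loop, which is why the length of the list is the only thing the library needs up front.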

Extending the progress bar beyond the default configuration is as easy as specifying keyword arguments. The progress bar object can also be created before the loop starts, with the list object, fruits, as the iterable argument. The following code shows how we can define our progress bar with our list, a description, and a unit name.

If we were not using a list but another type of iterable that does not define a __len__() method, such as a generator, we would need to supply the number of iterations manually with the total keyword. If the total number of iterations is unavailable, only basic statistics about elapsed time and iterations per second are displayed.

Once we are in the loop, we can display the number of results discovered using the set_postfix() method. Each iteration will provide an update of the number of hits we have found to the right of the progress bar:

contains_berry = 0
pbar = tqdm.tqdm(fruits, desc="Reviewing names", unit="fruits")
for fruit in pbar:
    if "berr" in fruit.lower():
        contains_berry += 1
    pbar.set_postfix(hits=contains_berry)
    sleep(.1)
print("{} fruit names contain 'berry' or 'berries'".format(contains_berry))
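The earlier point about iterables without a length can be illustrated with a generator. The following is a minimal sketch; read_names() is a hypothetical stand-in for any data source that cannot report its own length, and the short list of names is sample data chosen for the example:

```python
from __future__ import print_function

import tqdm


def read_names(names):
    """Hypothetical generator standing in for a source with no len()."""
    for name in names:
        yield name


names = ["Blackberry", "Cherries", "Cranberry", "Kiwi"]

hits = 0
# A generator has no length, so we pass the expected count via total=.
for name in tqdm.tqdm(read_names(names), total=len(names), unit="fruits"):
    if "berr" in name.lower():
        hits += 1
print("{} hits".format(hits))
```

Removing the total argument should still work, but the bar would then fall back to the reduced display described above, since the percentage and remaining time cannot be calculated.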

One other common use case for progress bars is to measure execution over a range of integers. Since this is such a common use of the library, the developers built in a shortcut called trange(), which wraps range() in a progress bar. Notice how we can specify the same arguments here as before. One new argument that we will use here, due to the larger numbers, is unit_scale, which abbreviates large numbers into a small number followed by a letter designating the magnitude:

for i in tqdm.trange(10000000, unit_scale=True, desc="Trange: "):
    pass
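To make the effect of unit_scale concrete, here is a rough sketch of the kind of abbreviation it performs. The abbreviate() helper is hypothetical and written only for illustration; tqdm's own rounding rules and suffix letters may differ in detail:

```python
from __future__ import print_function


def abbreviate(count):
    """Hypothetical helper: shorten a count with a magnitude suffix."""
    value = float(count)
    for suffix in ["", "k", "M", "G", "T"]:
        if abs(value) < 1000:
            return "{:.1f}{}".format(value, suffix)
        value /= 1000.0  # move up one magnitude and try again
    return "{:.1f}P".format(value)


print(abbreviate(10000000))  # 10.0M
print(abbreviate(2500))      # 2.5k
```

With ten million iterations, a counter in this shortened form is far easier to read at a glance than the full eight-digit number.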

When we execute the code, three progress bars are displayed. The first uses the default format, while the second and third show the customizations we have added.

There's more…

This script can be further improved. Here's a recommendation:

  • Further explore the capabilities the tqdm library affords developers. Consider using the tqdm.write() method to print status messages without breaking the progress bar.
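As a starting point for that recommendation, the sketch below reports each hit with tqdm.write() while the bar is running. The short list of names is sample data chosen for the example, and the message format is an arbitrary choice:

```python
from __future__ import print_function
from time import sleep

import tqdm

names = ["Apple", "Blackberry", "Cranberry", "Kiwi"]
found = []

for name in tqdm.tqdm(names, desc="Reviewing names", unit="fruits"):
    if "berr" in name.lower():
        found.append(name)
        # tqdm.write() prints the message above the bar rather than
        # splitting the bar across lines as a plain print() can.
        tqdm.tqdm.write("Hit: {}".format(name))
    sleep(.1)

print("{} hits".format(len(found)))
```

Swapping tqdm.tqdm.write() for print() in the same loop is a quick way to see the difference in the terminal.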