Book Image

Learning Python for Forensics - Second Edition

By : Preston Miller, Chapin Bryce
Book Image

Learning Python for Forensics - Second Edition

By: Preston Miller, Chapin Bryce

Overview of this book

Digital forensics plays an integral role in solving complex cybercrimes and helping organizations make sense of cybersecurity incidents. This second edition of Learning Python for Forensics illustrates how Python can be used to support these digital investigations and permits the examiner to automate the parsing of forensic artifacts to spend more time examining actionable data. The second edition of Learning Python for Forensics will illustrate how to develop Python scripts using an iterative design. Further, it demonstrates how to leverage the various built-in and community-sourced forensics scripts and libraries available for Python today. This book will help strengthen your analysis skills and efficiency as you creatively solve real-world problems through instruction-based tutorials. By the end of this book, you will build a collection of Python scripts capable of investigating an array of forensic artifacts and master the skills of extracting metadata and parsing complex data structures into actionable reports. Most importantly, you will have developed a foundation upon which to build as you continue to learn Python and enhance your efficacy as an investigator.
Table of Contents (15 chapters)

Understanding scripting flow logic

Flow control logic allows us to create dynamic operations by specifying different routes of program execution based upon a series of circumstances. In any script worth its salt, some manner of flow control is present. For example, flow logic would be required to create a dynamic script that returns different results based on options selected by the user. In Python, there are two basic sets of flow logic: conditionals and loops.

Flow operators are frequently accompanied by flow logic. These operators can be strung together to create more complicated logic. The following table represents a truth table and illustrates the value of various flow operators based on the A or B variable Boolean state:

A

B

A and B

A or B

not A

not B

F

F

F

F

T

T

T

F

F

T

F

T

F

T

F

T

T

F

T

T

T

T

F

F

The logical AND and OR operators are the third and fourth columns in the table. Both A and B must be True for the AND operator to return True. Only one of the variables needs to be True for the OR operator to be True. The not operator simply switches the Boolean value of the variable to its opposite (for example, True becomes False and vice versa).

Mastering conditionals and loops will take our scripts to another level. At its core, flow logic relies on only two values, True or False. As noted earlier, in Python, these are represented by the Boolean True and False data types.

Conditionals

When a script hits a conditional, it's much like standing at a fork in the road. Depending on some factor, say a more promising horizon, you may decide to go east over west. Computer logic is less arbitrary in that if something is true the script proceeds one way, and if it is false then it will go another. These junctions are critical; if the program decides to go off the path we've developed for it, we'll be in serious trouble.

There are three statements that are used to form a conditional block: if, elif, and else. The conditional block refers to the conditional statements, their flow logic, and code. A conditional block starts with an if statement followed by flow logic, a colon, and indented line(s) of code. If the flow logic evaluates to True, then the indented code following the if statement will be executed. If it does not evaluate to True, the Python virtual machine (PVM) will skip those lines of code and go to the next line on the same level of indentation as the if statement. This is usually a corresponding elif (else-if) or else statement.

Indentation is very important in Python. It is used to demarcate code to be executed within a conditional statement or loop. A standard of four spaces for indentation is used in this book, though you may encounter code that uses a two-space indentation or uses tab characters. While all three of these practices are allowed in Python, four spaces are preferred and easier to read.

In a conditional block, once one of the statements evaluates to True, the code is executed and the PVM exits the block without evaluating the other statements.

# Conditional Block Pseudocode
if [logic]:
# Line(s) of indented code to execute if logic evaluates to True.
elif [logic]:
# Line(s) of indented code to execute if the 'if'
# statement is false and this logic is True.
else:
# Line(s) of code to catch all other possibilities if
# the 'if' and 'elif' statements are all False.

Until we define functions, we will stick to simple if statement examples:

>>> a = 5
>>> b = 22
>>> a > 0
True
>>> a > b
False
>>> if a > 0:
... print(str(a) + ' is greater than zero!')
...
5 is greater than zero!
>>> if a >= b:
... print(str(a) + ' beats ' + str(b))
...
>>>

Notice how when the flow logic evaluates to True, then the code indented following the if statement is executed. When it evaluates to False, the code is skipped. Typically, when the if statement is false, you will have a secondary statement, such as an elif or else to catch other possibilities, such as when a is less than or equal to b. However, it is important to note that we can just use an if statement without any elif or else statements.

The difference between if and elif is subtle. We can only functionally notice a difference when we use multiple if statements. The elif statement allows for a second condition to be evaluated in the case that the first isn't successful. A second if statement will be evaluated regardless of the outcome of the first if statement.

The else statement does not require any flow logic and can be treated as a catch-all case for any remaining or unaccounted for case. This does not mean, however, errors will not occur when the code in the else statement is executed. Do not rely on else statements to handle errors.

Conditional statements can be made more comprehensive by using the logical and or or operators. These allow for more complex logic in a single conditional statement:

>>> a = 5
>>> b = 22
>>> if a > 4 and a < b:
... print('Both statements must be true to print this')
...
Both statements must be true to print this
>>> if a > 10 or a < b:
... print('One of these statements must be true to print this')
...
Only one of these statements must be true to print this

The following table can be helpful to understand how common operators work:

Operator

Description

Example

Evaluation

<, >

less than, greater than

8 < 3

False

<=, >=

less than equal to, greater than equal to

5 =< 5

True

==, !=

equal to, not equal to

2 != 3

True

not

switches Boolean value

not True

False

Loops

Loops provide another method of flow control, and are suited to perform iterative tasks. A loop will repeat inclusive code until the provided condition is no longer True or an exit signal is provided. There are two kinds of loops: for and while. For most iterative tasks, a for loop will be the best option to use.

The for loop

for loops are the most common and, in most cases, the preferred method to perform a task over and over again. Imagine a factory line; for each object on the conveyor belt, a for loop could be used to perform some task on it, such as placing a label on the object. In this manner, multiple for loops can come together in the form of an assembly line, processing each object, until they are ready to be presented to the user.

Much like the rest of Python, the for loop is very simple syntactically, yet powerful. In some languages, a for loop needs to be initialized, have a counter of sorts, and a termination case. Python's for loop is much more dynamic and handles these tasks on its own. These loops contain indented code that is executed line by line. If the object being iterated over still has elements (for example, more items to process) at the end of the indented block, the PVM will position itself at the beginning of the loop and repeat the code again.

The for loop syntax will specify the object to iterate over and what to call each of the elements within the object. Note that the object must be iterable. For example, lists, sets, tuples, and strings are iterable, but an integer is not. In the following example, we can see how a for loop treats strings and lists and helps us iterate over each element in iterable objects:

>>> for character in 'Python':
... print(character)
...
P
y
t
h
o
n
>>> cars = ['Volkswagon', 'Audi', 'BMW']
>>> for car in cars:
... print(car)
...
Volkswagon
Audi
BMW

There are additional, more advanced, ways to call a for loop. The enumerate() function can be used to start an index. This comes in handy when you need to keep track of the index of the current loop. Indexes are incremented at the beginning of the loop. The first object has an index of 0, the second has an index of 1, and so on. The range() function can execute a loop a certain number of times and provide an index:

>>> numbers = [5, 25, 35]
>>> for i, x in enumerate(numbers):
... print('Item', i, 'from the list is:', x)
...
Item 0 from the list is: 5
Item 1 from the list is: 25
Item 2 from the list is: 35
>>> for x in range(0, 100):
... print(x)
0
1
# continues to print 0 to 100 (omitted in an effort to save trees)

The while loop

while loops are not encountered as frequently in Python. A while loop executes as long as a statement is true. The simplest while loop would be a while True statement. This kind of loop would execute forever since the Boolean object True is always True and so the indented code would continually execute.

If you are not careful, you can inadvertently create an infinite loop, which will wreak havoc on your script's intended functionality. It is imperative to utilize conditionals to cover all your bases such as if, elif, and else statements. If you fail to do so, your script can enter an unaccounted situation and crash. This is not to say that while loops are not worth using. while loops are quite powerful and have their own place in Python:

>>> guess = 0
>>> answer = 42
>>> while True:
... if guess == answer:
... print('You've found the answer to this loop: ' + str(answer))
... break
... else:
... print(guess, 'is not the answer.')
... guess += 1

The break, continue, and pass statements are used in conjunction with for and while loops to create more dynamic loops. The break escapes from the current loop, while the continue statement causes the PVM to begin executing code at the beginning of the loop, skipping any indented code following the continue statement. The pass statement literally does nothing and acts as a placeholder. If you're feeling brave or bored, or worse, both, remove the break statement from the previous example and note what happens.