Mastering Python Forensics

Overview of this book

Digital forensic analysis is the process of extracting digital data and examining it. Python has the combination of power, expressiveness, and ease of use that makes it an essential complement to the traditional, off-the-shelf digital forensic tools. This book will teach you how to perform forensic analysis and investigations by exploring the capabilities of various Python libraries. The book starts by explaining the building blocks of the Python programming language, especially ctypes in depth, along with how to automate typical tasks in file system analysis, common correlation tasks to discover anomalies, and templates for investigations. Next, we’ll show you cryptographic algorithms that can be used during forensic investigations to check for known files or to compare suspicious files with online services such as VirusTotal or Mobile-Sandbox. Moving on, you’ll learn how to sniff network traffic, generate and analyze network flows, and perform log correlation with the help of Python scripts and tools. You’ll get to know the concepts of virtualization and how virtualization influences IT forensics, and you’ll discover how to perform forensic analysis of a jailbroken/rooted mobile device that is based on iOS or Android. Finally, the book teaches you how to analyze volatile memory and search for known malware samples based on YARA rules.

Introduction to Python ctypes


According to the official Python documentation, ctypes is a foreign function library that provides C compatible data types and allows calling functions in DLLs or shared libraries. A foreign function library means that the Python code can call C functions using only Python, without requiring special or custom-made extensions.

This module is one of the most powerful libraries available to the Python developer. The ctypes library not only enables you to call functions in dynamically linked libraries (as described earlier), but can also be used for low-level memory manipulation. It is important that you understand the basics of how to use the ctypes library, as it will be used in many examples and real-world cases throughout the book.

In the following sections, we will introduce some basic features of Python ctypes and how to use them.
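As a small taste of the low-level memory manipulation mentioned above, the following sketch allocates a raw buffer and writes to it with the helpers that ctypes provides. The buffer size and the byte values are arbitrary choices for illustration, and the code is written so that it should also run on Python 3:

```python
import ctypes

# Allocate a mutable, zero-initialised buffer of 8 bytes.
buf = ctypes.create_string_buffer(8)

# Fill the first four bytes with 0x41 ('A'), as C's memset would.
ctypes.memset(buf, 0x41, 4)
after_memset = buf.raw

# Copy four bytes into the second half of the buffer, as C's memmove would.
ctypes.memmove(ctypes.addressof(buf) + 4, b"EVID", 4)
after_memmove = buf.raw

print(repr(after_memset))
print(repr(after_memmove))
```

The raw attribute returns the complete buffer contents, including any NUL bytes, which is usually what you want when inspecting binary data.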

Working with Dynamic Link Libraries

Python ctypes exports the cdll object and, on Windows, the windll and oledll objects for loading dynamic link libraries. A dynamically linked library is a compiled binary that is linked to the main process of the executable at runtime. On Windows platforms, these binaries are called Dynamic Link Libraries (DLL), and on Linux, they are called shared objects (SO). You can load these libraries by accessing them as attributes of the cdll, windll, or oledll objects. We will now demonstrate a very brief example for Windows and Linux that gets the current time directly from the time function in libc (this library defines the system calls and other basic facilities such as open, printf, or exit).

Note that on Windows, msvcrt is the Microsoft standard C library, which contains most of the standard C functions and uses the cdecl calling convention (on Linux systems, the equivalent library is libc.so.6):

C:\Users\Admin>python

>>> from ctypes import *
>>> libc = cdll.msvcrt
>>> print libc.time(None)
1428180920

Windows appends the usual .dll file suffix automatically. On Linux, you must specify the filename, including the extension, to load the chosen library, so attribute access cannot be used. Instead, either use the LoadLibrary() method of the DLL loaders or create an instance of CDLL by calling its constructor, as shown in the following code:

(labenv)user@lab:~$ python

>>> from ctypes import *
>>> libc = CDLL("libc.so.6")
>>> print libc.time(None)
1428180920

As shown in these two examples, it is very easy to load a dynamic library and call a function that it exports. You will be using this technique many times throughout the book, so it is important that you understand how it works.
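The two sessions above can be folded into one script that picks the library by platform. The following is a minimal sketch, not the book's own code: load_libc is a hypothetical helper name, ctypes.util.find_library is used to locate the C library on POSIX systems, and the declared return type assumes that time_t fits into a C long on your platform:

```python
import ctypes
import ctypes.util
import os
import time


def load_libc():
    """Return a handle to the C standard library (hypothetical helper)."""
    if os.name == "nt":
        return ctypes.cdll.msvcrt
    # find_library() may return None; CDLL(None) then falls back to the
    # symbols already loaded into the current process (POSIX only).
    return ctypes.CDLL(ctypes.util.find_library("c"))


libc = load_libc()

# Declare the prototype instead of relying on the default int return type.
libc.time.argtypes = [ctypes.c_void_p]
libc.time.restype = ctypes.c_long  # assumption: time_t fits in a C long

now = libc.time(None)  # None is passed as a NULL pointer

# Python's own clock should agree to within a second or so.
print(now, int(time.time()))
```

Setting argtypes and restype is optional for this call, but it documents the C prototype and lets ctypes reject arguments of the wrong type before the C function is ever entered.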

C data types

When looking at the two examples from the earlier section in detail, you can see that we pass None as the parameter to a function in a dynamically linked C library. This is possible because None, integers, longs, byte strings, and unicode strings are the native Python objects that can be used directly as parameters in these function calls. None is passed as a C NULL pointer; byte strings and unicode strings are passed as pointers to the memory block that contains their data (char * or wchar_t *). Python integers and Python longs are passed as the platform's default C int type, and their value is masked to fit into the C type. A complete overview of the Python types and their corresponding ctypes types can be seen in Table 1:

| ctypes type | C type | Python type |
|---|---|---|
| c_bool (https://docs.python.org/2/library/ctypes.html#ctypes.c_bool) | _Bool | bool (1) |
| c_char (https://docs.python.org/2/library/ctypes.html#ctypes.c_char) | char | 1-character string |
| c_wchar (https://docs.python.org/2/library/ctypes.html#ctypes.c_wchar) | wchar_t | 1-character unicode string |
| c_byte (https://docs.python.org/2/library/ctypes.html#ctypes.c_byte) | char | int/long |
| c_ubyte (https://docs.python.org/2/library/ctypes.html#ctypes.c_ubyte) | unsigned char | int/long |
| c_short (https://docs.python.org/2/library/ctypes.html#ctypes.c_short) | short | int/long |
| c_ushort (https://docs.python.org/2/library/ctypes.html#ctypes.c_ushort) | unsigned short | int/long |
| c_int (https://docs.python.org/2/library/ctypes.html#ctypes.c_int) | int | int/long |
| c_uint (https://docs.python.org/2/library/ctypes.html#ctypes.c_uint) | unsigned int | int/long |
| c_long (https://docs.python.org/2/library/ctypes.html#ctypes.c_long) | long | int/long |
| c_ulong (https://docs.python.org/2/library/ctypes.html#ctypes.c_ulong) | unsigned long | int/long |
| c_longlong (https://docs.python.org/2/library/ctypes.html#ctypes.c_longlong) | __int64 or long long | int/long |
| c_ulonglong (https://docs.python.org/2/library/ctypes.html#ctypes.c_ulonglong) | unsigned __int64 or unsigned long long | int/long |
| c_float (https://docs.python.org/2/library/ctypes.html#ctypes.c_float) | float | float |
| c_double (https://docs.python.org/2/library/ctypes.html#ctypes.c_double) | double | float |
| c_longdouble (https://docs.python.org/2/library/ctypes.html#ctypes.c_longdouble) | long double | float |
| c_char_p (https://docs.python.org/2/library/ctypes.html#ctypes.c_char_p) | char * (NUL terminated) | string or None |
| c_wchar_p (https://docs.python.org/2/library/ctypes.html#ctypes.c_wchar_p) | wchar_t * (NUL terminated) | unicode or None |
| c_void_p (https://docs.python.org/2/library/ctypes.html#ctypes.c_void_p) | void * | int/long or None |

Table 1: Fundamental Data Types
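To make the mapping in Table 1 more concrete, here is a short sketch of how values are wrapped in ctypes types and read back through the value attribute. The variable names and numbers are made up for illustration, and byte strings are used so that the code should also run on Python 3:

```python
import ctypes

# Wrapping a Python value yields a mutable ctypes object.
number = ctypes.c_int(4711)
ratio = ctypes.c_double(47.11)
label = ctypes.c_char_p(b"evidence")

# The value attribute reads the wrapped value back, or replaces it.
number.value = 42

# Integer types do no overflow checking: the value is masked to the
# width of the C type.
wrapped_ubyte = ctypes.c_ubyte(300).value    # 300 & 0xFF
wrapped_ushort = ctypes.c_ushort(-3).value   # -3 & 0xFFFF

print(number.value, ratio.value, label.value)
print(wrapped_ubyte, wrapped_ushort)
```

The masking in the last two assignments is the same behaviour the text describes for plain Python integers that are passed to a C function.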

This table is very helpful because all Python types except integers, strings, and unicode strings have to be wrapped in their corresponding ctypes type so that they can be converted to the required C data type in the linked library. If you pass an unwrapped value of any other type, such as a float, ctypes raises an ArgumentError caused by a TypeError, as shown in the following code:

(labenv)user@lab:~$ python

>>> from ctypes import *
>>> libc = CDLL("libc.so.6")
>>> printf = libc.printf

>>> printf("An int %d, a double %f\n", 4711, 47.11)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ctypes.ArgumentError: argument 3: <type 'exceptions.TypeError'>: Don't know how to convert parameter 3

>>> printf("An int %d, a double %f\n", 4711, c_double(47.11))
An int 4711, a double 47.110000
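The same behaviour can be checked from a script. The sketch below swaps printf for snprintf so that the formatted text lands in a Python-visible buffer instead of on the terminal. It assumes a POSIX system where ctypes.util.find_library can locate the C library and where variadic C functions can be called without a declared prototype, as in the session above. The format string is a byte string so that the code should also run on Python 3:

```python
import ctypes
import ctypes.util

# Assumption: POSIX system; CDLL(None) is the fallback if the lookup fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

buf = ctypes.create_string_buffer(64)
size = ctypes.c_size_t(ctypes.sizeof(buf))

# An unwrapped Python float cannot be converted automatically.
try:
    libc.snprintf(buf, size, b"%f", 47.11)
    conversion_error = None
except ctypes.ArgumentError as exc:
    conversion_error = exc

# Wrapped in c_double, the same value is accepted.
written = libc.snprintf(buf, size, b"An int %d, a double %f",
                        4711, ctypes.c_double(47.11))

print(conversion_error)
print(buf.value, written)
```

Catching ctypes.ArgumentError, rather than TypeError, matches the exception class shown in the traceback above.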

Defining Unions and Structures

Unions and Structures are important data types because they are frequently used throughout the libc on Linux and also in the Microsoft Win32 API.

A union is simply a group of variables, which can be of the same or different data types, whose members all share the same memory location. By storing variables in this way, a union allows you to read the same value as different types. For the upcoming example, we will change from the interactive Python shell to the Atom editor in our Ubuntu lab environment. Open the Atom editor, type in the following code, and save it under the name new_evidence.py:

from ctypes import *

class case(Union):
    _fields_ = [
        ("evidence_int", c_int),
        ("evidence_long", c_long),
        ("evidence_char", c_char * 4)
    ]

value = raw_input("Enter new evidence number:")
new_evidence = case(int(value))
print "Evidence number as a int: %i" % new_evidence.evidence_int
print "Evidence number as a long: %ld" % new_evidence.evidence_long
print "Evidence number as a char: %s" % new_evidence.evidence_char

If you assign the member variable evidence_int of the case union a value of 42, you can then use the evidence_char member to display the character representation of that number, as shown in the following example:

(labenv)user@lab:~$ python new_evidence.py

Enter new evidence number:42

Evidence number as a int: 42
Evidence number as a long: 42
Evidence number as a char: *

As you can see in the preceding example, by assigning the union a single value, you get three different representations of that value. For int and long, the displayed output is obvious, but for the evidence_char variable, it could be a bit confusing. In this case, '*' is the ASCII character with the decimal value 42; on a little-endian machine, this low-order byte is stored first, and the zero bytes that follow it terminate the string. The evidence_char member variable is also a good example of how to define an array in ctypes. In ctypes, an array is defined by multiplying a type by the number of elements that you want to allocate in the array. In this example, a four-element character array was defined for the member variable evidence_char.
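The shared memory location can also be inspected directly. The following sketch reuses the union layout from new_evidence.py under the class name Case; it checks the size and member offsets of the union and shows that writing one member changes what another member reads back. The value 0x41424344 is an arbitrary choice whose bytes are the ASCII letters A to D, and the order in which they appear depends on the byte order of your CPU:

```python
import ctypes
import sys


class Case(ctypes.Union):
    _fields_ = [
        ("evidence_int", ctypes.c_int),
        ("evidence_long", ctypes.c_long),
        ("evidence_char", ctypes.c_char * 4),
    ]


new_evidence = Case(42)  # initialises the first member, evidence_int

# All members start at the same address, so the union is only as large
# as its largest member.
union_size = ctypes.sizeof(Case)
offsets = (Case.evidence_int.offset,
           Case.evidence_long.offset,
           Case.evidence_char.offset)

# Writing one member changes what the others read back.
new_evidence.evidence_int = 0x41424344
chars = new_evidence.evidence_char  # byte order depends on the CPU

print(union_size, offsets)
print(repr(chars), sys.byteorder)
```

On a little-endian machine, the characters appear in reverse order because the least significant byte of the integer is stored at the lowest address.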

A structure is very similar to a union, but its members do not share the same memory location. You can access any of the member variables in the structure using dot notation, such as case.name, which accesses the name variable contained in the case structure. The following is a very brief example of how to create a structure (or struct, as they are often called) with three members, name, number, and investigator_name, all of which can be accessed using dot notation:

from ctypes import *

class case(Structure):
    _fields_ = [
        ("name", c_char * 16),
        ("number", c_int),
        ("investigator_name", c_char * 8)
    ]
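To see the dot notation in action, the following sketch creates an instance of the same structure layout under the class name Case and fills it with made-up sample values. Byte strings are used for the character arrays so that the code should also run on Python 3:

```python
import ctypes


class Case(ctypes.Structure):
    _fields_ = [
        ("name", ctypes.c_char * 16),
        ("number", ctypes.c_int),
        ("investigator_name", ctypes.c_char * 8),
    ]


# Positional arguments fill the fields in declaration order.
case = Case(b"stolen-laptop", 42, b"alice")

# Dot notation reads and writes individual members.
case.number += 1

# Unlike a union, each member has its own offset inside the structure.
offsets = [getattr(Case, field_name).offset
           for field_name, _ in Case._fields_]

print(case.name, case.number, case.investigator_name)
print(offsets, ctypes.sizeof(Case))
```

Because the members are laid out one after another, the size of the structure is at least the sum of the sizes of its members, whereas the union shown earlier is only as large as its largest member.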
