Python data types

Like most programming languages, Python comes with a set of standard data types. In this section, we will explore the data types that Python makes available for us to use.

Numbers

Numbers, as the name suggests, cover all the numeric data types, including both integer and floating-point types. Earlier in this chapter, we saw that to use an integer or a float, we can simply declare a variable and assign it an integer or a float value. Now, let's write a proper Python script and explore how to use numbers. Name the script numbers.py, which is shown as follows:

The preceding screenshot shows a simple Python script that adds an integer to a float and then prints the sum. To run the script, we can type the python3 numbers.py command, as follows:

You might have noticed that the line at the beginning of the script says #! /usr/bin/python. This line, known as a shebang, does not make the code executable by itself. Instead, once the permissions of the script have been changed to make it executable (for example, with chmod +x numbers.py), the shebang tells the operating system which interpreter to run the script with. For Python 3, this is typically python3, which is placed in the /usr/bin/python3 path. This can be seen in the following example:

If we observe the print call, we can see that the format specifier is %s. To fill it in with the actual value, the value is supplied after the % operator that follows the format string:
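The original listing is not reproduced here, so the following is only a minimal sketch of what numbers.py might contain. The variable names and values are assumptions chosen for illustration:

```python
#!/usr/bin/python3
# numbers.py: hypothetical reconstruction; names and values are illustrative

a = 10          # an integer
b = 2.5         # a float
total = a + b   # adding an int to a float yields a float

# %s is replaced by the value supplied after the % operator
message = "The sum is %s" % total
print(message)
```

Running python3 numbers.py with these values prints the formatted sum on the console.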

To convert a string into its equivalent integer or float value, we can use the built-in int() and float() functions.
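As a small illustration of these conversions, the following sketch uses made-up input strings. It also shows that a string that does not represent a number raises a ValueError:

```python
# Converting numeric strings with the built-in int() and float() functions
count = int("42")        # str -> int
price = float("3.14")    # str -> float

# A string that is not a valid number raises ValueError
try:
    int("forty-two")
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(count + 1, price * 2, conversion_failed)
```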

String types

We know that a string is a collection of characters. In Python, string types come under the sequence category. Strings are really powerful and have many methods that can be used to perform string manipulation operations. Let's look at the following piece of code, which introduces us to strings in Python. Strings can be declared within both single and double quotes in Python:

In the preceding code, we are simply declaring a string called my_str and printing it on the console window.

String indexes

It must be noted that strings can be accessed as a sequence of characters in Python. Strings can be thought of as a list of characters. Let's try to print the characters at various indices of the string, as shown in the following screenshot:

At index 0, the character W gets printed. At index 10, we have a space, while at index 5, we have the letter m. It should be noted that sequences are stored in Python with a starting index of 0, and the same holds true for the string type.
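The indexing described above can be sketched as follows, assuming that my_str holds the Welcome to python strings ! value used throughout this section:

```python
my_str = "Welcome to python strings !"

first = my_str[0]     # the first character
sixth = my_str[5]     # the character at index 5
eleventh = my_str[10] # the character at index 10 (a space)
last = my_str[-1]     # negative indices count from the end

print(repr(first), repr(sixth), repr(eleventh), repr(last))
```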

String operations through methods and built-in functions

In this section, we will look at how to compare two strings, concatenate strings, copy one string to another, and perform various string manipulation operations with the help of some methods.

The replace() method

The replace method is used to perform string replacement. It returns a new string with the appropriate replacements. The first argument to the replace method is the string or character to be replaced within the string, while the second argument is the string or character with which it is to be replaced:

In the preceding example, we can see that the ! from the original string is replaced by @ and a new string with the replacement is returned. It should be noted that these changes were not actually made to the original string; instead, a new string was returned with the appropriate changes. This can be verified in the following line, where we print the original string and the old unchanged value, Welcome to python strings !, is printed. The reason behind this is that strings in Python are immutable, just like they are in Java. This means that once a string object is created, it can't be modified. A variable that refers to a string, however, can be reassigned to a different string object. Let's assign the result of replace() back to the originally declared variable, my_str, as follows:

In the preceding code, it looks as though we modified the original string, because we stored the newly returned string from the replace method in our earlier declared variable, my_str. This might sound contradictory to what we said previously, but the original string object was never changed. Let's take a look at how this works by looking at what happens behind the scenes before and after we call the replace method:

After replacing the ! with @, this will look as follows:

It can be seen in the preceding two illustrations that before the replace method was called, the my_str string reference pointed toward the actual object that contained an !. Once the replace() method returned a new string and we updated the existing string variable with the newly returned object, the older memory object was not overwritten, but instead a new one was created. The program reference now points toward the newly created object. The earlier object is in memory and doesn't have any references pointing toward it. This will be cleaned up by the garbage collector at a later stage.
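The behavior described above can be observed with a short sketch. The extra original variable is an assumption added here only to keep a second reference to the first string object:

```python
my_str = "Welcome to python strings !"
original = my_str                      # a second reference to the same object

# replace() returns a new string; my_str itself is untouched
returned = my_str.replace("!", "@")
unchanged = (my_str == "Welcome to python strings !")

# Rebinding the name makes my_str point at the new object
my_str = my_str.replace("!", "@")

print(original)             # the old object still holds the "!"
print(my_str)               # the name now refers to the new string
print(my_str is original)   # the two names refer to different objects
```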

Another thing we can do is try and change any character in any position of the original string. We have already seen that the string characters can be accessed by their index, but if we try to update or change a character at any specific index, a TypeError exception will be raised and the operation will not be permitted, as shown in the following screenshot:
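A minimal sketch of this failure, catching the exception so that the script keeps running, could look as follows:

```python
my_str = "Welcome to python strings !"

try:
    my_str[0] = "w"          # strings do not support item assignment
except TypeError as exc:
    error = str(exc)
    print("Not permitted:", error)
```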

By default, the replace() method replaces all the occurrences of the search string within the target string. If we only want to replace the first one or two occurrences, however, we can pass a third argument to the replace() method that specifies the maximum number of replacements to make. Let's say we have the following string:

If we just want the first occurrence of the ! character to be @ and we want the rest to be the same, this can be achieved as follows:
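The original string is not shown here, so the sketch below assumes a hypothetical string that contains three ! characters:

```python
# Hypothetical string with several "!" characters
my_str = "Welcome ! to python ! strings !"

all_replaced = my_str.replace("!", "@")       # every occurrence
first_only = my_str.replace("!", "@", 1)      # at most one replacement

print(all_replaced)
print(first_only)
```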

Substrings or string slicing

Obtaining part of a string is a common exercise that we come across frequently in day-to-day string operations. Languages such as C++ or Java provide us with dedicated methods such as substr(pos, len) or substring(beginIndex, endIndex). To perform the substring operation in Python, there is no dedicated method, but we can instead use slicing. For example, if we wish to get the first four characters of our original my_str string, we can achieve this by using operations such as my_str[0:4], as shown in the following screenshot:

Again, the slice operation returns a new string and the changes are not applied to the original string. Furthermore, it is worth understanding here that the slice stops at index n-1, where n is the upper limit specified as the second parameter, which is four in our case. Thus, the actual substring operation will be performed starting from index 0 and ending at index 3, returning the string Welc.

Let's take a look at some more examples of slicing:

To get the rest of the string from index 4 onwards, do the following:

To get the string from the start up to, but not including, index 4, do the following:

To print the whole string with slicing, do the following:

To print the characters with a step of 2, do the following:

To print the reverse of the string, do the following:

To print a part of the string in reverse order, do the following:
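The slicing cases listed above can be sketched together as follows, again assuming the my_str value used earlier. The choice of index 6 for the partial reverse is an illustrative assumption:

```python
my_str = "Welcome to python strings !"

first_four = my_str[0:4]      # indices 0 to 3
from_four = my_str[4:]        # index 4 to the end
up_to_four = my_str[:4]       # start up to, but not including, index 4
whole = my_str[:]             # the whole string
every_second = my_str[::2]    # every second character
reversed_str = my_str[::-1]   # the whole string reversed
part_reversed = my_str[6::-1] # from index 6 back to the start

for piece in (first_four, from_four, up_to_four, whole,
              every_second, reversed_str, part_reversed):
    print(piece)
```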

String concatenation and replication

+ is the concatenation operator that's used in Python to concatenate two strings. As always, the result of the concatenation is a new string, and unless we assign that result to a variable, the original string objects remain unchanged. The + operator is overloaded to perform concatenation when it is used on string types. It is also used for the addition of two numbers when used on numeric data types, like so:

Interestingly, Python also overloads another operator for string data types: the multiplication operator, *. This conventionally performs the multiplication of numeric data types, but when it is used with a string and an integer, it performs a replication operation instead, repeating the string the given number of times. This is shown in the following code snippet:

In the preceding case, the multiplication operator actually replicates the Hello world string stored in the c variable five times, as we specified in the expression. This is a very handy operation and can be used to generate fuzzing payloads, which we will see in the later chapters of this book.
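A minimal sketch of both operators is shown below. The a and b names are assumptions, while c mirrors the variable mentioned in the text:

```python
a = "Hello "
b = "world"

c = a + b          # concatenation produces a new string
total = 2 + 3      # the same operator adds numbers

repeated = c * 5   # replication repeats the string five times

print(c)
print(total)
print(repeated)
```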

The strip(), lstrip(), and rstrip() methods

The strip method is used to strip off whitespace from the input string. By default, the strip method will strip off the whitespace from both the left and right sides of the string and will return a new string without leading or trailing whitespace, as shown in the following screenshot:

However, if we only wish to strip off the left spaces, we can use the lstrip() method. Similarly, if we just wish to strip off the right spaces, we can use the rstrip() method. This is shown as follows:
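The three methods can be compared side by side with a small sketch. The padded string is an illustrative assumption:

```python
padded = "   Hello world   "

both = padded.strip()     # removes leading and trailing whitespace
left = padded.lstrip()    # removes leading whitespace only
right = padded.rstrip()   # removes trailing whitespace only

# repr() makes the remaining spaces visible
print(repr(both), repr(left), repr(right))
```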

The split() method

The split method, as the name suggests, is used to split the input string over a particular delimiter and return a list that contains the words that have been split. We will be looking at lists in more detail shortly. For now, let's take a look at the following example, where we have the name, the age, and the salary of an employee in a string separated by commas. If we wish to obtain this information separately, we can perform a split over the comma (,). The split method takes the delimiter on which the split operation is to be performed as its first argument:

If a delimiter is not specified, the split operation is performed over whitespace by default, with consecutive whitespace characters treated as a single separator. This can be seen as follows:
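Both forms of split() can be sketched as follows. The employee record is made up for illustration:

```python
# Hypothetical employee record: name, age, salary
record = "John,29,50000"
name, age, salary = record.split(",")   # split over a comma

# With no delimiter, split() breaks on runs of whitespace
words = "Welcome  to python   strings".split()

print(name, age, salary)
print(words)
```

Note that the split values are still strings, so age and salary would need int() or float() before being used in arithmetic.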

The find(), index(), upper(), lower(), len(), and count() methods

The find() method is used to search for a character or string within our target string. It returns the index of the first occurrence if a match is found, and -1 if it does not find a match:

The index() method is similar to the find() method. It returns the index of the first occurrence if it finds a match, but raises a ValueError exception if it does not find a match:

The upper() method is used to transform the input string to uppercase letters and the lower() method is used to transform a given string to lowercase letters:

The built-in len() function returns the length of the given string:

The count() method returns the number of occurrences of any character or string that we wish to count within the target string:
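The methods covered in this subsection can be sketched together as follows, using the my_str value from earlier. The search terms are illustrative assumptions:

```python
my_str = "Welcome to python strings !"

found_at = my_str.find("python")    # index of the first match
not_found = my_str.find("java")     # -1 when there is no match

try:
    my_str.index("java")            # index() raises instead of returning -1
    index_failed = False
except ValueError:
    index_failed = True

upper = my_str.upper()
lower = my_str.lower()
length = len(my_str)
o_count = my_str.count("o")         # number of occurrences of "o"

print(found_at, not_found, index_failed, upper, lower, length, o_count)
```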

The in and not in operators

The in and not in operators are very handy, as they let us perform a quick search on sequences. If we wish to check whether a certain character or word is present or not present in the target string, we can use the in and not in operators. The in operator returns True if the word is present and False otherwise, while not in returns the opposite:
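A short sketch of these membership tests on a string could look as follows:

```python
my_str = "Welcome to python strings !"

has_python = "python" in my_str       # substring is present
has_java = "java" in my_str           # substring is absent
lacks_java = "java" not in my_str     # the opposite test

print(has_python, has_java, lacks_java)
```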

The endswith(), isdigit(), isalpha(), islower(), isupper(), and capitalize() methods

The endswith() method checks whether the given string ends with a specific character or word that we pass as an argument:

The isdigit() method checks whether all the characters in the given string are digits:

The isalpha() method checks whether all the characters in the given string are alphabetic:

The islower() method checks whether all the letters in the string are lowercase, while the isupper() method checks whether they are all uppercase. The capitalize() method puts a given string into sentence case, that is, it capitalizes the first character and lowercases the rest:
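These checks can be sketched with a few made-up strings:

```python
ends_pdf = "report.pdf".endswith(".pdf")

all_digits = "12345".isdigit()
mixed_digits = "12a45".isdigit()     # one non-digit makes it False

all_alpha = "Python".isalpha()
alpha_with_space = "Python 3".isalpha()   # space and digit are not letters

is_lower = "hello".islower()
is_upper = "HELLO".isupper()

sentence = "hello WORLD".capitalize()     # first letter up, rest down

print(ends_pdf, all_digits, mixed_digits, all_alpha,
      alpha_with_space, is_lower, is_upper, sentence)
```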

List types

Python's core language does not have a dedicated array type like C or Java (although the standard library provides an array module), but instead offers the list data type. Python lists also fall under the category of sequences and offer a wide range of functionalities. Coming from a Java, C, or C++ background, you are likely to find that Python lists are slightly different from the arrays and list types offered by these languages. In C, C++, or Java, an array is a collection of elements of similar data types, and this is also the case for Java array lists. This is different in the case of Python. In Python, a list is a collection of elements that can be of either homogeneous or heterogeneous data types. This is one of the features that makes Python lists powerful, robust, and easy to use. We also don't need to specify the size of a Python list when declaring it. It can grow dynamically to match the number of elements it contains. Let's see a basic example of using lists:

Lists in Python start from index 0 and any item can be accessed on the basis of indices, as shown in the preceding screenshot. The preceding list is homogeneous, as all the elements are of string type. We can also have a heterogeneous list, as follows:
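Since the original listings are not reproduced here, the following sketch uses assumed list contents to show a homogeneous and a heterogeneous list:

```python
# Homogeneous list: every element is a string (illustrative values)
names = ["alice", "bob", "carol"]
first_name = names[0]              # lists are indexed from 0

# Heterogeneous list: int, str, float, and a nested list
mixed = [1, "two", 3.0, [4, 5]]
kinds = [type(item).__name__ for item in mixed]

print(first_name)
print(kinds)
```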

For now, we are printing the list elements manually. We can very easily iterate over them with loops instead, and we will explore that later on. For now, let's try to understand which operations can be performed on list structures in Python.

Slicing the lists

Slicing is an operation that allows us to extract elements from sequences such as lists. We can slice lists to extract portions that we might be interested in. It must be noted again that slice indexes are 0-based and that the slice stops at index n-1, where n is the specified end index. To slice the first five and last five elements from the list, we can perform the following operation:

Let's see some examples of list slicing and their results:

To get the list from index 4 onwards, do the following:

To get the list elements from the start up to index 4, do the following:

To print the whole list with slicing, do the following:

To print the list elements with a step size of 2, do the following:

To print the reverse of the list, do the following:

To print a portion of the list in reverse order, do the following:
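The list slicing cases above can be sketched together. The ten-element list is an assumption made for illustration:

```python
my_list = list(range(1, 11))     # [1, 2, ..., 10], illustrative data

first_five = my_list[:5]
last_five = my_list[-5:]
from_four = my_list[4:]          # index 4 onwards
up_to_four = my_list[:4]         # start up to, but not including, index 4
whole = my_list[:]               # a new list with the same elements
every_second = my_list[::2]      # step size of 2
reversed_list = my_list[::-1]    # the whole list reversed
part_reversed = my_list[4::-1]   # from index 4 back to the start

print(first_five, last_five, every_second, part_reversed)
```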

Adding new elements to a list with append(): The append() method is used to add a single element to the end of the list, and the element to be added is given as an argument to the append() method. This element can be of any type. As well as being a number or a string, the element can be a list in itself:

We can see in the preceding example that we added three elements, 6, 7, and 8, to our original list using the append() method. Then, we actually added another list containing three characters that would be stored intact as a list inside the original list. These can be accessed by specifying the my_list[8] index. In the preceding example, the new list is added intact to the original list, but is not merged.
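The original listing is not shown, so the sketch below assumes a five-element starting list, which is consistent with the nested list ending up at index 8:

```python
my_list = [1, 2, 3, 4, 5]        # assumed starting contents

my_list.append(6)
my_list.append(7)
my_list.append(8)

# Appending a list stores it intact as a single nested element
my_list.append(["a", "b", "c"])

nested = my_list[8]
print(my_list)
print(nested, len(my_list))
```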

Merging and updating lists

List merging can be done in two ways in Python. First, we can use the traditional + operator, which we used previously to concatenate two strings. It does the same when used on list objects and returns a new list. The other way to achieve this would be by using the extend method, which takes the new list as an argument and merges its elements into the existing list in place. This is shown in the following example:

To update an element in the list, we can access it by its index and assign the updated value to it. For example, if we want to have the string Hello as the 0th element of the list, this can be achieved by assigning the Hello value to the 0th element, as in merged[0]="Hello":
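Both merging approaches and the update can be sketched as follows. The list1 and list2 names and their contents are assumptions, while merged mirrors the name used in the text:

```python
list1 = [1, 2, 3]
list2 = [4, 5, 6]

merged = list1 + list2       # + builds a new list; list1 is unchanged
list1_after_plus = list(list1)

list1.extend(list2)          # extend() modifies list1 in place

merged[0] = "Hello"          # update the element at index 0

print(merged)
print(list1)
```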

Copying lists

We have seen that Python variables are nothing but references to actual objects. The same holds true for lists. For this reason, manipulating lists gets a little tricky. By default, if we copy one list variable to another one by simply using the = operator, it won't actually create a duplicate or local copy of the list for that variable. Instead, it just creates another reference and points the newly created reference toward the same memory location. Thus, when we make a change through the copied variable, the same change will be reflected in the original list. The following example demonstrates this shared-reference behavior, where a change made through the copied variable is also reflected in the original list:

Now, let's look at how we can create a new copy of an existing list so that the changes to the new one do not cause any changes to the existing one:

Another way to create an isolated copy of the original list is to make use of the copy and deepcopy functions that are available in Python's copy module. A shallow copy constructs a new list object and then inserts into it references to the objects found in the original list. A deep copy, on the other hand, constructs a new compound object and then recursively inserts into it copies of the objects found in the original list:
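The differences between these approaches can be sketched with a list that contains a nested list. The data is illustrative, and the use of a full slice for the first isolated copy is an assumption, as the original listing is not shown:

```python
import copy

original = [1, 2, [3, 4]]

alias = original                  # just another reference
alias[0] = 100                    # also visible through original

sliced = original[:]              # new outer list (one common approach)
sliced[1] = 200                   # original[1] is unaffected

shallow = copy.copy(original)     # new outer list, shared nested list
shallow[2].append(5)              # visible through original[2]

deep = copy.deepcopy(original)    # nested list is copied as well
deep[2].append(6)                 # original[2] is unaffected

print(original)
print(sliced, shallow, deep)
```

A shallow copy is usually enough for flat lists, while deepcopy is the safer choice once a list holds other mutable objects.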

Removing elements from lists

We can use the del statement to delete either an element from the list or the whole list. The del statement does not return anything. We can also use the pop method to remove elements from the list. The pop method takes the index of the element that we wish to remove as an argument and returns the removed element; if no index is given, it removes the last element:

The entire list structure can be deleted as follows:
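The removal operations can be sketched as follows, using an assumed list of numbers. The snapshot variables exist only to record the state before the list itself is deleted:

```python
my_list = [10, 20, 30, 40, 50]

del my_list[0]                # remove the element at index 0
after_del = list(my_list)     # snapshot of the list at this point

removed = my_list.pop(1)      # remove and return the element at index 1
last = my_list.pop()          # with no argument, pop() removes the last element
remaining = list(my_list)

del my_list                   # delete the whole list; the name is gone
try:
    my_list
    list_deleted = False
except NameError:
    list_deleted = True

print(after_del, removed, last, remaining, list_deleted)
```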

Replication, len(), max(), and min()

The multiplication operator *, when applied to lists, causes a replication effect of the list elements. The contents of the list are repeated as many times as indicated by the number passed to the replication operator:

The built-in len() function gives the length of a Python list. The max() function returns the maximum element of the list, while the min() function returns the minimum element of the list:

We can use the max and min functions on lists of characters as well, but we cannot use them on a list that mixes numbers and strings. If we do this in Python 3, we will get a TypeError exception stating that the comparison between these types is not supported:
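These operations can be sketched with small illustrative lists, including the failing case for mixed types:

```python
nums = [3, 1, 2]

doubled = nums * 2              # replication repeats the elements
length = len(nums)
largest = max(nums)
smallest = min(nums)

top_char = max(["a", "z", "m"]) # characters compare alphabetically

try:
    max([1, "a", 2])            # int and str cannot be compared in Python 3
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(doubled, length, largest, smallest, top_char, mixed_ok)
```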

in and not in

The in and not in operators are essential Python operations that can be used against any sequence type. We saw how these were used previously with strings, where we used them to search for a string or character within the target string. The in operator returns True if the search is successful and False if not. The opposite is the case for the not in operator. The execution is shown as follows:

Tuples in Python

A Python tuple is very similar to a Python list. The difference is that it's a read-only structure, so once it is declared, no modification can be made to the elements of the tuple. Python tuples can be used as follows:

In the preceding code, we can see that we can access tuples in the same way as we can access lists, but when we try to change any element of the tuple, it raises an exception, as a tuple is a read-only structure. If we perform the read-only operations that we performed on lists, such as indexing and slicing, we will see that they work in exactly the same way on tuples:

If a tuple has only one element in it, it has to be declared with a trailing comma. If we do not add that comma while declaring it, it will be interpreted as a numeric or string data type, depending on the elements of the tuple. The following example explains this better:

A tuple can be converted into a list and can then be operated on as follows:
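The tuple behavior described in this section can be sketched as follows. The tuple contents are assumptions made for illustration:

```python
my_tuple = (1, "two", 3.0)

first = my_tuple[0]           # indexing works as with lists
tail = my_tuple[1:]           # slicing returns a new tuple

try:
    my_tuple[0] = 100         # tuples are read-only
    assignment_ok = True
except TypeError:
    assignment_ok = False

not_a_tuple = (5)             # without the comma this is just an int
single = (5,)                 # the trailing comma makes a one-element tuple

as_list = list(my_tuple)      # convert to a list to modify the contents
as_list.append(4)

print(first, tail, assignment_ok)
print(type(not_a_tuple).__name__, type(single).__name__, as_list)
```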

Dictionaries in Python

Dictionaries are very powerful structures and are widely used in Python. A dictionary is a key-value pair structure. A dictionary key can be any hashable object, most commonly a number or a string, and the value can be any Python object. Dictionaries are mutable and can be changed in place. The following example demonstrates the basics of dictionaries in Python:

A Python dictionary can be declared within curly braces. Each key is separated from its value by a colon, and the key-value pairs are separated by commas. It should be noted that the keys have to be unique; if we try to repeat a key, the old value is overwritten by the new one. From the preceding example, we can establish that the dictionary keys can be either string or numeric types. Let's try to perform various operations on dictionaries in Python:

Retrieving the dictionary values with the keys: Dictionary values can be accessed through the name of the dictionary key. If the name of the key is not known, we can use loops to iterate through the whole dictionary structure. We will cover this in the next chapter of this book:

This is one of the many ways to print dictionary values. However, if the key whose value we wish to print does not exist in the dictionary, we will get a KeyError exception, as shown in the following screenshot:

There is a better way to handle this and avoid these kinds of exceptions. We can use the get() method provided by the dictionary class. The get() method takes the key name as the first argument and the default value if the key is not present as the second argument. Then, instead of throwing an exception, the default value will be returned if the key is not found. This is shown in the following screenshot:

In the preceding example, when the k1 key is present in the actual dictionary, dict1, the value for the k1 key is returned, which is v1. Then, the k0 key was searched for, which was not present. In that case, no exception was raised, but instead the False value was returned, suggesting that no such key, k0, was actually present. Remember that we can specify any placeholder as the second argument to the get() method to indicate the absence of the key we are searching for.
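A minimal sketch of these lookups is shown below. The dict1, k1, k0, and v1 names come from the text, while the remaining entries are assumptions:

```python
# String and numeric keys; the repeated "k2" keeps only the last value
dict1 = {"k1": "v1", "k2": "old", 3: "v3", "k2": "v2"}

value = dict1["k1"]                 # direct lookup by key

try:
    dict1["k0"]                     # a missing key raises KeyError
    missing_raised = False
except KeyError:
    missing_raised = True

present = dict1.get("k1", False)    # key exists: its value is returned
absent = dict1.get("k0", False)     # key missing: the default is returned

print(value, missing_raised, present, absent, dict1["k2"])
```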

Adding keys and values to the dictionary: Once a dictionary has been declared, over the course of the code, there could be many occasions in which we want to modify a dictionary key or add a new dictionary key and value. This can be achieved as follows. As mentioned earlier, a dictionary value can be any Python object, so we can have tuples, lists, and dictionary types as values inside a dictionary:

Now, let's add more complex types as values:

These values can be retrieved as normal values by their keys as follows:
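Adding, modifying, and retrieving entries can be sketched as follows. All key names and values here are illustrative assumptions:

```python
dict1 = {"k1": "v1"}

dict1["k1"] = "v1_updated"            # modify an existing key
dict1["k2"] = "v2"                    # add a new key

# More complex types as values
dict1["coords"] = (1, 2)              # tuple
dict1["ports"] = [22, 80, 443]        # list
dict1["nested"] = {"inner": "value"}  # another dictionary

second_port = dict1["ports"][1]
inner = dict1["nested"]["inner"]

print(dict1)
print(second_port, inner)
```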

Expanding a dictionary with the contents of another dictionary: In the preceding example, we added a dictionary as a value to an existing dictionary. We will now see how we can merge two dictionaries into one common or new dictionary. The update() method can be used to do this:
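A short sketch of update() is shown below, with made-up dictionaries. Copying first is one way to obtain a new merged dictionary without changing the originals:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

# Merge into a new dictionary, leaving d1 and d2 untouched
merged = dict(d1)
merged.update(d2)      # values from d2 win for keys present in both

# Merge in place: d1 itself now contains the entries of d2
d1.update(d2)

print(merged)
print(d1)
```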

keys(): To get all the dictionary keys, we can use the keys() method. This returns a view object over the dictionary keys:

We can see that the keys method returns an instance of the dict_keys class, which provides a view of the dictionary keys. We can convert this into a list type as follows:

values(): The values() method returns all the values that are present in the dictionary:

items(): This method is typically used to iterate over the dictionary key-value pairs, as it returns a dict_items view object that contains a sequence of tuples. Each tuple has two entries, the first one being the key and the second one being the value:

We can convert the returned view object into a tuple or a list as well. The ideal way to use it, however, is to iterate over the items, which we will see later when we look at loops:

in and not in: The in and not in operators are used to see whether a key is present in the dictionary or not. By default, the in and not in clauses will search the dictionary keys, not the values. Take a look at the following example:
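The keys(), values(), and items() methods and the membership operators can be sketched together as follows, using an assumed two-entry dictionary:

```python
dict1 = {"k1": "v1", "k2": "v2"}

keys_type = type(dict1.keys()).__name__    # name of the view class
key_list = list(dict1.keys())              # convert the view to a list
value_list = list(dict1.values())
item_list = list(dict1.items())            # list of (key, value) tuples

key_present = "k1" in dict1                # searches the keys
value_as_key = "v1" in dict1               # values are not searched
value_present = "v1" in dict1.values()     # search the values explicitly
key_absent = "k0" not in dict1

print(keys_type, key_list, value_list, item_list)
print(key_present, value_as_key, value_present, key_absent)
```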

Order of storing: Before Python 3.7, dictionaries are not guaranteed to preserve order, which means that they may not be stored internally in the same order as we define them. From Python 3.7 onward, dictionaries are guaranteed to preserve insertion order, so the behavior described here only applies to older interpreters. The reason for the unordered behavior is that dictionaries are stored in dynamic tables called hash tables. As these tables are dynamic, they can grow and shrink in size. What happens internally is that a hash value of the key is computed, and this determines where the entry is stored in the table. The key goes in the first column, while the second column holds the actual value. Let's take a look at the following example to explain this better:

In the preceding case, we declare a dictionary, a, with the first key as abc and the second key as abcd. When we print the values on such an older interpreter, however, we can see that abcd is stored internally before abc. To explain this, let's assume that the dynamic table or hash table in which the dictionary is internally stored is of size 8.

As we mentioned earlier, the position of each key is derived from its hash value. When we compute the hash of the abc string and take it modulo 8, which is the table size, we get the result of 7 in this example. If we do the same for abcd, we get a result of 4. The exact numbers vary between runs, because Python 3 randomizes string hashes for each process unless the PYTHONHASHSEED environment variable is set. This means that the abcd entry will be stored at index 4, while the abc entry will be stored at index 7. For this reason, in the listing, we get abcd listed before abc:

There may be occasions in which two keys arrive at a common value after the hash(key)%table_size operation, which is called a collision. In this case, the key that is slotted first keeps that slot, and the later key has to be placed in a different free slot.
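The slot calculation described above can be sketched as follows. The table size of 8 is the assumption made in the text, and the computed slots are not fixed values, as string hashes differ between runs of Python 3. The last lines show that a dictionary on Python 3.7 or later keeps insertion order regardless of these slots:

```python
table_size = 8   # assumed table size, as in the text

# Slot that each key would map to in a table of this size
slots = {key: hash(key) % table_size for key in ("abc", "abcd")}
for key, slot in slots.items():
    print(key, "->", slot)

# Two keys collide when they map to the same slot
collision = slots["abc"] == slots["abcd"]
print("collision:", collision)

# On Python 3.7+, iteration follows insertion order
a = {"abc": 1, "abcd": 2}
order = list(a)
print(order)
```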

sorted(): If we want the dictionary entries to be sorted according to the keys, we can use the built-in sorted function. This can be used to return a list of tuples, with each tuple having a key at the 0th index and its value at the 1st index:
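A minimal sketch of sorting by key, using an assumed dictionary, could look as follows:

```python
scores = {"b": 2, "a": 1, "c": 3}     # illustrative data

sorted_keys = sorted(scores)          # sorting a dict sorts its keys
sorted_items = sorted(scores.items()) # list of (key, value) tuples by key

print(sorted_keys)
print(sorted_items)
```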

Removing elements: We can use the conventional del statement to delete any dictionary item. When we say delete, we mean delete both the key and the value. Dictionary items work in pairs, so deleting the key would remove the value as well. Another way to delete an entry is to use the pop() method and pass the key as an argument. This is shown in the following code snippet:
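Both removal approaches can be sketched as follows, again with an assumed dictionary:

```python
scores = {"a": 1, "b": 2, "c": 3}   # illustrative data

del scores["a"]                # removes the key and its value
removed = scores.pop("b")      # removes the entry and returns its value

print(removed)
print(scores)
```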

Hands-On Penetration Testing with Python

By : Furqan Khan
