Learning Python Networking - Second Edition

By: José Manuel Ortega, Dr. M. O. Faruque Sarker, Sam Washington

Overview of this book

Network programming has always been a demanding task. With full-featured and well-documented libraries all the way up the stack, Python makes network programming the enjoyable experience it should be. Starting with a walkthrough of today's major networking protocols, this book teaches you how to employ Python for network programming, how to request and retrieve web resources, and how to extract data in major formats over the web. You will use Python for emailing over different protocols, and you'll interact with remote systems and with IP and DNS networking. You will cover the connection and configuration of networking devices using Python 3.7, along with cloud-based network management tasks. As the book progresses, socket programming will be covered, followed by how to design servers, and the pros and cons of multithreaded and event-driven architectures. You'll develop practical client-side applications, including web API clients, email clients, SSH, and FTP. These applications will also be implemented through existing web application frameworks.

Table of Contents (19 chapters)

  • Section 1: Introduction to Network and HTTP Programming
  • Section 2: Interacting with APIs, Web Scraping, and Server Scripting
  • Section 3: IP Address Manipulation and Network Automation
  • Section 4: Sockets and Server Programming

Protocol concepts and the problems that protocols solve

This section explains concepts regarding IP addresses and ports, network interfaces in a local machine, and other concepts related to protocols, such as Dynamic Host Configuration Protocol (DHCP) and DNS.

IP addresses and ports

An IP address uniquely identifies a device on a network such as the internet. A port is a numbered endpoint for communication within an operating system.

When you connect to the internet, your device is assigned a public IP address, and each website you visit also has a public IP address. So far, we have used IPv4 as an addressing system. The main problem with this is that the internet is running out of IPv4 public address space and so it is necessary to introduce IPv6, which provides a larger address space.

The total sizes of the IPv4 and IPv6 address spaces are as follows:

  • Total IPv4 space: 4,294,967,296 addresses
  • Total IPv6 space: 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses
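
These totals follow from the address widths: 32 bits for IPv4 and 128 bits for IPv6. As a quick sanity check, the following sketch uses Python's standard ipaddress module to count the addresses in the two "everything" networks:

```python
import ipaddress

# The whole IPv4 and IPv6 address spaces expressed as networks.
ipv4_space = ipaddress.ip_network("0.0.0.0/0")
ipv6_space = ipaddress.ip_network("::/0")

# num_addresses counts every address in the network: 2**32 and 2**128.
print(f"{ipv4_space.num_addresses:,}")
print(f"{ipv6_space.num_addresses:,}")
```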

Ports are numerical values (between 0 and 65,535) that are used to identify the processes that are communicating. At each end, each process that takes part in the communication uses a single port to send and receive data.

Taken together, two pairs of IP address and port identify the two processes at either end of a connection in a TCP/IP network. A system might be running thousands of services, but to uniquely identify a service on a system, the application requires a port number.
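
To make the idea of two address and port pairs concrete, here is a minimal sketch that opens a TCP connection over the local loopback address and reports the pair seen at each end. The function name is illustrative, and the port is chosen by the operating system, so the numbers will differ on every run:

```python
import socket

def loopback_connection_endpoints():
    """Open a TCP connection over loopback and return both address pairs."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as client:
            server_side, _ = listener.accept()
            with server_side:
                return {
                    # Each entry is ((local IP, local port), (remote IP, remote port)).
                    "client": (client.getsockname(), client.getpeername()),
                    "server": (server_side.getsockname(), server_side.getpeername()),
                }

endpoints = loopback_connection_endpoints()
print(endpoints)
```

The client's local pair is the server's remote pair and vice versa, which is what lets the operating system tell this connection apart from any other.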

Port numbers sometimes appear in web and other URLs as well. By default, HTTP uses port 80 and HTTPS uses port 443, but a URL like http://www.domain.com:8080/path/ specifies that the web browser, instead of using the default port 80, connects to port 8080 of the HTTP server.
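
The following sketch shows one way to work out which port a client would use for a URL, using urllib.parse from the standard library. The DEFAULT_PORTS table and the effective_port helper are illustrative names that cover only the two schemes mentioned above:

```python
from urllib.parse import urlsplit

# Default ports for the two schemes mentioned above.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not name a port explicitly.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

print(effective_port("http://www.domain.com:8080/path/"))  # 8080
print(effective_port("https://www.domain.com/path/"))      # 443
```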

Some common ports are as follows:

  • 22: Secure Shell (SSH)
  • 23: Telnet remote login service
  • 25: SMTP
  • 53: Domain Name System (DNS) service
  • 80: HTTP
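
Most operating systems keep a services database that maps service names to these port numbers, and Python exposes it through socket.getservbyname. The sketch below assumes the conventional service names (for example, "domain" for DNS) and falls back to a hard-coded copy of the preceding list when the database is unavailable:

```python
import socket

# The well-known ports listed above, keyed by their usual service names.
COMMON_PORTS = {"ssh": 22, "telnet": 23, "smtp": 25, "domain": 53, "http": 80}

def lookup_port(service):
    """Ask the OS services database for a TCP port, else use the table."""
    try:
        return socket.getservbyname(service, "tcp")
    except OSError:
        # The services database may be missing, for example in minimal containers.
        return COMMON_PORTS[service]

for name in COMMON_PORTS:
    print(name, lookup_port(name))
```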

Regarding IP addresses, we can differentiate two types, depending on whether they belong to a public range or to a private range reserved for the internal network of an organization:

  • Private IP address: Ranges from 192.168.0.0 to 192.168.255.255, 172.16.0.0 to 172.31.255.255, or 10.0.0.0 to 10.255.255.255
  • Public IP address: A public IP address is an IP address that your home or business router receives from your Internet Service Provider (ISP)
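
As an illustration, the three private ranges can be written in CIDR notation and checked with the standard ipaddress module. The in_private_range helper is a hypothetical name used only for this sketch:

```python
import ipaddress

# The three private IPv4 ranges listed above, in CIDR notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def in_private_range(address):
    """Return True if the IPv4 address falls in one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)

print(in_private_range("192.168.1.10"))
print(in_private_range("8.8.8.8"))
```

The ipaddress module also offers an is_private attribute on address objects, but it covers additional reserved ranges (such as loopback), so the explicit list is used here to match the text.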

Network interfaces

You can find out what IP addresses have been assigned to your computer by running ip addr in a Terminal on Linux systems, or ipconfig /all on Windows systems.

If we run one of these commands, we will see that the IP addresses are assigned to our device's network interfaces. On Linux, these will have names, such as eth0; on Windows, these will have phrases, such as Ethernet adapter Local Area Connection.
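
The interface names can also be listed from Python. This is a minimal sketch using socket.if_nameindex, which is available on Unix-like systems and, in recent Python versions, on Windows; it returns names only, not the addresses assigned to them:

```python
import socket

def interface_names():
    """Return the names of the network interfaces known to the OS."""
    # if_nameindex() yields (index, name) pairs.
    return [name for _index, name in socket.if_nameindex()]

print(interface_names())
```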

You will get the following output when you run the ip addr command on Linux:

You will get the following options when you run the ipconfig command on Windows:

You will get IP addresses for the interfaces in your local machine when you run the ip addr command:

Every device has a virtual interface called the loopback interface, which you can see in the preceding listing as interface 1. This interface doesn't actually connect to anything outside the device, and only the device itself can communicate with it. While this may sound a little redundant, it's actually very useful when it comes to local network application testing, and it can also be used as a means of inter-process communication. The loopback interface is often referred to as localhost, and it is almost always assigned the IP address 127.0.0.1.
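
The following sketch illustrates using the loopback interface for local testing: it sends a message to a listener on 127.0.0.1 and reads it back, without any traffic leaving the machine. The function name is illustrative, and both ends run in one process here only to keep the example short:

```python
import socket

def loopback_round_trip(message):
    """Send bytes over a loopback TCP connection and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        with socket.create_connection(listener.getsockname()) as sender:
            receiver, _ = listener.accept()
            with receiver:
                sender.sendall(message)
                sender.shutdown(socket.SHUT_WR)  # signal that no more data follows
                chunks = []
                while True:
                    chunk = receiver.recv(4096)
                    if not chunk:  # empty read: the sender has finished
                        break
                    chunks.append(chunk)
                return b"".join(chunks)

print(loopback_round_trip(b"hello over loopback"))
```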

UDP versus TCP

The main difference between TCP and UDP is that TCP is connection-oriented: once the connection is established, data can be transmitted in both directions. UDP is a simpler protocol that does not need connections.

Now, we have to analyze the differences according to certain features:

  • Differences in data transfer: TCP ensures the ordered and reliable delivery of a stream of data from the client to the server and vice versa. UDP does not establish point-to-point connections and does not verify that the receiver is available.
  • Reliability: TCP is more reliable because the receiver acknowledges the data it receives and the sender retransmits any packets that have been lost. UDP does not verify that the data was delivered, because it has no mechanism for acknowledging or retransmitting packets.
  • Connection: TCP is a connection-oriented protocol that provides congestion control and reliable delivery of segments, while UDP is a connectionless protocol that's designed for the rapid exchange of packets without the need to know whether the packets are arriving correctly.
  • Transfer method: TCP treats data as a continuous byte stream, which is transmitted in segments. UDP messages are datagrams that are sent individually, and each one carries a checksum so that its integrity can be verified upon arrival.
  • How TCP and UDP work: A TCP connection is established through a handshake that starts and verifies the connection. Once the connection has been established, it is possible to start the data transfer, and once the transfer is complete, the connection is terminated by closing the established virtual circuit. UDP provides an unreliable service: the data may arrive out of order, duplicated, or with some packets missing, and neither the sender nor the receiver is notified. UDP assumes that error checking and correction are not necessary at this level, which keeps the protocol overhead low.
  • TCP and UDP applications: TCP is used mainly when data must arrive complete and in order, for example, in HTTP, SMTP, and SSH, while UDP is mainly used in applications based on small requests from a large number of clients, for example, DNS and Voice over IP (VoIP).
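
The connectionless nature of UDP can be seen in a few lines of code. In this minimal sketch (the function name is illustrative), the sender never connects: it simply addresses each datagram to the receiver's IP address and port. A timeout is set on the receiver because UDP gives no delivery guarantee, even though loss over loopback is unlikely:

```python
import socket

def udp_loopback_demo(messages, timeout=2.0):
    """Send datagrams over loopback without establishing any connection."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(timeout)     # do not wait forever for lost data
        for message in messages:
            # No handshake: each datagram is addressed and sent on its own.
            sender.sendto(message, receiver.getsockname())
        received = []
        for _ in messages:
            data, _sender_address = receiver.recvfrom(4096)
            received.append(data)  # one recvfrom() call returns one datagram
        return received

print(udp_loopback_demo([b"first", b"second"]))
```

With TCP, the same two writes could arrive merged into a single read, because TCP delivers a byte stream rather than individual messages.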

DHCP

IP addresses can be assigned to a device by a network administrator in one of two ways: statically, where the device's operating system is manually configured with the IP address, or dynamically, where the device's operating system is configured using DHCP.

When using DHCP, as soon as the device first connects to a network, it is automatically allocated an address by a DHCP server from a predefined pool. Some network devices, such as home broadband routers, provide a DHCP server service out of the box; otherwise, a DHCP server must be set up by a network administrator. DHCP is widely deployed, and it is particularly useful for networks where different devices may frequently connect and disconnect, such as public Wi-Fi hotspots or mobile networks.

DHCP environments require a DHCP server that's been configured with the appropriate parameters for the proposed network. The main DHCP parameters include the range or pool of available IP addresses, the correct subnet masks, and the gateway and name server addresses.

A DHCP server allocates IP addresses dynamically, so devices do not have to depend on static IP addresses, and it is responsible for assigning, leasing, reallocating, and renewing those addresses. The protocol will assign an address that is available in a subnet or pool. This means that a new device can be added to a network without you having to manually assign it a unique IP address. DHCP can also combine static and dynamic IPs, and it determines how long an IP address is assigned to a device.
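
The pool and lease bookkeeping described above can be sketched with a toy class. This is not a DHCP implementation: the LeasePool name, its methods, and the use of plain numbers for time are all assumptions made for illustration.

```python
import ipaddress

class LeasePool:
    """Toy address pool with timed leases; not a real DHCP server."""

    def __init__(self, first, last, lease_seconds):
        start = int(ipaddress.ip_address(first))
        end = int(ipaddress.ip_address(last))
        self.free = [ipaddress.ip_address(n) for n in range(start, end + 1)]
        self.lease_seconds = lease_seconds
        self.leases = {}  # client MAC -> (address, expiry time)

    def expire(self, now):
        """Return addresses whose lease has run out to the free list."""
        for mac, (address, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[mac]
                self.free.append(address)

    def request(self, mac, now):
        """Assign or renew an address for a client at time `now` (seconds)."""
        self.expire(now)
        if mac in self.leases:
            address, _old_expiry = self.leases[mac]  # renewal keeps the address
        elif self.free:
            address = self.free.pop(0)
        else:
            return None  # pool exhausted
        self.leases[mac] = (address, now + self.lease_seconds)
        return address

pool = LeasePool("192.168.1.100", "192.168.1.101", lease_seconds=60)
print(pool.request("aa:aa:aa:aa:aa:aa", now=0))
```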

When a computer in a network wants to obtain a valid network configuration, usually when starting up the machine, it issues a DHCP Discover request. When this request—which is made through a UDP broadcast packet—reaches a DHCP server, a negotiation is established whereby the server grants the use of an IP, and other network parameters, to the client for a certain time.
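
To give an idea of what a DHCP Discover request contains, the following sketch builds the bytes of a minimal message with the struct module. It only constructs the packet and sends nothing; a real client broadcasts it over UDP from port 68 to port 67, which usually needs elevated privileges, and normally includes more options than the two shown. The function name and the example MAC address are illustrative.

```python
import struct

DHCP_MAGIC_COOKIE = bytes([0x63, 0x82, 0x53, 0x63])

def build_dhcp_discover(mac, xid):
    """Build the bytes of a minimal DHCP Discover message (nothing is sent)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,        # op: request from client to server
        1,        # htype: Ethernet
        6,        # hlen: length of a MAC address in bytes
        0,        # hops
        xid,      # transaction ID chosen by the client
        0,        # secs
        0x8000,   # flags: ask the server to reply by broadcast
        b"", b"", b"", b"",  # ciaddr, yiaddr, siaddr, giaddr: zero, no IP yet
        mac,      # chaddr: client MAC, zero-padded to 16 bytes by struct
        b"",      # sname: unused
        b"",      # file: unused
    )
    # Option 53 (message type), length 1, value 1 = Discover; 255 ends the options.
    options = bytes([53, 1, 1, 255])
    return header + DHCP_MAGIC_COOKIE + options

packet = build_dhcp_discover(bytes.fromhex("001122334455"), xid=0x12345678)
print(len(packet))
```

Note that every IP address field is zero: the client can issue the request precisely because it does not need a configured address to do so.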

It is important to take note of the following:

  • The client does not need to have the network interface configured to issue a DHCP Discover request.
  • The DHCP server can be on the same subnet as the client or on a different one. Because a client without network configuration cannot reach other subnets itself, in the latter case the request has to be forwarded on its behalf (typically by a DHCP relay agent).
  • When the DHCP server receives the DHCP Discover request, it obtains the MAC address of the client, which may affect the IP address assigned to the client.
  • The DHCP server grants network configuration to the client for a certain time, known as a lease. Before the lease expires, the client may try to renew it. If the lease expires without being renewed, the client must stop using the network configuration.

To make a DHCP request, you can use a client such as dhclient (native GNU/Linux) or the ipconfig /renew command (in the case of Windows). When a network configuration is obtained, the client uses it:

DNS

DNS allows for the association of domain names with IP addresses, which greatly facilitates access to the machines on the network. Without DNS, referring to a machine means remembering its IP address. Working directly with IP addresses is inconvenient, because they are difficult to remember and because the IP address of a host can change for different reasons. Whoever uses the domain name does not need to worry about these changes (although the DNS server must know the real IP in each case).

The domain name system is a distributed and hierarchical database, and although its main function is to associate domain names with IP addresses, it can also store other information. The DNS service is one of the pillars of the network, so it must be highly available. To achieve this, redundant servers are used, along with extensive caching to improve performance.
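
The hierarchy is visible in the names themselves, which are read from right to left, starting at the root. The following sketch (name_hierarchy is a hypothetical helper) lists the successive names from the root down to a full hostname:

```python
def name_hierarchy(hostname):
    """List the names from the DNS root down to the full hostname."""
    labels = hostname.rstrip(".").split(".")
    chain = ["."]  # the DNS root
    # Walk the labels from right (top-level domain) to left (host).
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain

print(name_hierarchy("www.packtpub.com"))
# ['.', 'com.', 'packtpub.com.', 'www.packtpub.com.']
```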

The nslookup tool comes with most Linux and Windows systems and lets us query DNS on the command line, as follows:

We can use this command to request the IP address for the packtpub.com domain:

With this command, we determined that the packtpub.com host has the IP address 83.166.169.231. DNS distributes the work of looking up hostnames by using a hierarchical system of caching servers. Internet DNS services are a set of databases that are scattered on servers around the world. These databases indicate the IP address that is associated with the name of a website. When we enter an address in the browser, for example, packtpub.com, the computer asks the DNS servers of the internet provider to find the IP address associated with packtpub.com. If those servers do not have that information, they query other servers that may have it.
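
The same kind of lookup can be made from Python through the system resolver. This is a minimal sketch using socket.getaddrinfo; the resolve helper is an illustrative name. Resolving packtpub.com requires network access, and the address returned today may differ from the one quoted above, so the example uses localhost:

```python
import socket

def resolve(hostname):
    """Return the sorted IP addresses the system resolver reports for a name."""
    # Restricting proto avoids one duplicate entry per socket type.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first item is the IP address.
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))
```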

When we open our preferred browser and type a web address in its address bar to access the content that's hosted on the site, the DNS service translates that name into an IP address that can be understood and used by the equipment and systems that make up the internet.

On Windows computers, this system is configured by default to automatically use the DNS server of our internet service provider. We can choose an alternative DNS provider, such as OpenDNS, UltraDNS, or Google DNS, but we should always check that the provider we choose offers at least minimum security conditions for browsing. More information about configuration using Google DNS can be found at the following URL: https://developers.google.com/speed/public-dns/.