Hands-On Network Programming with C

By: Lewis Van Winkle

Overview of this book

Network programming enables processes to communicate with each other over a computer network, but it is a complex task that requires programming with multiple libraries and protocols. With its support for third-party libraries and structured documentation, C is an ideal language to write network programs. Complete with step-by-step explanations of essential concepts and practical examples, this C network programming book begins with the fundamentals of Internet Protocol, TCP, and UDP. You’ll explore client-server and peer-to-peer models for information sharing and connectivity with remote computers. The book will also cover HTTP and HTTPS for communicating between your browser and website, and delve into hostname resolution with DNS, which is crucial to the functioning of the modern web. As you advance, you’ll gain insights into asynchronous socket programming and streams, and explore debugging and error handling. Finally, you’ll study network monitoring and implement security best practices. By the end of this book, you’ll have experience of working with client-server applications and be able to implement new network programs in C. The code in this book is compatible with the older C99 version as well as the latest C18 and C++17 standards. You’ll work with robust, reliable, and secure code that is portable across operating systems, including Winsock sockets for Windows and POSIX sockets for Linux and macOS.

Internet routing


If every network contained at most two devices, there would be no need for routing. Computer A would simply send its data directly over the wire, and computer B, the only other device, would receive it:

The internet today has an estimated 20 billion devices connected. When you make a connection over the internet, your data is first transmitted to your local router. From there, it is transmitted to another router, which is connected to another router, and so on. Eventually, your data reaches a router that is connected to the receiving device, at which point the data has reached its destination:

Imagine that each router in the preceding diagram is connected to tens, hundreds, or even thousands of other routers and systems. It's an amazing feat that IP can discover the correct path and deliver traffic seamlessly.

Windows includes a utility, tracert, which lists the routers between your system and the destination system.

Here is an example of using the tracert command on Windows 10 to trace the route to example.com:

As you can see from the example, there are 11 hops between our system and the destination system (example.com, 93.184.216.34). The IP addresses are listed for many of these intermediate routers, but a few are missing and show the Request timed out message instead. This usually means that the system in question doesn't support the part of the Internet Control Message Protocol (ICMP) that tracert relies on. It's not unusual to see a few such systems when running tracert.

On Unix-based systems, the utility to trace routes is called traceroute. You would run it as traceroute example.com, for example, and the information obtained is essentially the same.

More information on tracert and traceroute can be found in Chapter 12, Network Monitoring and Security.

Sometimes, when IP packets are transferred between networks, their addresses must be translated. This is especially common when using IPv4. Let's look at the mechanism for this next.

Local networks and address translation

It's common for households and organizations to have small Local Area Networks (LANs). As mentioned previously, there are IPv4 address ranges reserved for use in these small local networks.

These reserved private ranges are as follows:

  • 10.0.0.0 to 10.255.255.255
  • 172.16.0.0 to 172.31.255.255
  • 192.168.0.0 to 192.168.255.255
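
The three ranges above can be tested with a mask and a comparison each. The following is a minimal sketch, not code from the book: the function names parse_ipv4 and is_private_ipv4 are made up for illustration, and the parser is deliberately simple (real programs would more likely use inet_pton).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse dotted-quad text such as "192.168.0.1"
   into a host-order 32-bit value.
   Returns 1 on success, 0 on malformed input. */
int parse_ipv4(const char *text, uint32_t *out) {
    unsigned a, b, c, d;
    char extra;
    /* Exactly four fields must match; a fifth means trailing junk. */
    if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return 0;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return 0;
    *out = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
    return 1;
}

/* Hypothetical helper: returns 1 if the address lies in one of the
   reserved private ranges, 0 if it does not, -1 if it cannot be parsed. */
int is_private_ipv4(const char *text) {
    uint32_t addr;
    if (!parse_ipv4(text, &addr))
        return -1;
    return (addr & 0xFF000000u) == 0x0A000000u   /* 10.0.0.0 to 10.255.255.255 */
        || (addr & 0xFFF00000u) == 0xAC100000u   /* 172.16.0.0 to 172.31.255.255 */
        || (addr & 0xFFFF0000u) == 0xC0A80000u;  /* 192.168.0.0 to 192.168.255.255 */
}
```

Calling is_private_ipv4("192.168.0.1") returns 1, while is_private_ipv4("93.184.216.34") returns 0, since that address lies outside all three ranges.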

When a packet originates from a device on an IPv4 local network, it must undergo Network Address Translation (NAT) before being routed on the internet. A router that implements NAT remembers which local address a connection is established from.

The devices on the same LAN can directly address one another by their local address. However, any traffic communicated to the internet must undergo address translation by the router. The router does this by modifying the source IP address from the original private LAN IP address to its public internet IP address:

Likewise, when the router receives the return communication, it must modify the destination address from its public IP to the private IP of the original sender. It knows the private IP address because it was stored in memory after the first outgoing packet:

Network address translation can be more complicated than it first appears. In addition to modifying the source IP address in the packet, the NAT router must also update the checksums in the packet. Otherwise, the packet would be detected as containing errors and discarded by the next router. The NAT router must also remember which private IP address sent the packet in order to route the reply. Without remembering this translation, the NAT router wouldn't know where to send the reply on the private network.
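
To make the checksum step concrete, here is a rough sketch of rewriting the source address in an IPv4 header and recomputing the header checksum. It assumes the standard IPv4 header layout (checksum at bytes 10 to 11, source address at bytes 12 to 15) and the one's-complement Internet checksum. The function names inet_checksum and nat_rewrite_source are invented for this illustration. A real NAT router would also have to fix the TCP or UDP checksum, which covers the IP addresses as well. That part is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes,
   treating the data as big-endian 16-bit words. */
uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                       /* odd trailing byte, padded with zero */
        sum += (uint32_t)data[i] << 8;
    while (sum >> 16)                  /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Replace the source address of an IPv4 header in place and
   recompute the header checksum. hdr must hold the full header. */
void nat_rewrite_source(uint8_t *hdr, const uint8_t new_src[4]) {
    size_t hdr_len = (size_t)(hdr[0] & 0x0F) * 4;  /* IHL field, in bytes */
    size_t i;
    uint16_t sum;

    for (i = 0; i < 4; i++)
        hdr[12 + i] = new_src[i];      /* source address field */

    hdr[10] = 0;                       /* checksum field is zero while summing */
    hdr[11] = 0;
    sum = inet_checksum(hdr, hdr_len);
    hdr[10] = (uint8_t)(sum >> 8);
    hdr[11] = (uint8_t)(sum & 0xFF);
}
```

A convenient property for checking the result: running inet_checksum over a header that already contains a valid checksum yields 0.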

NAT routers will also modify the packet data in some cases. For example, in the File Transfer Protocol (FTP), some connection information is sent as part of the packet's data. In these cases, the NAT router will look at the packet's data in order to know how to forward future incoming packets. IPv6 largely avoids the need for NAT, as it is possible (and common) for each device to have its own publicly routable address.

You may be wondering how a router knows whether a message is locally deliverable or whether it must be forwarded. This is done using a netmask (also called a subnet mask) or CIDR notation.

Subnetting and CIDR

IP addresses can be split into parts. The most significant bits are used to identify the network or subnetwork, and the least significant bits are used to identify the specific device on the network.

This is similar to how your home address can be split into parts. Your home address includes a house number, a street name, and a city. The city is analogous to the network part, the street name could be the subnetwork part, and your house number is the device part.

IPv4 traditionally uses a mask notation to identify the IP address parts. For example, consider a router on the 10.0.0.0 network with a subnet mask of 255.255.255.0. This router can take any incoming packet and perform a bitwise AND operation with the subnet mask to determine whether the packet belongs on the local subnet or needs to be forwarded on. For example, this router receives a packet to be delivered to 10.0.0.105. It does a bitwise AND operation on this address with the subnet mask of 255.255.255.0, which produces 10.0.0.0. That matches the subnet of the router, so the traffic is local. If, instead, we consider a packet destined for 10.0.15.22, the result of the bitwise AND with the subnet mask is 10.0.15.0. This address doesn't match the subnet the router is on, and so it must be forwarded.
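
The bitwise AND test described above fits in a few lines of C. This is only an illustrative sketch: the helper names ipv4 and on_local_subnet are made up here, and addresses are held as host-order 32-bit integers rather than in the socket API's address structures.

```c
#include <stdint.h>

/* Hypothetical helper: pack four octets into a host-order 32-bit address. */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Returns 1 if dest falls inside the subnet given by network and mask,
   0 if the packet has to be forwarded elsewhere. */
int on_local_subnet(uint32_t dest, uint32_t network, uint32_t mask) {
    /* Keep only the network bits of both addresses and compare them. */
    return (dest & mask) == (network & mask);
}
```

With the values from the text, on_local_subnet(ipv4(10, 0, 0, 105), ipv4(10, 0, 0, 0), ipv4(255, 255, 255, 0)) returns 1, so the traffic is local. The same call with ipv4(10, 0, 15, 22) as the destination returns 0, so that packet must be forwarded.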

IPv6 uses CIDR. Networks and subnetworks are specified using the CIDR notation we described earlier. For example, if the IPv6 subnet is /112, then the router knows that any address that matches on the first 112 bits is on the local subnet.
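
The same idea carries over to IPv6 prefixes. The sketch below compares the first prefix_len bits of two 128-bit addresses stored as 16-byte arrays in network order. The name ipv6_prefix_match is an assumption for this example, not a standard function.

```c
#include <stdint.h>

/* Returns 1 if addresses a and b agree on their first prefix_len bits,
   0 otherwise (including when prefix_len is out of range). */
int ipv6_prefix_match(const uint8_t a[16], const uint8_t b[16],
                      unsigned prefix_len) {
    unsigned full_bytes = prefix_len / 8;  /* bytes compared in full */
    unsigned rem_bits = prefix_len % 8;    /* leftover high bits of next byte */
    unsigned i;

    if (prefix_len > 128)
        return 0;

    for (i = 0; i < full_bytes; i++)
        if (a[i] != b[i])
            return 0;

    if (rem_bits != 0) {
        /* Mask that keeps only the top rem_bits bits of the byte. */
        uint8_t mask = (uint8_t)(0xFFu << (8 - rem_bits));
        if ((a[full_bytes] & mask) != (b[full_bytes] & mask))
            return 0;
    }
    return 1;
}
```

For a /112 subnet, two addresses that differ only in their last 16 bits match, so a router using this test would treat them as being on the same local subnet.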

So far, we've covered only routing with one sender and one receiver. While this is the most common situation, let's consider alternative cases too.

Multicast, broadcast, and anycast

When a packet is routed from one sender to one receiver, it uses unicast addressing. This is the simplest and most common type of addressing. All of the protocols we deal with in this book use unicast addressing.

Broadcast addressing allows a single sender to address a packet to all recipients simultaneously. It is typically used to deliver a packet to every receiver on an entire subnet.

If a broadcast is a one-to-all communication, then multicast is a one-to-many communication. Multicast involves some group management, and a message is addressed and delivered to members of a group.

Anycast-addressed packets are used to deliver a message to one recipient when you don't care who that recipient is. This is useful if you have several servers that provide the same functionality, and you simply want one of them (you don't care which) to handle your request.

IPv4 and lower network levels support local broadcast addressing. IPv4 provides some optional (but commonly implemented) support for multicasting. IPv6 mandates multicasting support while providing additional features over IPv4's multicasting. Though IPv6 does not provide broadcast addressing as such, its multicasting functionality can essentially emulate it.
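
As a rough illustration of how these IPv4 address types can be told apart, the sketch below classifies a destination address. It relies on details not spelled out in this section: IPv4 multicast addresses occupy 224.0.0.0 to 239.255.255.255, 255.255.255.255 is the limited broadcast address, and a subnet's broadcast address has all host bits set. The enum and function names are invented for this example, and addresses are host-order 32-bit integers.

```c
#include <stdint.h>

enum ipv4_kind { IPV4_UNICAST, IPV4_MULTICAST, IPV4_BROADCAST };

/* Classify dest as seen from a host on the subnet (network, mask). */
enum ipv4_kind classify_ipv4(uint32_t dest, uint32_t network, uint32_t mask) {
    uint32_t host_bits = (uint32_t)~mask;
    uint32_t subnet_broadcast = (network & mask) | host_bits;

    if (dest == 0xFFFFFFFFu)                 /* 255.255.255.255 */
        return IPV4_BROADCAST;
    /* /31 and /32 subnets have no broadcast address, hence the guard. */
    if (host_bits > 1u && dest == subnet_broadcast)
        return IPV4_BROADCAST;
    if ((dest & 0xF0000000u) == 0xE0000000u) /* 224.0.0.0 to 239.255.255.255 */
        return IPV4_MULTICAST;
    return IPV4_UNICAST;
}
```

On the 10.0.0.0 network with mask 255.255.255.0, the address 10.0.0.255 (0x0A0000FF) is classified as a broadcast, while 10.0.0.105 is ordinary unicast.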

It's worth noting that these alternative addressing methods don't generally work over the broader internet. Imagine if one peer was able to broadcast a packet to every connected internet device. It would be a mess!

If you can use IP multicasting on your local network, though, it is worthwhile to implement it. Sending one IP-level multicast conserves bandwidth compared to sending the same unicast message multiple times.

However, multicasting is often done at the application level. That is, when the application wants to deliver the same message to several recipients, it sends the message multiple times – once to each recipient. In Chapter 3, An In-Depth Overview of TCP Connections, we build a chat room. This chat room could be said to use application-level multicasting, but it does not take advantage of IP multicasting.

We've covered how messages are routed through a network. Now, let's see how a message knows which application is responsible for it once it arrives at a specific system.