Hands-On Dark Web Analysis

By: Sion Retzkin

Overview of this book

The World Wide Web is divided into three main areas: the Surface Web, the Deep Web, and the Dark Web. The Deep Web and the Dark Web are the two areas that are not accessible through standard search engines or browsers, which makes it extremely important for security professionals to understand them in order to analyze the security of their organization. This book first introduces the concepts of the Deep Web and the Dark Web and their significance in the security sector. It then dives into installing operating systems and the Tor Browser for privacy, security, and anonymity while accessing them. Along the way, we share best practices for using these tools to best effect. By the end of this book, you will have hands-on experience working with the Deep Web and the Dark Web for security analysis.

The origin of the internet

Many of you might have heard that the internet was originally created by DARPA, the Defense Advanced Research Projects Agency, which is part of the United States Department of Defense and is responsible for the development of new technologies for use by the US military.

But this was not necessarily the first appearance of the internet. Back then, it was more of an intranet, as all the computers were on the same network. It all depends on how you define the internet. Nowadays, the internet is defined as a network of networks: multiple interconnected computer networks that provide communication and information capabilities using standardized communication protocols such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
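To make the "standardized protocols" idea concrete, here is a minimal sketch of two programs talking to each other over TCP on one machine, using Python's standard socket module. The echo behavior and port handling are invented for illustration, not taken from the book:

```python
import socket
import threading

def run_server(ready, port_holder):
    # Minimal TCP server: accept one connection, echo the message back.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
    port_holder["port"] = srv.getsockname()[1]
    srv.listen(1)
    ready.set()                          # signal the client that we're listening
    conn, _ = srv.accept()
    data = conn.recv(1024)
    conn.sendall(b"echo: " + data)       # TCP delivers the bytes intact, in order
    conn.close()
    srv.close()

ready = threading.Event()
port_holder = {}
t = threading.Thread(target=run_server, args=(ready, port_holder))
t.start()
ready.wait()

# Client side: connect, send, receive -- a standardized protocol at work.
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", port_holder["port"]))
cli.sendall(b"hello")
reply = cli.recv(1024)
cli.close()
t.join()
print(reply.decode())  # echo: hello
```

Both endpoints agree on TCP's rules (connection setup, ordered delivery) without knowing anything about each other's hardware, which is exactly what lets dissimilar networks interconnect.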

Some say that the internet began after packet switching technology was created, others say it was after TCP/IP was launched, and yet others claim that the origins of the internet were in the UK, not in the US.

The starting date of the internet (or rather ARPANET, as it was known then) is also inconclusive. Although most people agree that it was launched in 1969, there is evidence that its origins go back even earlier.

The following diagram is a model of ARPANET from 1982:

In August 1962, J.C.R. Licklider of MIT began discussing his Galactic Network concept. His idea was to create a globally interconnected set of computers through which anyone could quickly access data and programs from anywhere in the world.

This was based on packet switching technology, a method by which messages travel from point to point across a network. He even reached the point of implementing a packet switch connecting a set of host computers. Packet switching had already been proposed as a concept in 1965 by an Englishman named Donald Davies, but his project never received funding. ARPANET adopted his ideas and continued from there.

Additionally, a Frenchman called Louis Pouzin introduced the idea of datagrams (data + telegram—a basic transfer unit in a packet-switched network) around that time.

In 1968, the National Physical Laboratory in the UK set up the first test network for packet switching. This inspired DARPA to work on ARPANET.

Whatever the origin of the internet, the original intent of ARPANET was to allow people in remote locations to use the processing power of remote computers for scientific calculations.

In December 1970, the initial ARPANET host-to-host protocol, called the Network Control Protocol (NCP), was added to the network, and in 1972, email technology was introduced.

Additionally, in 1972, the concept of open-architecture networking was introduced, providing the basis for networks of different technologies to be able to connect (this also sowed the seeds for the OSI model in the future).

In 1978, TCP/IPv4 was released, and it was added to ARPANET in 1983. This was the first true internet, the basis of the internet we know and love today.

So, what is the internet?

As shown in the above diagram, it's a vast, global network of interconnected networks that uses TCP/IP to communicate.

There are literally millions upon millions of networks connected, and nowadays these networks are no longer only computer-based. Internet of Things (IoT) technology connects devices that aren't computers to the internet as well.

You may also be familiar with the term World Wide Web.

Also called simply the web, it's a way of accessing information, such as web resources and documents, via Uniform Resource Locators (URLs) and hypertext links, using protocols (such as the Hypertext Transfer Protocol (HTTP)) that allow applications (such as browsers) to access and share information.

A protocol is a set of rules that dictates how to format, transmit, and receive data so network devices can communicate, regardless of their infrastructure, design, or standards.
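As a sketch of how URLs and protocols fit together, the following Python snippet breaks a hypothetical URL into its parts and composes the plain-text HTTP request a browser might send for it. The URL and the exact header set are illustrative, not from the book:

```python
from urllib.parse import urlparse

# A URL names a resource; its parts tell the client which protocol,
# host, and path to use (the URL here is hypothetical).
url = urlparse("http://www.example.com:8080/docs/index.html?lang=en")
print(url.scheme)    # http
print(url.hostname)  # www.example.com
print(url.port)      # 8080
print(url.path)      # /docs/index.html

# HTTP itself is plain text: a request line, headers, then a blank line.
request = (
    "GET {path} HTTP/1.1\r\n"
    "Host: {host}\r\n"
    "Connection: close\r\n"
    "\r\n"
).format(path=url.path, host=url.hostname)
print(request)
```

Because both sides follow the same formatting rules, any browser can talk to any web server, which is the point of the protocol definition above.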

The first web browser was created in 1990 by English scientist Tim Berners-Lee while working at CERN in Switzerland.

The internet is the infrastructure upon which the World Wide Web can be used.

Now that we understand what the internet is and how it began, let's talk about the Deep Web.

As you know, Google and other search engines (Bing, Yahoo, and so on) index sites by crawling them and incorporating the crawled data into their index servers. The search engines then organize the data by context, according to their own logic, and feed it into the algorithms that make up the search engine.
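The crawl-and-index process can be sketched with a toy inverted index, the core data structure behind search. The pages, URLs, and text below are made up for illustration:

```python
# A toy "search engine index": crawl a set of in-memory pages and build
# an inverted index mapping each word to the pages that contain it.
pages = {
    "http://example.com/a": "the deep web is not indexed",
    "http://example.com/b": "search engines crawl the surface web",
}

index = {}
for page_url, text in pages.items():
    for word in text.split():
        index.setdefault(word, set()).add(page_url)

# A query is answered by intersecting the posting sets of its words.
def search(query):
    results = None
    for word in query.split():
        postings = index.get(word, set())
        results = postings if results is None else results & postings
    return sorted(results or [])

print(search("web"))       # both pages contain "web"
print(search("deep web"))  # only page /a contains both words
```

The key consequence for this chapter: a page only appears in results if a crawler has seen it. Anything a crawler cannot reach never enters the index, and that unreachable content is the Deep Web.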

This data, indexed by a search engine, and accessed via the World Wide Web (also called the Surface Web), is actually only a small part of the entire internet.

Many people like to view the internet as an iceberg at sea, with only part of it visible above the surface of the water:

Surface Web, Deep Web, and Dark Net

As you can see in the preceding diagram, the Surface Web is the tip that's visible above the water.

This area can be indexed by search engines, and contains all the publicly available information, documents, and content.

Sadly, many people who aren't tech-savvy or security-aware, and even some companies, allow their data to be indexed, which provides information to attackers, helping them gain access, locate files and data, and more.

For example, an attacker might want to cause reputational damage to a certain business. Performing reconnaissance, the attacker discovers a weakness to be exploited: the business's backup procedure saves a backup of its customer database to its public website for 24 hours before moving it to a secure location. This allows the backup to be crawled by search engines. The attacker can use a search engine to find the database file on the business's website; since the website is indexed, the search engine provides the results. The attacker can then simply download the file(s) and use them for malicious purposes.
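One partial safeguard against scenarios like this is a robots.txt file, which asks crawlers not to index given paths. The sketch below uses Python's standard urllib.robotparser to show the effect of such a rule; the file contents and paths are hypothetical. Note that robots.txt only deters well-behaved crawlers, and itself advertises the listed paths to anyone who reads it, so it is no substitute for real access control (such as never placing backups in a public location at all):

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt asking all crawlers to skip a backups folder.
robots_txt = """\
User-agent: *
Disallow: /backups/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler checks each URL against the rules before fetching it.
print(rp.can_fetch("*", "http://example.com/backups/customers.db"))  # False
print(rp.can_fetch("*", "http://example.com/index.html"))            # True
```

A compliant search engine would skip the backups folder and never index the database file, but an attacker's own crawler is free to ignore the rule, which is why the file must not be publicly reachable in the first place.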