Book Image

R Data Structures and Algorithms

By : PKS Prakash, Achyutuni Sri Krishna Rao
Book Image

R Data Structures and Algorithms

By: PKS Prakash, Achyutuni Sri Krishna Rao

Overview of this book

In this book, we cover not only classical data structures, but also functional data structures. We begin by answering the fundamental question: why data structures? We then move on to cover the relationship between data structures and algorithms, followed by an analysis and evaluation of algorithms. We introduce the fundamentals of data structures, such as lists, stacks, queues, and dictionaries, using real-world examples. We also cover topics such as indexing, sorting, and searching in depth. Later on, you will be exposed to advanced topics such as graph data structures, dynamic programming, and randomized algorithms. You will come to appreciate the intricacies of high performance and scalable programming using R. We also cover special R data structures such as vectors, data frames, and atomic vectors. With this easy-to-read book, you will be able to understand the power of linked lists, double linked lists, and circular linked lists. We will also explore the application of binary search and will go in depth into sorting algorithms such as bubble sort, selection sort, insertion sort, and merge sort.
Table of Contents (17 chapters)
R Data Structures and Algorithms
Credits
About the Authors
Acknowledgments
About the Reviewer
www.PacktPub.com
Preface

Abstract data type and data structure


The abstract data type (ADT) is used to define high-level features and operations for data structure, and is required before we dig down into details of implementation of data structure. For example, before implementing a linked list, it would be good to know what we want to do with the defined linked list, such as:

  • I should be able to insert an item into the linked list

  • I should be able to delete an item from the linked list

  • I should be able to search an item in the linked list

  • I should be able to see whether the linked list is empty or not

The defined ADT will then be utilized to define the implementation strategy. We will discuss ADT for different data structures in more detail later in this book. Before getting into definitions of ADT, it's important to understand data types and their characteristics, as they will be utilized to build an ecosystem for the data structures.

Data type is a way to classify various types of data such as Boolean, integer, float, and string. To efficiently classify a dataset, the following characteristics are needed in any data type:

  • Atomic: It should define a single concept

  • Traceable: It should be able to map with the same data type

  • Accurate: It should be unambiguous

  • Clear and concise: It should be understandable

Data types can be classified into two types:

  • Built-in data type

  • Derived data type

Data types for which a language has built-in support are known as built-in data types. For example, R supports the following data types:

  • Integers

  • Float

  • Boolean

  • Character

Data types which are obtained by integrating the built-in data types with associated operations such as insertion, deletion, sorting, merging, and so on are referred to as derived data types, such as:

  • List

  • Array

  • Stack

  • Queue

Derived data types or data structures are studied on two components:

  • Abstract data type or mathematical/logical models

  • Programming implementation

ADT is the realization of a data type in software. In ADT, we usually look into high-level features and operations used by users for a data structure, but do not know how these features run in the background. For example, say a user using banking software with search functionality is searching for the transaction history of Mr. Smith, using the search feature of the software. The user does not have any idea about the operations performed, or the implementation details of the data structures. Thus, the behavior of ADT is managed by input and output only:

Figure 1.3: Framework for ADT

Thus, ADT does not know how the data type is implemented, as these details are hidden by the user and protected from outside access, a concept referred to as encapsulation. A data structure is the implementation part of ADT, which can be implemented in any programming language. ADT can be achieved by multiple implementation strategies, as shown in Figure 1.4:

Figure 1.4: Illustration of stack and queue implementation using array with integer data type

The abstraction provided by ADT helps to manage programming complexity. The ADT is also referred to as logical form, as it decides the type and operations required for implementation. The ADT is implemented by using a particular type of data structure. The data structure used to implement ADT is referred to as the physical form of data type, as it manages storage using the implementation subroutine.