
Hadoop, MapReduce, and HDFS


As discussed in the previous sections, rapidly growing data storage, analysis, and processing requirements create the need to mine business-critical information from huge volumes of data held in storage clusters and data-intensive applications. Scalability, high availability, fault tolerance, data distribution, parallel processing, and load balancing are the expected features of such a system.

These requirements are addressed by the MapReduce programming model introduced by Google.
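The core idea of the model can be illustrated in a few lines of plain Java 9, independent of any framework: a map phase turns each input record into (key, value) pairs, a shuffle groups the pairs by key, and a reduce phase aggregates each group. The following is only a single-JVM sketch (the input lines and class name are made up for illustration); the value of MapReduce comes from running these phases in parallel across a cluster:

    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class MapReduceSketch {
        public static void main(String[] args) {
            // Hypothetical input records; in a real cluster these would be
            // blocks of a large file spread across many machines.
            List<String> lines = List.of("to be or not to be", "to be is to do");

            Map<String, Integer> counts = lines.stream()
                // Map phase: emit a (word, 1) pair for every token.
                .flatMap(line -> Arrays.stream(line.split("\\s+"))
                                       .map(word -> Map.entry(word, 1)))
                // Shuffle and reduce: group pairs by word and sum the counts.
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                         Collectors.summingInt(Map.Entry::getValue)));

            System.out.println(counts); // e.g. {not=1, be=3, or=1, do=1, to=4, is=1}
        }
    }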

Hadoop

Hadoop is the most prevalent open source implementation of the MapReduce programming model. Apache Hadoop is a scalable and reliable software framework for parallel and distributed computing. Instead of depending on expensive hardware to store and process huge volumes of data, Hadoop enables parallel processing of big data on inexpensive commodity hardware. The following diagram represents the components of the Hadoop architecture:

Apache Hadoop contains five...
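To make the framework concrete, the following is a minimal sketch of the canonical word-count job written against Hadoop's org.apache.hadoop.mapreduce API. The WordCount class name and the command-line input/output paths are illustrative rather than taken from this book; the API calls themselves are standard Hadoop:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Map phase: for each line of input, emit a (word, 1) pair.
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce phase: sum the counts for each word after the shuffle.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            // The reducer also serves as a combiner for local pre-aggregation.
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // Input and output HDFS paths are passed on the command line.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Packaged as a JAR, such a job would typically be submitted with hadoop jar wordcount.jar WordCount <input-dir> <output-dir>, and the framework handles splitting the input, scheduling map and reduce tasks across the cluster, and re-running tasks on failure.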