Distributed Computing in Java 9

Overview of this book

Distributed computing is the practice of splitting a large computation into multiple smaller logical activities that are performed by different systems, which maximizes performance at a lower infrastructure cost. This book teaches you how to improve the performance of traditional applications through parallelism and optimized resource utilization in Java 9. After a brief introduction to the fundamentals of distributed and parallel computing, the book explains the different ways of communicating with remote systems and objects in a distributed architecture. You will learn about asynchronous messaging with enterprise integration and related patterns, how to handle large amounts of data using HPC, and how to implement distributed computing for databases. The book then explains how to deploy distributed applications on different cloud platforms and how to develop self-contained applications. You will also learn about big data technologies and how they contribute to distributed computing. The book concludes with detailed coverage of the testing, debugging, troubleshooting, and security aspects of distributed applications, so that the programs you build are robust, efficient, and secure.

Title Page

Credits

About the Author

About the Reviewer

www.PacktPub.com

Preface

Customer Feedback

Free Chapter

Quick Start to Distributed Computing

Parallel computing

Distributed computing

Parallel versus distributed computing

Design considerations for distributed systems

Summary

Communication between Distributed Applications

Client-server communication

RMI, CORBA, and JavaSpaces

RMI

JavaSpaces

Enterprise Messaging

EMS

JMS

Web services

Enterprise integration patterns

HPC Cluster Computing

Era of computing

Commanding parallel system architectures

Java support for high-performance computing

Java support for parallel programming models

Java 9 updates for processing an API

Summary

Distributed Databases

Distributed and decentralized databases

Distributed database environments

Distributed database setup methodologies

Distributed DBMS architecture

Java Database Connectivity

Summary

Cloud and Distributed Computing

What is cloud computing?

Features of cloud computing

Cloud versus distributed computing

Cloud service providers

AWS

Docker CaaS

CaaS

Java 9 support

Summary

Big Data Analytics

What is big data?

Big data characteristics

NoSQL databases

Hadoop, MapReduce, and HDFS

Distributed computing for big data

ZooKeeper for distributed computing

Summary

Testing, Debugging, and Troubleshooting

Challenges in testing distributed applications

Standard testing approach in software systems

Cloud distributed application testing

Latest tools for testing Java distributed applications

Debugging and troubleshooting distributed applications

Summary

Security

Security issues and concerns

Two-way Secure Sockets Layer (SSL) implementation

Cloud computing security

Security enhancements in Java 9

Summary

Design considerations for distributed systems

The following characteristics of distributed systems should be considered when designing a project for a distributed environment:

No global clock: Because they are spread across the world, distributed systems cannot be expected to share a common clock, which leads to intrinsic asynchrony between the processors performing the computation. Coordination in a distributed system usually depends on a shared idea of the time at which a program or business state occurs. Without a global clock, however, there is a limit to the accuracy with which the computers in the network can synchronize their clocks to reflect when a given program execution happened. Because of this limitation, the systems in the network are expected to coordinate through messages rather than time-based events.

Geographical distribution: The individual systems taking part in a distributed system are expected to be connected through a network; previously this was a Wide Area Network (WAN), and now it is often a Network of Workstations/Cluster of Workstations (NOW/COW). An in-house distributed system is expected to be configured within a LAN. NOW is becoming widespread because of its low-cost, high-speed, off-the-shelf processing capability. Popular examples of NOW architectures include the Google search engine and Amazon.

No shared memory: A key feature of distributed computing and of the message-passing model of communication is the absence of shared memory, which also implies the absence of a common physical clock.

Independence and heterogeneity: The processors in a distributed system are loosely coupled, so each has its own capabilities in terms of speed and method of execution, and they may run different operating systems. They are not expected to be part of a dedicated system; however, they cooperate with one another by exposing services and/or executing tasks together as subtasks.

Fail-over mechanism: Computer systems fail often, and it is the designer's responsibility to define the expected behavior when possible failures occur. Distributed systems can fail at the integration level as well as in individual subsystems. A fault in the network can isolate an individual computer or a group of computers in the distributed system, even though they might still be executing the programs they are expected to execute. In reality, the individual programs may not be able to detect such network failures or timeouts. Similarly, the failure of a particular computer, for example a system that terminates abruptly because of a program or system failure, may not immediately be known to the other systems and components with which the failed computer usually communicates. The consequences of this characteristic of distributed systems have to be captured in the system design.

Security concerns: Distributed systems set up on the shared Internet are more exposed to attacks, unauthorized access, and vulnerabilities.
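The "no global clock" point above is often handled by ordering events with logical timestamps carried in messages. The following is a minimal sketch of a Lamport logical clock, a standard technique that this text does not itself describe; the class and method names (LamportClock, tick, onReceive) are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: a Lamport logical clock (names are hypothetical).
final class LamportClock {
  private final AtomicLong counter = new AtomicLong();

  /** Local event or message send: advance the clock and return the new timestamp. */
  long tick() {
    return counter.incrementAndGet();
  }

  /** Message receipt: move past the sender's timestamp, then count the receive event. */
  long onReceive(long senderTimestamp) {
    return counter.updateAndGet(local -> Math.max(local, senderTimestamp) + 1);
  }

  long current() {
    return counter.get();
  }

  public static void main(String[] args) {
    LamportClock nodeA = new LamportClock();
    LamportClock nodeB = new LamportClock();
    long sent = nodeA.tick();              // node A stamps an outgoing message
    long received = nodeB.onReceive(sent); // node B orders the receipt after the send
    System.out.println("sent=" + sent + ", received=" + received);
  }
}
```

Each node keeps its own counter, so no physical clocks need to agree; the only guarantee is that a receive is always stamped later than the corresponding send.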

Distributed systems are becoming increasingly popular because they allow the pooling of resources, including CPU cycles, data storage, devices, and services, which are becoming increasingly economical. Distributed systems are more reliable because they allow replication of resources and services, which reduces service outages due to individual system failures. The cost, speed, and availability of the Internet make it a decent platform on which to maintain distributed systems.
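To make the replication point concrete, here is a minimal sketch of a client that falls back to another replica when one fails. It assumes each replica can be modeled as a Supplier that throws a RuntimeException when it is unreachable; ReplicaFailover and callFirstAvailable are hypothetical names, and a real system would add timeouts and retries.

```java
import java.util.List;
import java.util.function.Supplier;

// Illustrative sketch only: replicas are modeled as in-process suppliers (an assumption).
final class ReplicaFailover {

  /** Tries each replica in order and returns the first successful result. */
  static <T> T callFirstAvailable(List<Supplier<T>> replicas) {
    RuntimeException lastFailure = null;
    for (Supplier<T> replica : replicas) {
      try {
        return replica.get();
      } catch (RuntimeException e) {
        lastFailure = e; // remember the failure and move on to the next replica
      }
    }
    throw new IllegalStateException("All replicas failed", lastFailure);
  }

  public static void main(String[] args) {
    Supplier<String> down = () -> {
      throw new IllegalStateException("replica-1 unreachable");
    };
    Supplier<String> up = () -> "response from replica-2";
    System.out.println(callFirstAvailable(List.of(down, up)));
  }
}
```

The caller sees a single successful response even though the first replica failed, which is the outage reduction that replication is meant to provide.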

Java support

From standalone applications to web applications to the sophisticated cloud integration of enterprises, Java has kept updating itself to accommodate features that support the change. In particular, frameworks such as Spring offer modules such as Spring Boot, Spring Batch, and Spring Integration, which cover most cloud integration features. As a language, Java has great support for programs written using multithreaded distributed objects. In this model, an application contains numerous heavyweight processes that communicate using messaging or Remote Method Invocation (RMI). Each heavyweight process contains several lightweight processes, which are called threads in Java. Threads can communicate through shared memory. Such a software architecture reflects hardware that is configured to be extensively accessible.
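As a small illustration of threads communicating through shared memory inside one process, the sketch below lets several threads update a single counter object. The class name SharedCounter and the helper runWorkers are hypothetical; synchronized methods are only one of several ways to guard shared state.

```java
// Illustrative sketch only: threads in one process sharing a guarded counter.
final class SharedCounter {
  private int count; // shared memory, guarded by this object's monitor

  synchronized void increment() {
    count++;
  }

  synchronized int get() {
    return count;
  }

  /** Starts the given number of threads, each incrementing the same counter. */
  static int runWorkers(int threadCount, int incrementsPerThread) {
    SharedCounter counter = new SharedCounter();
    Thread[] workers = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++) {
      workers[i] = new Thread(() -> {
        for (int n = 0; n < incrementsPerThread; n++) {
          counter.increment();
        }
      });
      workers[i].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join(); // wait until this worker has finished
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return counter.get();
  }

  public static void main(String[] args) {
    System.out.println("Final count = " + runWorkers(4, 1000));
  }
}
```

Without the synchronized keyword the increments could interleave and updates could be lost, which is why shared memory always needs some form of synchronization.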

If we assume that there is at most one thread per process, or ignore the parallelism within a process, we arrive at the usual model of a distributed system. The purpose of keeping the model logically simple is that the distributed program becomes more object-oriented, because data in a remote object can be accessed only through explicit messaging or a remote procedure call (RPC).
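The idea that an object's data is reachable only through explicit messages can be sketched inside a single JVM, with a queue standing in for the network. This is an assumption made purely for illustration: MessagePassingCounter and AddRequest are hypothetical names, and a real distributed program would use RMI, RPC, or a message broker instead of an in-memory queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: an in-memory queue stands in for the network.
final class MessagePassingCounter {

  /** A request message: the amount to add plus a future that carries the reply. */
  private static final class AddRequest {
    final int amount;
    final CompletableFuture<Integer> reply = new CompletableFuture<>();

    AddRequest(int amount) {
      this.amount = amount;
    }
  }

  private final BlockingQueue<AddRequest> mailbox = new LinkedBlockingQueue<>();
  private int total; // read and written only by the owner thread

  MessagePassingCounter() {
    Thread owner = new Thread(() -> {
      try {
        while (true) {
          AddRequest request = mailbox.take(); // wait for the next message
          total += request.amount;
          request.reply.complete(total);       // send the reply back
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    owner.setDaemon(true); // do not keep the JVM alive for this sketch
    owner.start();
  }

  /** Sends an add message and blocks until the owner replies with the new total. */
  int add(int amount) {
    AddRequest request = new AddRequest(amount);
    mailbox.add(request);
    return request.reply.join();
  }

  public static void main(String[] args) {
    MessagePassingCounter counter = new MessagePassingCounter();
    counter.add(5);
    System.out.println("Total after two messages = " + counter.add(7));
  }
}
```

Callers never touch the total field directly, so no locking is needed around it; all coordination happens through the messages.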

The object-oriented model promotes reusability as well as design simplicity. Furthermore, a large shared data structure requires shared processing, which is possible through object orientation and by making the process of execution multithreaded. The programmer carries the responsibility of splitting the larger data structure across multiple heavyweight processes.
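Splitting a large data structure among workers can be sketched as follows. For brevity the workers here are threads in a pool rather than separate heavyweight processes, which is a simplifying assumption; PartitionedSum and its sum method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

// Illustrative sketch only: pool threads stand in for separate processes.
final class PartitionedSum {

  /** Splits the array into roughly equal chunks and sums each chunk in its own task. */
  static long sum(int[] data, int parts) {
    int chunk = Math.max(1, (data.length + parts - 1) / parts);
    List<Callable<Long>> tasks = new ArrayList<>();
    for (int start = 0; start < data.length; start += chunk) {
      final int from = start;
      final int to = Math.min(data.length, start + chunk);
      tasks.add(() -> {
        long partial = 0;
        for (int i = from; i < to; i++) {
          partial += data[i];
        }
        return partial;
      });
    }
    ExecutorService pool = Executors.newFixedThreadPool(parts);
    try {
      long total = 0;
      for (Future<Long> result : pool.invokeAll(tasks)) {
        total += result.get(); // combine the partial sums
      }
      return total;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while summing", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("A partition failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    int[] data = IntStream.rangeClosed(1, 100).toArray();
    System.out.println("Sum = " + sum(data, 4));
  }
}
```

The partitioning logic is written by the programmer, not the runtime, which matches the responsibility described above.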

A programming language that supports concurrent programming should be able to specify the process structure and how several processes communicate and synchronize with each other. There are many ways a program can specify the process structure or create a new process. For example, UNIX processes are tree structured, and each process has a unique process ID (pid). fork and wait are the system calls used to create and synchronize processes. The fork call creates a child process from a parent process with a copy of the parent's address space:

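#include <iostream>
#include <unistd.h>    // fork()
#include <sys/wait.h>  // wait()

int main() {
  pid_t pid = fork();  // returns 0 in the child and the child's pid in the parent
  if (pid > 0) {
    std::cout << "This is a parent process" << std::endl;
    wait(nullptr);     // the parent waits for the child to finish
  } else if (pid == 0) {
    std::cout << "This is a child process" << std::endl;
  }
  return 0;
}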
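Java has no direct equivalent of fork, but a program can start and wait for a child process with ProcessBuilder, and Java 9 added ProcessHandle and Process.pid() for reading process IDs. The sketch below is illustrative: ProcessDemo, currentPid, and runChildAndWait are hypothetical names, and launching the java executable found under java.home is just a convenient assumption for a runnable example.

```java
import java.io.IOException;
import java.nio.file.Paths;

// Illustrative sketch only: starting and waiting for a child process from Java 9+.
final class ProcessDemo {

  /** Returns the operating system process ID of the running JVM. */
  static long currentPid() {
    return ProcessHandle.current().pid();
  }

  /** Starts a child process, waits for it to finish, and returns its exit code. */
  static int runChildAndWait(String... command) throws IOException, InterruptedException {
    Process child = new ProcessBuilder(command).inheritIO().start();
    System.out.println("Parent " + currentPid() + " started child " + child.pid());
    return child.waitFor(); // plays the role of wait() in the UNIX example
  }

  public static void main(String[] args) throws Exception {
    // Assumption: the JDK launcher lives under java.home/bin on this machine.
    String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
    int exitCode = runChildAndWait(java, "-version");
    System.out.println("Child exited with code " + exitCode);
  }
}
```

Unlike fork, the child started here runs a new program and does not receive a copy of the parent's address space.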

Java has a predefined class called Thread that enables concurrency by creating thread objects. A class that should be executed in a separate thread can extend the Thread class and override the run() method; calling the start() method launches that thread:

public class NewThread extends Thread {
  public void run() {
    System.out.println("New Thread executing!");
  }
  public static void main(String[] args) {
    Thread t1 = new NewThread();
    t1.start();
  }
}

When a class has to extend another class and also execute as a new thread, Java supports this through the Runnable interface, as shown in the following example:

public class Animal {
  String name;
  public Animal(String name) {
    this.name = name;
  }
  public void setName(String name) {
    this.name = name;
  }
  public String getName() {
    return this.name;
  }
}
public class Mammal extends Animal implements Runnable {
  public Mammal(String name) {
    super(name);
  }
  public void run() {
    for (int i = 0; i < 100; i++) {
      System.out.println("The name of the Animal is : " + this.getName());
    }
  }
  public static void main(String[] args) {
    // Declaring the variables as Mammal (a Runnable) avoids the explicit casts
    Mammal firstAnimal = new Mammal("Tiger");
    Thread threadOne = new Thread(firstAnimal);
    threadOne.start();
    Mammal secondAnimal = new Mammal("Elephant");
    Thread threadTwo = new Thread(secondAnimal);
    threadTwo.start();
  }
}

In the following Fibonacci example, a thread waits for other threads to finish executing by calling join(). Threads can also carry a priority, which indicates the relative importance of one thread over another when they are scheduled:

public class Fib extends Thread
{
  private int x;
  public int answer;
  public Fib(int x) {
    this.x = x;
  }
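  public void run() {
    if (x <= 2) {
      answer = 1;
    } else {
      try {
        Fib f1 = new Fib(x - 1);
        Fib f2 = new Fib(x - 2);
        f1.start();
        f2.start();
        // Wait for both subproblems before combining their results
        f1.join();
        f2.join();
        answer = f1.answer + f2.answer;
      } catch (InterruptedException ex) {
        // Restore the interrupt flag instead of silently swallowing it
        Thread.currentThread().interrupt();
      }
    }
  }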
  public static void main(String[] args)  throws Exception
  {
    try {
      Fib f = new Fib( Integer.parseInt(args[0]) );
      f.start();
      f.join();
      System.out.println(f.answer);
    }
    catch(Exception ex) {
      System.err.println("usage: java Fib NUMBER");
    }
  }
}
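The thread-per-call version above creates a new thread for every recursive step, so the number of threads grows very quickly with the input. One possible alternative, sketched here with the fork/join framework rather than taken from the text, runs the same recursion on a shared pool; FibTask and its fib helper are hypothetical names.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch only: the same recursion on a shared fork/join pool.
final class FibTask extends RecursiveTask<Integer> {
  private final int x;

  FibTask(int x) {
    this.x = x;
  }

  @Override
  protected Integer compute() {
    if (x <= 2) {
      return 1;
    }
    FibTask f1 = new FibTask(x - 1);
    f1.fork();                        // schedule the first subproblem on the pool
    FibTask f2 = new FibTask(x - 2);
    return f2.compute() + f1.join();  // compute the second here, then wait for the first
  }

  /** Convenience entry point that runs the task on the common pool. */
  static int fib(int x) {
    return ForkJoinPool.commonPool().invoke(new FibTask(x));
  }

  public static void main(String[] args) {
    System.out.println(fib(10));
  }
}
```

The structure mirrors the Thread example (fork in place of start, join in both), but tasks are lightweight objects scheduled on a fixed number of worker threads.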

Since Java 8, the Callable interface (available since Java 5) carries the @FunctionalInterface annotation. As a result, we can create Callable objects using lambda expressions as follows:

Callable<Integer> callableObject = () -> { return 5 + 9; };

The preceding expression is equivalent to the following code:

Callable<Integer> callableObject = new Callable<Integer>() {
  @Override
  public Integer call() throws Exception {
    return 5 + 9;
  }
};

The following is a complete example that uses the Callable and Future interfaces with lambda expressions to handle concurrent processing in Java 9:

package threads;

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class JavaCallableThreads {

  public static void main(String[] args) {
    final List<Integer> numbers = Arrays.asList(1,2,3,4,5);
    Callable<Integer> callableObject = () -> {
      int sum = numbers.stream().mapToInt(i -> i.intValue()).sum();
      return sum;
    };
    ExecutorService exService = Executors.newSingleThreadExecutor();
    Future<Integer> futureObj = exService.submit(callableObject);
    Integer futureSum = 0;
    try {
      // Block until the task completes and fetch its result
      futureSum = futureObj.get();
    } catch (InterruptedException e) {
      e.printStackTrace();
    } catch (ExecutionException e) {
      e.printStackTrace();
    } finally {
      // Without shutdown, the executor's worker thread keeps the JVM alive
      exService.shutdown();
    }
    System.out.println("Sum returned = " + futureSum);
  }

}
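Java 9 itself added timeout helpers to CompletableFuture, namely orTimeout and completeOnTimeout. The following sketch shows completeOnTimeout supplying a fallback value when an asynchronous computation takes too long; the class TimeoutFallback, the method sumOrFallback, and the simulated delay are hypothetical and exist only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a fallback value when an async task is too slow (Java 9+).
final class TimeoutFallback {

  /** Computes a sum asynchronously, or returns the fallback if the timeout expires first. */
  static int sumOrFallback(long workMillis, long timeoutMillis, int fallback) {
    return CompletableFuture.supplyAsync(() -> {
          pause(workMillis); // simulated slow work, for example a remote call
          return 1 + 2 + 3 + 4 + 5;
        })
        .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
        .join();
  }

  private static void pause(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("Fast task: " + sumOrFallback(0, 2000, -1));
    System.out.println("Slow task: " + sumOrFallback(2000, 50, -1));
  }
}
```

In a distributed setting the slow path would typically be a remote call, and the fallback lets the caller continue instead of blocking indefinitely as Future.get() without a timeout would.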

Modern Java enterprise applications have evolved through messaging (using message queues), web services, and microservices-based distributed applications that are packaged in containers such as Docker and deployed on cloud platforms such as Red Hat OpenShift, Amazon Web Services (AWS), and Google App Engine, often orchestrated with Kubernetes.

We will discuss the Java 9 support for such application development and deployment in detail in the coming chapters.
