Apache Ignite Quick Start Guide

By: Sujoy Acharya

Overview of this book

Apache Ignite is a distributed in-memory platform designed to scale and process large volumes of data. It can be integrated with microservices as well as monolithic systems, and can be used as a scalable, highly available, and performant deployment platform for microservices. This book will teach you to use Apache Ignite for building a high-performance, scalable, highly available system architecture with data integrity. The book takes you through the basics of Apache Ignite and in-memory technologies. You will learn about installing and clustering Ignite nodes, caching topologies, and various caching strategies, such as cache-aside, read-through, write-through, and write-behind. Next, you will delve into detailed aspects of Ignite’s data grid: web session clustering and querying data. You will learn how to process large volumes of data using the compute grid and Ignite’s map-reduce and executor service. You will learn about the memory architecture of Apache Ignite and how to monitor memory and caches. You will use Ignite for complex event processing, event streaming, and time-series predictions of opportunities and threats. Additionally, you will go through off-heap and on-heap caching, swapping, and native and Spring framework integration with Apache Ignite. By the end of this book, you will be confident with all the features of Apache Ignite 2.x that can be used to build a high-performance system architecture.
Table of Contents (9 chapters)

Tuning performance

Performance tuning is important for improving the user experience. Apache Ignite makes network calls, communicates with other nodes, serializes objects, swaps entries from RAM to disk, rebalances data, stores objects on-heap and off-heap, creates WAL files, and more. Therefore, we need to keep an eye on different areas to improve the overall system performance. Ignite provides the following tips (https://apacheignite.readme.io/docs/performance-tips) to tune different areas:

  • Turn off backups
  • Tune durable memory
  • Tune data rebalancing
  • Configure thread pools
  • Use collocated computations
  • Use data streamer
  • Batch up your messages
  • Tune garbage collection
  • Disable internal event notifications
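
A few of the tips above can be sketched as a Java configuration. This is a minimal, hedged sketch and not an example from the book's code bundle: it assumes the Ignite 2.x ignite-core library is on the classpath, and the class name TuningSketch, the cache name tunedCache, the region sizes, and the thread pool size are illustrative assumptions that would need tuning for a real workload.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical class name; all sizes and counts below are illustrative assumptions.
public class TuningSketch {
    public static void main(String[] args) {
        // Tune durable memory: bound the default data region.
        DataRegionConfiguration region = new DataRegionConfiguration();
        region.setInitialSize(256L * 1024 * 1024);   // 256 MB to start
        region.setMaxSize(1024L * 1024 * 1024);      // grow up to 1 GB

        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.setDefaultDataRegionConfiguration(region);

        // Turn off backups: a partitioned cache that keeps no backup copies.
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("tunedCache");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED);
        cacheCfg.setBackups(0);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storage);
        cfg.setCacheConfiguration(cacheCfg);
        // Configure thread pools: size the public pool used for compute jobs.
        cfg.setPublicThreadPoolSize(16);

        // Use data streamer: bulk-load entries instead of calling put() one by one.
        // Resources close in reverse order, so the streamer flushes before the node stops.
        try (Ignite ignite = Ignition.start(cfg);
             IgniteDataStreamer<Integer, String> streamer = ignite.dataStreamer("tunedCache")) {
            for (int i = 0; i < 10_000; i++) {
                streamer.addData(i, "value-" + i);
            }
        }
    }
}
```

Running without backups trades fault tolerance for speed, since losing a node also loses the partitions it owned, so this setting only fits data that can be reloaded from another source.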

Try to avoid distributed joins and take a key/value-based approach instead. The code bundle has an example that shows the difference between distributed SQL queries and key/value-based algorithms...
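
To see why collocated key/value access tends to be cheaper than a distributed join, the following plain-Java toy model counts how many nodes must be contacted to read one customer together with its orders. It is only a sketch under stated assumptions: it does not use the Ignite API, the hash-modulo placement in nodeFor is a stand-in for Ignite's real affinity function, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of key-to-node placement; not Ignite's real affinity function. */
class CollocationSketch {

    /** Maps an affinity key to one of nodeCount nodes by hashing (an assumed, simplified scheme). */
    static int nodeFor(Object affinityKey, int nodeCount) {
        return Math.floorMod(affinityKey.hashCode(), nodeCount);
    }

    /** Nodes contacted when each order is placed by its own order ID (not collocated). */
    static Set<Integer> nodesForOrdersById(int customerId, List<Integer> orderIds, int nodeCount) {
        Set<Integer> nodes = new HashSet<>();
        nodes.add(nodeFor(customerId, nodeCount));
        for (int orderId : orderIds) {
            nodes.add(nodeFor(orderId, nodeCount));
        }
        return nodes;
    }

    /** Nodes contacted when every order is placed by its customer's ID (collocated). */
    static Set<Integer> nodesForCollocatedOrders(int customerId, List<Integer> orderIds, int nodeCount) {
        Set<Integer> nodes = new HashSet<>();
        nodes.add(nodeFor(customerId, nodeCount));
        for (int ignored : orderIds) {
            // Each order shares the customer's affinity key, so it lands on the same node.
            nodes.add(nodeFor(customerId, nodeCount));
        }
        return nodes;
    }
}
```

In this model, customer 7 with orders 100 to 103 on a four-node cluster touches all four nodes when orders are placed by order ID, but only one node when the orders share the customer's affinity key. Fewer nodes per request means fewer network round trips, which is the cost a distributed join has to pay.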