Persistence Best Practices for Java Applications

By: Otavio Santana, Karina Varela

Overview of this book

Having a solid software architecture breathes life into tech solutions. In the early stages of an application’s development, critical decisions need to be made, such as whether to go for microservices, a monolithic architecture, the event-driven approach, or containerization. In Java contexts, frameworks and runtimes also need to be defined. But one aspect is often overlooked – the persistence layer – which plays a vital role similar to that of data stores in modern cloud-native solutions. To optimize applications and data stores, a holistic understanding of best practices, technologies, and existing approaches is crucial. This book presents well-established patterns and standards that can be used in Java solutions, with valuable insights into the pros and cons of trending technologies and frameworks used in cloud-native microservices, alongside good Java coding practices. As you progress, you’ll confront the challenges of cloud adoption head-on, particularly those tied to the growing need for cost reduction through stack modernization. Within these pages, you’ll discover application modernization strategies and learn how enterprise data integration patterns and event-driven architectures enable smooth modernization processes with low-to-zero impact on the existing legacy stack.

Exploring the trade-offs of distributed database systems – a look into the CAP theorem and beyond

If we were to describe the perfect Distributed Database System (DDBS), it would certainly be highly scalable, provide perfectly consistent data, and require little management effort (tasks such as backups, migrations, and managing the network). Unfortunately, the CAP theorem, formulated by Eric Brewer, states that this combination is not possible.

Note

To date, there is no database solution that can provide the ideal combination of features such as total data consistency, high availability, and scalability all together.

For details, see Brewer, Eric (2000). Towards Robust Distributed Systems. PODC, 7. DOI: 10.1145/343477.343502 (https://www.researchgate.net/publication/221343719_Towards_robust_distributed_systems).

The CAP theorem is a way of understanding the trade-offs between different properties of a DDBS. Eric Brewer, at the 2000 Symposium on Principles of Distributed Computing (PODC), conjectured that when creating a DDBS, “you can have at most two of these properties for any shared-data system,” referring to the properties consistency, availability, and tolerance to network partitions.

Figure 1.2 – Representation inspired by Eric Brewer’s keynote presentation

Note

For more information on Eric Brewer’s work, refer to Brewer, Eric (2000), Towards Robust Distributed Systems, presentation: https://people.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf.

The three properties named in the CAP theorem can be defined as follows:

Consistency: The guarantee that every node in a distributed cluster returns the same, most recent, successful write.
Availability: Every non-failing node returns a response for all read and write requests in a reasonable amount of time.
Partition tolerance: The system continues to function and uphold its consistency guarantees despite network partitions. In other words, the service keeps running even when some nodes cannot communicate with each other, whether the cause is a network failure, a crash, an upgrade, a power outage, or another factor.

Consequently, the DDBSes we can pick and choose from would only be CA (consistent and highly available), CP (consistent and partition-tolerant), or AP (highly available and partition-tolerant).
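To make the CP versus AP choice concrete, here is a minimal sketch of two replicas holding a single value. It is not taken from the book or from any real database: the `PartitionDemo`, `Replica`, and `Mode` names are hypothetical, and replication is reduced to a direct field update. The only point is what happens to a write that arrives while the replicas cannot reach each other.

```java
/** Hypothetical two-replica register, used only to contrast CP and AP behavior. */
final class PartitionDemo {

    /** Which guarantee the replicas keep when they cannot reach each other. */
    enum Mode { CP, AP }

    /** One replica holding a single value (illustrative, not a real database API). */
    static final class Replica {
        private final Mode mode;
        private String value = "v0";
        private Replica peer;
        private boolean partitioned; // true while the link to the peer is down

        Replica(Mode mode) {
            this.mode = mode;
        }

        static void connect(Replica a, Replica b) {
            a.peer = b;
            b.peer = a;
        }

        static void setPartitioned(Replica a, Replica b, boolean down) {
            a.partitioned = down;
            b.partitioned = down;
        }

        /** Returns true if the write was accepted. */
        boolean write(String newValue) {
            if (partitioned) {
                if (mode == Mode.CP) {
                    return false;   // CP: refuse the write rather than let replicas diverge
                }
                value = newValue;   // AP: accept locally; the peer keeps the old value
                return true;
            }
            value = newValue;
            peer.value = newValue;  // healthy network: replicate before acknowledging
            return true;
        }

        String read() {
            return value;
        }
    }

    /** Writes "v1" on a healthy network, then attempts "v2" during a partition. */
    static String writeDuringPartition(Mode mode) {
        Replica a = new Replica(mode);
        Replica b = new Replica(mode);
        Replica.connect(a, b);
        a.write("v1");
        Replica.setPartitioned(a, b, true);
        boolean accepted = a.write("v2");
        return mode + ": accepted=" + accepted + " a=" + a.read() + " b=" + b.read();
    }

    public static void main(String[] args) {
        System.out.println(writeDuringPartition(Mode.CP)); // CP: accepted=false a=v1 b=v1
        System.out.println(writeDuringPartition(Mode.AP)); // AP: accepted=true a=v2 b=v1
    }
}
```

In this sketch, the CP replicas stay identical but reject the write, whereas the AP replicas stay writable but return different values until the partition heals and they are reconciled.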

Tip

As stressed in the book Fundamentals of Software Architecture: An Engineering Approach, good software architecture requires dealing with trade-offs. This is yet another trade-off to take into consideration (https://www.amazon.com/Fundamentals-Software-Architecture-Engineering-Approach-ebook/dp/B0849MPK73/).

With the CAP theorem in mind, we can make better-informed decisions when choosing between SQL and NoSQL. For example, traditional DBMSes thrive when (mostly) providing the Atomicity, Consistency, Isolation, and Durability (ACID) properties; in distributed systems, however, it may be necessary to give up consistency and isolation in order to achieve higher availability and better performance. This is commonly known as sacrificing consistency for availability.

Two years after the idea of CAP was proposed, Seth Gilbert and Nancy Lynch at MIT published a formal proof of Brewer’s conjecture. Later, another expert on database system architecture and implementation, researching scalable and distributed systems, extended the theorem with the trade-off between consistency and latency.

In 2012, Prof. Daniel Abadi published a study stating that CAP has become “increasingly misunderstood and misapplied, causing significant harm,” leading to Distributed Database Management Systems (DDBMSes) that are unnecessarily limited, because CAP only imposes limitations in the face of certain types of failures, not during normal operations.

Abadi’s paper Consistency Tradeoffs in Modern Distributed Database System Design proposes a new formulation, PACELC, which extends CAP with the trade-off between latency and consistency that exists even when there is no partition. The acronym is spelled out by the following question quoted in the paper, which clarifies the main idea: “If there is a partition (P), how does the system trade off availability and consistency (A and C); else (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)?”

According to Abadi, the trade-off between consistency and latency applies whenever data is replicated, not only during failures. A distributed database can therefore make a different choice in each situation: for example, it can favor availability during a partition and low latency otherwise (PA/EL), or favor consistency in both cases (PC/EC).
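The PACELC question can be read as two independent choices, one for when a partition occurs and one for normal operation. The sketch below models that reading in Java and builds the shorthand commonly used with PACELC (for example, PA/EL). The `Pacelc`, `Profile`, `DuringPartition`, and `Otherwise` names are hypothetical and exist only for this illustration; they are not part of any library or of the paper.

```java
import java.util.Locale;

/** Hypothetical model of a PACELC classification; all type names are illustrative. */
final class Pacelc {

    /** What the system favors while a partition (P) is in effect. */
    enum DuringPartition { AVAILABILITY, CONSISTENCY }

    /** What the system favors otherwise (E), in normal operation. */
    enum Otherwise { LATENCY, CONSISTENCY }

    static final class Profile {
        final DuringPartition duringPartition;
        final Otherwise otherwise;

        Profile(DuringPartition duringPartition, Otherwise otherwise) {
            this.duringPartition = duringPartition;
            this.otherwise = otherwise;
        }

        /** Builds the usual shorthand, for example "PA/EL" or "PC/EC". */
        String label() {
            char p = duringPartition == DuringPartition.AVAILABILITY ? 'A' : 'C';
            char e = otherwise == Otherwise.LATENCY ? 'L' : 'C';
            return "P" + p + "/E" + e;
        }

        /** Answers the PACELC question for the given network state. */
        String favors(boolean partitioned) {
            return partitioned
                    ? duringPartition.name().toLowerCase(Locale.ROOT)
                    : otherwise.name().toLowerCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        Profile p = new Profile(DuringPartition.AVAILABILITY, Otherwise.LATENCY);
        System.out.println(p.label());        // PA/EL
        System.out.println(p.favors(true));   // availability
        System.out.println(p.favors(false));  // latency
    }
}
```

Keeping the two choices as separate fields mirrors the structure of the question: the "else" branch is evaluated even when no partition ever happens, which is the part CAP alone does not describe.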

At this point, the intricacies of building database systems, particularly distributed ones, should be clearer. For professionals tasked with evaluating and selecting DDBSes and designing solutions on top of them, a fundamental understanding of the concepts discussed in these studies is a valuable foundation for informed decision-making.
