Characteristics of Java persistence frameworks
To understand why persistence frameworks exist, let’s look at the differences between the Java language and the many databases available. Java, an Object-Oriented Programming (OOP) language, naturally offers features such as inheritance, encapsulation, and types, which support the creation of well-designed code. Unfortunately, not all of these features are supported by database systems.
As a consequence, when integrating the language and database paradigms, some of their unique advantages might get lost. This complexity becomes clear when we observe that every time data moves between in-memory objects and the database schema, some mapping and conversion is required. It is critical to either define a preferred approach or provide an isolation layer. In Java, the most systematic way to integrate both worlds is through frameworks. Frameworks come in various types and categories, shaped by their communication levels and their API abstraction levels. In Figure 1.1, observe the key aspects of both concepts:
Figure 1.1 – Considerations about the different characteristics of a Java persistence framework
- Communication levels: Define how far the code sits from either the database or the OOP paradigm. The code can be designed to be more similar to one of the two domains. To clarify, take into consideration two common approaches for integrating a Java app with a database – using a database driver directly or relying on the mapper pattern:
- Directly adopting a driver (e.g., a JDBC driver) means working closer to the database domain space. A database driver that is easy to work with is usually data-oriented. A downside is the need for more boilerplate code to map and convert all manipulated data between the database model and the Java domain objects.
- The mapper pattern takes the opposite approach and maps a database structure to Java objects. In mapping frameworks such as Hibernate and Panache, the primary objective is to align more closely with the OOP paradigm rather than with the database. While this reduces boilerplate code, the trade-off is having to coexist with a constant object-relational impedance mismatch and its consequent performance impacts. This topic will be covered in more detail in further chapters.
- API abstraction levels: To abstract some level of translation between Java and the database during data manipulation and other database interactions, developers rely on a given Java API. To clarify the abstraction level of an API, you can ask, for example, “How many different database types does a given database API support?” When using SQL as a standard for relational database integration, developers can use a single API and integrate it with all relational database flavors. There are two types of APIs:
- A specific API (e.g., Morphia or Neo4j-OGM, where OGM stands for Object Graph Mapper) may offer more accurate updates from the vendor, but any solution that relies on it will need to change if you ever want to switch to a different database.
- An agnostic API is more flexible and can be used with many different types of databases, but managing updates or the particular behaviors of each database can be more challenging.
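To make the two communication levels more concrete, the following is a minimal sketch rather than a definitive implementation. It assumes a hypothetical `Book` domain object and uses a `Map<String, Object>` as a stand-in for one row of a JDBC `ResultSet`, so the example runs without a database. The names `DriverStyle` and `BookRepository` are invented for this illustration and do not come from any framework.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object, used only for this illustration.
record Book(long id, String title, BigDecimal price) {}

// Driver-level style: the caller sees raw column values and converts each
// one by hand. The Map stands in for one row of a JDBC ResultSet (an
// assumption made so that the sketch runs without a database).
final class DriverStyle {
    static Book toBook(Map<String, Object> row) {
        long id = ((Number) row.get("ID")).longValue();
        String title = (String) row.get("TITLE");
        BigDecimal price = new BigDecimal(row.get("PRICE").toString());
        return new Book(id, title, price);
    }
}

// Mapper style: callers work with domain objects only, and the
// row-to-object conversion is hidden behind the repository.
final class BookRepository {
    private final List<Map<String, Object>> rows; // stand-in for a table

    BookRepository(List<Map<String, Object>> rows) {
        this.rows = rows;
    }

    Optional<Book> findById(long id) {
        return rows.stream()
                .map(DriverStyle::toBook)
                .filter(book -> book.id() == id)
                .findFirst();
    }
}
```

In the driver-level style, every query result needs a conversion such as `DriverStyle.toBook`, which is where the boilerplate accumulates. In the mapper style, application code calls `findById` and never touches column names; a framework such as Hibernate would generate this kind of conversion instead of it being written by hand.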
Code design – DDD versus data-oriented
In the renowned book Clean Code, the author, Robert C. Martin (known as Uncle Bob), states that OOP languages have the benefit of hiding data in order to expose behavior. In the same line of thought, we see Domain-Driven Design (DDD), which proposes the usage of a ubiquitous language throughout the domain’s code and related communication. Such a proposal can be implemented with OOP concepts. By contrast, in the book Data-Oriented Programming, Yehonathan Sharvit suggests reducing complexity by giving relevance to data and treating it as a “first-class citizen.”
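The contrast between the two design styles can be sketched as follows. This is an illustrative example under stated assumptions, not code from either book: the `Account`, `AccountData`, and `AccountOperations` names and the withdrawal rule are hypothetical. The first class hides its state and exposes only behavior, while the second half keeps data in an immutable record and places behavior in separate functions.

```java
import java.math.BigDecimal;

// OOP / DDD style: the balance is hidden, and only behavior named in the
// domain language is exposed.
final class Account {
    private BigDecimal balance;

    Account(BigDecimal openingBalance) {
        this.balance = openingBalance;
    }

    void withdraw(BigDecimal amount) {
        if (amount.compareTo(balance) > 0) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        balance = balance.subtract(amount);
    }

    boolean canAfford(BigDecimal amount) {
        return balance.compareTo(amount) >= 0;
    }
}

// Data-oriented style: the data is exposed as an immutable record.
record AccountData(String owner, BigDecimal balance) {}

// Behavior lives apart from the data and returns new values instead of
// mutating existing ones.
final class AccountOperations {
    static AccountData withdraw(AccountData account, BigDecimal amount) {
        if (amount.compareTo(account.balance()) > 0) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        return new AccountData(account.owner(), account.balance().subtract(amount));
    }
}
```

With `Account`, callers cannot read the balance directly and must go through `withdraw` and `canAfford`. With `AccountData`, the balance is plainly visible and a withdrawal produces a new record, leaving the original untouched.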
Luckily, there are several frameworks to assist us in the challenges of delivering performant persistence layers. Although we understand that more options bring back the paradox of choice, there’s no need to worry – this book is a helpful resource that software engineers can use to learn how to evaluate multiple perspectives within software architecture, especially the details within the data storage integration and data manipulation space.
So far, we have explored the diverse methods that we humans have devised to address a fundamental issue: efficiently storing data in a manner that ensures longevity and serves as a knowledge base to support our evolution. As technology has advanced, multiple persistence strategies have been made available to software architects and developers, including relational and unstructured approaches such as NoSQL. The variety of persistence options has resulted in new challenges in software design; after all, retrieving, storing, and making data available also went through innovation at the application layer. Persistence frameworks, since then and still today, provide architects with different strategies, enabling designs where development is closely associated with the underlying database technology or is more dynamic and agnostic.
Our next stop on this database historical journey is the cloud era. Let’s explore how cloud offerings have impacted applications and the ways and locations where data can now be stored.