Service Oriented Architecture: An Integration Blueprint

Overview of this book

Service Oriented Architecture (SOA) refers to building systems that offer applications as a set of independent services that communicate and interoperate with each other effectively. Such applications may originate from different vendor, platform, and programming language backgrounds, making successful integration a challenging task. This book enables you to integrate application systems effectively, using the Trivadis Integration Architecture Blueprint, which is supported by real-world scenarios in which this Integration Blueprint has proved a success.

This book will enable you to grasp all of the intricacies of the Trivadis Architecture Blueprint, including detailed descriptions of each layer and component. It is a detailed theoretical guide that shows you how to implement your own integration architectures in practice, using the Trivadis Integration Architecture Blueprint. The main focus is on explaining and visualizing the blueprint, including comprehensive descriptions of all of its layers and components. It also covers the more basic features of integration concepts for less experienced specialists, as well as shedding light on the future of integration technologies, such as XTP and Grid Computing. You will learn about EII and EAI, OSGi, as well as base technologies related to the implementation of solutions based on the Blueprint, such as JCA, JBI, SCA, and SDO.

The book begins by covering fundamental integration for those less familiar with the concepts and terminology, and then dives deep into explaining the different architecture variants and the future of integration technologies. Base technologies like JCA and SCA will be explored along the way, and the structure of the Trivadis Integration Architecture Blueprint will be described in detail, as will the intricacies of each component and layer.

Other content includes discovering and comparing traditional and modern SOA driven integration solutions, implementing transaction strategies and process modeling, and getting to grips with EDA developments in SOA. Finally, the book considers how to map software from vendors like Oracle and IBM to the blueprint in order to compare the solutions, and ultimately integrate your own projects successfully.

Patterns for data integration


Data integration is implemented using three fundamental patterns:

  • Federation

  • Population

  • Synchronization

Federation

The federation pattern is a simple data integration pattern that provides access to different data sources, and gives the calling application the impression that these sources are a single, logical data source. This is achieved as follows:

  1. Expose a single consistent interface to the application.

  2. Translate the interface to whatever interface is needed for the underlying data.

  3. Compensate for any differences in function between the different data sources.

  4. Allow data from different sources to be combined into a single result set that is returned to the user.

This is illustrated in the following diagram:

The federation pattern as shown in this diagram can be broken down into the following logical building blocks:

  • The calling applications have the need for information, but they don't possess the information.

  • The federation building block uses metadata to determine where the required data is stored, and in what format. The metadata repository allows a single query executed against the federation building block to be decomposed into individual requests to different data sources. To the user (the calling application), the information model appears to be a single virtual repository. The data is accessed via suitable adapters for each target repository. The federation component returns a single result to the calling application, integrating the different source formats into a shared federated schema.

  • The source applications have the information that is important for the calling applications.

The federation pattern supports structured and unstructured data, together with read-only and read/write accesses to the underlying data sources. Read/write accesses should be limited, wherever possible, to a single data source, as otherwise a two-phase commit is needed, which can be difficult in distributed databases.
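
The building blocks described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the Blueprint: the `METADATA` mapping, the two in-memory SQLite databases standing in for the source applications, and the names `make_demo_sources` and `federated_query` are all hypothetical. The sketch shows a single logical query being decomposed into one request per data source, with the results combined into one read-only result.

```python
import sqlite3

# Hypothetical metadata repository: maps each logical field of the federated
# schema to the source, table, and physical column that hold it.
METADATA = {
    "name": ("crm", "customers", "full_name"),
    "balance": ("billing", "accounts", "amount_due"),
}


def make_demo_sources():
    """Create two independent in-memory databases that act as source applications."""
    crm = sqlite3.connect(":memory:")
    crm.execute("CREATE TABLE customers (customer_id INTEGER, full_name TEXT)")
    crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

    billing = sqlite3.connect(":memory:")
    billing.execute("CREATE TABLE accounts (customer_id INTEGER, amount_due REAL)")
    billing.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 12.5), (2, 0.0)])
    return {"crm": crm, "billing": billing}


def federated_query(sources, fields, customer_id):
    """Decompose one logical query into per-source requests and merge the results."""
    result = {"customer_id": customer_id}
    for field in fields:
        source, table, column = METADATA[field]
        # Table and column names come from the trusted metadata mapping,
        # never from the caller; only the key value is a bound parameter.
        row = sources[source].execute(
            f"SELECT {column} FROM {table} WHERE customer_id = ?", (customer_id,)
        ).fetchone()
        result[field] = row[0] if row else None
    return result
```

A caller would use `federated_query(make_demo_sources(), ["name", "balance"], 1)` and receive one dictionary, without knowing that the two fields live in different databases. The sketch is deliberately read-only, which sidesteps the two-phase commit issue mentioned above.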

Uses

Federation is appropriate in the following situations:

  • The data needed by an application is distributed across different databases (for historic, technical, or organizational reasons)

  • Federation is more effective than other data integration technologies, when:

    • Near real-time access is needed for rapidly changing data

    • Making a consolidated copy of the data is not possible for technical, legal, or other reasons

    • Read/write access must be possible

    • Reducing or limiting the number of copies of the data is a goal

  • It is possible to continue to make use of existing investments

Population

The population pattern has a very simple model. It gathers data from one or more data sources, processes the data in an appropriate way, and applies it to a target database. In its simplest form, the population pattern is based on the "read dataset, process data, write dataset" model. This corresponds to the classic ETL (Extract, Transform, and Load) process.

This is illustrated in the following diagram:

The population pattern can be broken down into the following logical components:

  • The target applications have a need for information, which they do not possess. Therefore, a copy from another data source in a source application is required.

  • The population component reads one or more data sources in the source application, and writes the data to a data source in the target application. The rules for extracting data from the source application can be as complex as necessary. They range from simple rules, such as read all data, to more complex rules where only specific fields in specific records can be read under certain conditions. The loading rules for the target database can vary from a simple overwrite of the data, to a more complex process of inserting new records and updating existing ones. The metadata is used to describe these rules.

  • The source applications have the important information needed by the target applications.
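
The extract, transform, and load steps of the population component can be sketched as follows. This is a minimal illustration, assuming two in-memory SQLite databases and hypothetical table names (`orders`, `order_summary`); the function names `make_demo_databases` and `populate` are likewise invented for the example. The extraction rule reads only specific fields of records that meet a condition, and the loading rule inserts new records and updates existing ones.

```python
import sqlite3


def make_demo_databases():
    """Create a hypothetical source database and a target that already holds one stale row."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, total_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, " alice ", 1250), (2, "bob", 300), (3, "carol", 9900)],
    )

    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE order_summary (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    target.execute("INSERT INTO order_summary VALUES (1, 'ALICE', 1.0)")
    return source, target


def populate(source, target, min_total_cents=0):
    """Run one gather, process, apply cycle and return the number of rows applied."""
    # Extract: read only specific fields of the records that satisfy the rule.
    rows = source.execute(
        "SELECT order_id, customer, total_cents FROM orders WHERE total_cents >= ?",
        (min_total_cents,),
    ).fetchall()

    # Transform: normalize customer names and convert cents to a decimal amount.
    transformed = [(oid, cust.strip().upper(), cents / 100.0) for oid, cust, cents in rows]

    # Load: insert new records and overwrite existing ones with the same key.
    target.executemany(
        "INSERT OR REPLACE INTO order_summary (order_id, customer, total) VALUES (?, ?, ?)",
        transformed,
    )
    target.commit()
    return len(transformed)
```

In a real population component the extraction and loading rules would be read from metadata rather than hard-coded in the function, as the description above states.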

Uses

Population is used for the following purposes:

  • A specialized copy of existing data (derived data) is needed:

    • Subsets of existing data sources

    • A modified version of an existing data source

    • Combinations of existing data sources

  • Only read access to the derived data in the target application is possible (or only a few write accesses). If a significant number of write accesses is expected, the two-way synchronization pattern should be used instead.

  • The user must be provided with quick access to the information required, instead of being bombarded with too much irrelevant, incorrect, or otherwise useless information.

  • However, IT drivers often dictate the use of the population pattern. In other words, the copies of data are made for technical reasons. These drivers include:

    • Improved performance of user access

    • Load distribution across systems

Synchronization

The synchronization pattern (also known as the replication pattern) enables bidirectional update flows of data in a multi-copy database environment. The "two-way" synchronization aspect of this pattern is what distinguishes it from the "one-way" capabilities provided by the population pattern.

This is illustrated in the following diagram:

The synchronization pattern shown in this diagram can be broken down into the following logical components:

  • The target applications have a need for information, which they do not possess. Therefore, a copy from another data source in a source application is required.

  • At a simplistic level, the synchronization component can be compared to the population pattern, with the only difference being that the data flows in both directions. If the data elements flowing in both directions are fully independent, then two-way synchronization is no more than two separate instances of the population pattern. However, it is more common to find some overlap between the datasets flowing in either direction. In this case, conflict detection and resolution are needed.

  • The source applications have information which is relevant to the target applications.
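
Conflict detection for overlapping datasets can be sketched as a three-way comparison. This is one possible approach under stated assumptions, not the Blueprint's prescribed mechanism: each copy is modeled as a plain dictionary, `base` is a hypothetical snapshot taken at the last successful synchronization, and the function names are invented for the example. A key counts as a conflict when both copies changed it since the snapshot and ended up with different values.

```python
def changed_keys(base, replica):
    """Return the keys whose value differs between the snapshot and the replica."""
    return {k for k in base.keys() | replica.keys() if base.get(k) != replica.get(k)}


def _apply_change(merged, key, replica):
    """Copy one change (update, insert, or delete) from a replica into the merged state."""
    if key in replica:
        merged[key] = replica[key]
    else:
        merged.pop(key, None)


def synchronize(base, copy_a, copy_b):
    """Merge independent changes from both copies and report conflicting keys."""
    changed_a = changed_keys(base, copy_a)
    changed_b = changed_keys(base, copy_b)

    # Conflict: both sides changed the same key and disagree on the new value.
    conflicts = {k for k in changed_a & changed_b if copy_a.get(k) != copy_b.get(k)}

    merged = dict(base)
    for key in changed_a - conflicts:
        _apply_change(merged, key, copy_a)
    for key in changed_b - conflicts:
        _apply_change(merged, key, copy_b)
    return merged, conflicts
```

Conflicting keys keep their snapshot value in the merged result and are returned separately, so that a resolution policy (manual review, a designated master copy, or a timestamp rule) can be applied afterwards. When `changed_a` and `changed_b` never overlap, the function degenerates into two independent one-way copies, which matches the observation above about fully independent data elements.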

Uses

Synchronization is used for the following purpose:

  • A specialized copy of existing data (derived data) is needed. This copy can take different forms:

    • Subsets of existing data sources

    • A modified version of an existing data source

    • Combinations of existing data sources

Multi-step synchronization

There is one variant of the synchronization pattern: the multi-step variant. The multi-step variant of the two-way synchronization pattern makes use of one instance of the population pattern, with its gather, process, and apply functions, for each of the two synchronization directions. An additional "reconcile" function is placed between the two data flows, and guarantees that there are no conflicts in the updates. If the opportunities for conflicts are minimal, this pattern can be constructed from existing population components. However, a specialized solution should be used for more complex situations.

The following diagram illustrates the "reuse" of the population pattern, once for each direction with the additional "reconcile" component in the middle.
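
The composition described in this section can be sketched by running the gather and process functions once per direction and placing a reconcile step between the two flows before anything is applied. This is a minimal illustration with hypothetical names throughout; in particular, the `prefer` parameter is an assumed resolution policy (one copy wins on conflict) that the source does not prescribe, and deletions are not handled.

```python
def gather(source, last_synced):
    """Gather the entries that were added or changed since the last synchronization."""
    return {k: v for k, v in source.items() if last_synced.get(k) != v}


def process(changes):
    """Process the gathered data; here this only trims surrounding whitespace."""
    return {k: v.strip() for k, v in changes.items()}


def reconcile(changes_from_a, changes_from_b, prefer="a"):
    """Drop the losing side of every conflicting update (assumed policy: one copy wins)."""
    for key in changes_from_a.keys() & changes_from_b.keys():
        if changes_from_a[key] != changes_from_b[key]:
            loser = changes_from_b if prefer == "a" else changes_from_a
            del loser[key]
    return changes_from_a, changes_from_b


def apply_changes(target, changes):
    """Apply the reconciled changes to the target copy."""
    target.update(changes)


def two_way_sync(copy_a, copy_b, last_synced, prefer="a"):
    """Run one population flow per direction with a reconcile step in between."""
    to_b = process(gather(copy_a, last_synced))
    to_a = process(gather(copy_b, last_synced))
    to_b, to_a = reconcile(to_b, to_a, prefer)
    apply_changes(copy_b, to_b)
    apply_changes(copy_a, to_a)
    return copy_a, copy_b
```

After `two_way_sync` returns, both copies hold the same values for every key that either side changed. A fixed "one copy wins" rule is only reasonable when conflicts are rare; as noted above, more complex situations call for a specialized solution.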