Oracle 11g Streams Implementer's Guide
Overview of this book

From small businesses through to huge enterprises, users all over the world often require access to data 24 hours a day. Distributed database systems proliferate in the world of data sharing, providing the ability to access real-time data anywhere, anytime. Oracle Streams, a built-in feature of the Oracle database, is a data replication and integration feature critical to the success and wellbeing of enterprises in today's fast-moving economy.

This book provides the reader with solid techniques to master Oracle Streams technology and successfully deploy distributed database systems. It quickly covers the basics and gets you up and running with a simple Oracle 11g Streams environment, and it serves as an excellent companion to the Oracle Streams Administration Guide. It is intended for Oracle database architects and administrators, and provides in-depth discussion of must-know information for the design, implementation, and maintenance of an Oracle Streams environment. The book does not attempt to regurgitate all the information in the Oracle Streams Administration Guides; rather, it provides additional clarification and explanation of design, implementation, and troubleshooting concepts that are often elusive in the Streams documentation. It also identifies helpful tools and Oracle resources to add to your knowledge base, as well as tried and tested tricks and tips to help you tame Oracle Streams.

The book starts by introducing and explaining the components of Oracle Streams and how they work together. It then moves on logically, helping you to determine your distributed environment requirements and design your Streams implementation to meet those requirements. Once these concepts are discussed, the book moves to configuration and basic implementation examples to help solidify those concepts. It then addresses advanced features such as tags, downstream capture, and conflict resolution.
You then move on to maintenance techniques such as documenting the environment, effectively planning and implementing changes to the environment, and monitoring and troubleshooting the environment. When you have studied the techniques and completed the hands-on examples, you will have an understanding of Oracle Streams' core concepts and functionality that will allow you to successfully design, implement, and maintain an Oracle Streams environment.

Streams architecture overview


Let's take a moment to briefly run through the replication topologies and process flow, and identify the Oracle functional components that are used by Streams.

Topology configurations

Distributed topology configurations are about as limited as the rocks in a river: which is to say, hardly limited at all. However, many are not conducive to an efficient and maintainable system. The number one consideration when choosing a distributed topology is data ownership and conflict resolution requirements, as discussed in Chapter 2, Plot Your Course: Design Considerations. To keep your Streams from becoming wild, untamed rivers and drowning the DBAs, keep master sites to a minimum if at all possible, and keep data flows tightly controlled and synchronized. Oracle recommends no more than four masters involved in full-on, all-site N-way replication, and the authors second that recommendation with gusto.

In the Preface, we briefly described single-source, multiple-source, hybrid, and heterogeneous configurations.

The following images provide a synopsis of succinct, controlled single-source and multiple-source configuration examples. Of course, these are not the only viable configurations, but they will help you to start developing a feel for how to control your Streams.

Keep in mind that the relationship between source and target (secondary) databases assumes they share the same data at some level. Two databases that handle totally different data would not be considered a source or secondary site to each other.

Single source

In a single-source configuration, there is only one database that is the source of data changes being Streamed to the other database site(s). At the other site(s)/target(s), the data can be applied and/or forwarded to another database. If data changes are forwarded from a destination database, the configuration is called a directed network.

There are two types of directed network forwarding configurations: Queue forwarding and Apply forwarding. Queue forwarding involves propagating the change to a target database site but not applying the change at that database. Rather, the destination only forwards the change to a subsequent site to be applied down the line. Apply forwarding will apply the change, and forward it to subsequent destinations via local Capture and Propagation. Destination sites configured as Queue or Apply forwarding sites are often referred to as intermediate sites.

Another single-source configuration consists of a copy of the actual redo logs being forwarded from the source database to a "downstream" target database. The actual Capture process and queue are configured on the downstream database rather than on the source database. This configuration is called Downstream Capture, which is explained in more detail later in this chapter and in Chapter 6, Get Fancy with Streams Advanced Configurations.

In a single-source environment, steps should be taken to prevent changes from being made at the secondary destination databases, in order to avoid data divergence and change conflicts.

Some illustrated examples of single-source configurations, with a brief explanation of where the Streams processes are located, are shown as follows.

Single database

A single database configuration hosts both the Capture and Apply processes on the same database. This can negate the need for a Propagation process, as the Apply process can be created to dequeue from the same buffered queue into which the Capture process enqueues. However, there may be circumstances where you wish to configure a separate buffered capture queue and a persistent apply queue. In this case, you would also configure a Propagation process between the two queues. The Apply process can be assigned an apply handler that processes the LCRs in a specific manner. This type of configuration can be used to support client application access to captured LCR data and heterogeneous configurations.
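A single database setup like this can be sketched with the DBMS_STREAMS_ADM package. The administrator user, queue names, and the hr schema below are hypothetical placeholders, not names mandated by Oracle:

```sql
-- Create one queue shared by Capture and Apply (hypothetical names).
BEGIN
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
    queue_table => 'strm_admin.streams_qt',
    queue_name  => 'strm_admin.streams_queue');
END;
/
-- Capture and Apply are pointed at the same queue,
-- so no Propagation process is needed between them.
BEGIN
  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name  => 'hr',
    streams_type => 'capture',
    streams_name => 'capture_hr',
    queue_name   => 'strm_admin.streams_queue',
    include_dml  => TRUE,
    include_ddl  => FALSE);

  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name  => 'hr',
    streams_type => 'apply',
    streams_name => 'apply_hr',
    queue_name   => 'strm_admin.streams_queue',
    include_dml  => TRUE,
    include_ddl  => FALSE);
END;
/
```

If you chose the two-queue variant described above, you would create a second queue and add a Propagation process between them instead of sharing one queue.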

Uni-directional

In a single-source to single-target configuration, the Capture and Propagation processes and the Capture queue are located at the source database. The Apply process and queue reside on the target.
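A minimal uni-directional sketch follows, again using the DBMS_STREAMS_ADM package. The database names (source_db, target_db), database link, queues, and the hr schema are hypothetical, and the queues are assumed to already exist at each site:

```sql
-- On the SOURCE: schema-level capture rules plus a propagation
-- from the local capture queue to the target's apply queue.
BEGIN
  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name  => 'hr',
    streams_type => 'capture',
    streams_name => 'capture_hr',
    queue_name   => 'strm_admin.capture_queue');

  DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
    schema_name            => 'hr',
    streams_name           => 'prop_hr',
    source_queue_name      => 'strm_admin.capture_queue',
    destination_queue_name => 'strm_admin.apply_queue@target_db',
    source_database        => 'source_db',
    queue_to_queue         => TRUE);
END;
/
-- On the TARGET: an apply process that dequeues from the local queue.
BEGIN
  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name     => 'hr',
    streams_type    => 'apply',
    streams_name    => 'apply_hr',
    queue_name      => 'strm_admin.apply_queue',
    source_database => 'source_db');
END;
/
```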

Cascaded/directed network

In a directed network configuration, Capture and Propagation processes and Capture queue reside on the Source. For Queue forwarding, the forwarding site has a queue, but no Apply process. For Apply forwarding, the forwarding site is configured with an Apply process and queue as well as a local Capture process and queue. Tags (discussed in Chapter 6) are used to coordinate the local "recapture" of the applied changes. Appropriate Propagation processes are configured from the forwarding site Capture queue to the destination sites. The final destination site is configured with a regular Apply process and queue.

Hub-and-Spoke

In a single-source Hub-and-Spoke configuration, data is Streamed from one source to multiple destinations (targets). This is often seen in a "Headquarters-to-Branch (or Regional)" configuration. With this type of configuration, there is a Capture process and queue at the source, as well as a Propagation process to each destination. An Apply process and queue are configured on each of the destinations.
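The "one Capture, one Propagation per destination" arrangement can be sketched as follows. All site, queue, and schema names (hq_db, branch1_db, branch2_db, hr) are hypothetical, and the branch sites would each be configured with an Apply process as in the uni-directional case:

```sql
-- On the HUB (source): one capture process feeding one shared queue,
-- and one propagation process per SPOKE destination.
BEGIN
  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name  => 'hr',
    streams_type => 'capture',
    streams_name => 'capture_hq',
    queue_name   => 'strm_admin.hq_capture_queue');

  DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
    schema_name            => 'hr',
    streams_name           => 'prop_hq_to_branch1',
    source_queue_name      => 'strm_admin.hq_capture_queue',
    destination_queue_name => 'strm_admin.apply_queue@branch1_db',
    source_database        => 'hq_db');

  DBMS_STREAMS_ADM.ADD_SCHEMA_PROPAGATION_RULES(
    schema_name            => 'hr',
    streams_name           => 'prop_hq_to_branch2',
    source_queue_name      => 'strm_admin.hq_capture_queue',
    destination_queue_name => 'strm_admin.apply_queue@branch2_db',
    source_database        => 'hq_db');
END;
/
```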

Multiple source

In a multiple source Streams configuration, shared data can be changed at two or more sites. A situation can arise where there is conflict caused by DML/DDL originating from one or more databases acting on the exact same data at nearly the same time. To overcome this conflict situation, conflict resolution must be implemented to determine which data changes should be kept and which should be discarded.
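Conflict resolution of this kind is configured at the Apply site with the DBMS_APPLY_ADM package. The sketch below assumes a hypothetical hr.employees table that carries a last_update timestamp column; the 'MAXIMUM' method keeps whichever row version has the greater value in the resolution column:

```sql
-- Hypothetical: resolve update conflicts on hr.employees by keeping
-- the change with the most recent last_update value.
DECLARE
  cols DBMS_UTILITY.NAME_ARRAY;
BEGIN
  cols(1) := 'salary';
  cols(2) := 'last_update';

  DBMS_APPLY_ADM.SET_UPDATE_CONFLICT_HANDLER(
    object_name       => 'hr.employees',
    method_name       => 'MAXIMUM',
    resolution_column => 'last_update',
    column_list       => cols);
END;
/
```

Prebuilt methods ('MAXIMUM', 'MINIMUM', 'OVERWRITE', 'DISCARD') cover common cases; custom PL/SQL handlers can be written when these do not fit the data ownership rules.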

Below are some illustrated examples of multiple-source configurations.

Bi-directional/N-way/Master-to-Master

Bi-directional, N-way, Master-to-Master are all names for essentially the same thing. This configuration allows data changes to be made at all master sites and replicated to all other master sites. As such, Capture, Propagation, and Apply processes and queues must be configured at each master site. For the Capture processes at each site, you can choose to configure a single Capture process and queue for all Propagation processes, or a separate Capture process and queue for each Propagation process.

Uni-directional Spokes-to-Hub

In this configuration, the SPOKES are the origination point of the data changes and the HUB is the destination. Capture and Propagation processes and a Capture queue are configured on each of the SPOKE sites. An Apply process and queue for each SPOKE site are configured on the HUB. Conflict resolution should be configured at the HUB to accommodate potential data change conflicts from multiple sources.

Bi-directional Spoke-to-Hub

This configuration is just an extension of uni-directional Spoke-to-Hub that allows the HUB to send its changes to each spoke. This means that at least one Capture process and queue must be configured on the HUB, and a Propagation process configured to each SPOKE. Note here that the HUB processes should be configured so that the HUB does not send the same change back to the SPOKE that originated it. This can be accomplished in a number of ways by using tags, and/or conditions (covered in Chapter 6). In addition, an Apply process and queue must be configured at each SPOKE to receive and process changes from the HUB, as well as the Capture process and queue and Propagation process to the HUB that we use in uni-directional Spoke-to-Hub.
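One way to keep the HUB from recapturing what it has just applied is with tags, as the chapter notes. The sketch below is a deliberately simplified two-site illustration with hypothetical names; with multiple spokes you would need per-spoke tags and rule conditions (Chapter 6) so changes still fan out to the other spokes:

```sql
-- On the HUB: tag every change the apply process writes, and create the
-- HUB capture rules so that tagged LCRs are NOT recaptured. This stops
-- an applied change from cycling back to the spoke that originated it.
BEGIN
  DBMS_APPLY_ADM.ALTER_APPLY(
    apply_name => 'apply_from_spoke1',
    apply_tag  => HEXTORAW('11'));     -- hypothetical tag value

  DBMS_STREAMS_ADM.ADD_SCHEMA_RULES(
    schema_name        => 'hr',
    streams_type       => 'capture',
    streams_name       => 'capture_hub',
    queue_name         => 'strm_admin.hub_capture_queue',
    include_tagged_lcr => FALSE);      -- ignore changes written by apply
END;
/
```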

Hybrid

A Hybrid configuration is simply a combination of single- and/or multiple-source configurations. For instance, one leg of your topology could be a directed network, while another leg could be master-to-master. The trick is all in where you put your Capture, Propagation, and Apply processes.

Heterogeneous

Heterogeneous configurations include a source or target database that is not an Oracle database. Oracle Heterogeneous Gateways can be used to allow an Oracle database to send and receive changes to and from these non-Oracle sources. The gateways provide a "translation" level that converts Oracle SQL syntax to and from non-Oracle SQL syntax. This allows the users to transparently accomplish equivalent SQL operations from an Oracle database to a non-Oracle database. Oracle 11gR2 delivers the following Gateways to non-Oracle database platforms:

  • Adabas

  • APPC

  • DRDA

  • IMS

  • Informix

  • ODBC

  • SQL Server

  • Sybase

  • Teradata

  • VSAM

In a heterogeneous environment, the Apply process and queue must still exist in an Oracle database, and must be configured to use a database link to the non-Oracle database. The source database may or may not be an Oracle database. It could be that the Oracle database is merely an intermediate site in a directed network, where a client application enqueues LCRs, constructed from data at a non-Oracle database, into the Apply queue.

Regardless of how the LCR is enqueued in the Apply queue, the Apply process on the Oracle database uses Heterogeneous Services and the Transparent Gateway to apply LCR changes directly to database objects in the non-Oracle database through the database link. In other words, the Apply process unpacks the LCR and constructs the necessary DML statement on the Oracle side. It then executes the statement through the gateway database link, using Heterogeneous Services to translate the SQL to the proper non-Oracle SQL syntax.

Capture and Propagation are configured at the source database. If the Apply process for the non-Oracle database is configured on the source database, Propagation between the Capture and Apply would not be required. A remote Oracle destination database can also be configured to apply the change to a non-Oracle database.
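Such an Apply process is created with an explicit database link to the non-Oracle site. In the sketch below, the apply name, queue, and the sqlserver_link database link (assumed to point at a gateway to a SQL Server instance) are all hypothetical:

```sql
-- Hypothetical: an apply process on the Oracle side that pushes
-- captured changes through a Heterogeneous Services database link.
BEGIN
  DBMS_APPLY_ADM.CREATE_APPLY(
    queue_name          => 'strm_admin.apply_queue',
    apply_name          => 'apply_to_sqlserver',
    apply_database_link => 'sqlserver_link',
    apply_captured      => TRUE);
END;
/
```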

As mentioned above, data can also be sent to an Oracle database from a non-Oracle source. This can be done with Advanced Queuing and XStream or JMS. Again, the Apply queue and process are on the Oracle database. The non-Oracle database interface must be configured to enqueue a message to the Apply queue on the Oracle database. That message is then dequeued and processed by an Oracle Advanced Queuing process.

Important: Apply processes for a non-Oracle database can only apply DML, not DDL.

Note

For more information on Streams Configurations, please reference the Oracle Streams Extended Examples manual.

Simultaneous versus Synchronous replication

Synchronous replication in a distributed environment means that a change must be successfully committed at the source and all destination sites, or it is not committed at any site, including the source site.

As mentioned in the Preface, Streams is, by nature, asynchronous replication. The simple fact that a change must be committed at the source site before it is even sent to other sites means that Streams is not a synchronous method of replication.

Even if you use Synchronous Capture, it is still not synchronous replication. The "synchronous" in Synchronous Capture refers to the enqueueing of the change to the Capture queue when it is committed to the source data, rather than having LogMiner mine the redo logs, find the change, and enqueue it. This does not mean that the change is successfully committed to the intended destination database data.

Can Streams be simultaneous (or near-simultaneous, depending on data transfer and processing rates)? Yes. Synchronous Capture, and the combined Capture and Apply (new in 11g and discussed later in this chapter), support simultaneous replication (though they cannot be used together). They reduce the mining, enqueueing, and dequeueing work required by normal implicit Streams. Just remember: we may be able to get the change to the other sites very quickly, but we cannot guarantee 100 percent that the change will be committed at the destination.
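Synchronous Capture is created simply by choosing the 'sync_capture' Streams type when adding rules; it supports only table-level (and subset) DML rules. The table, queue, and process names below are hypothetical:

```sql
-- Hypothetical: Synchronous Capture enqueues changes to hr.employees
-- as part of the transaction, instead of mining the redo logs.
BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name   => 'hr.employees',
    streams_type => 'sync_capture',
    streams_name => 'sync_capture_hr',
    queue_name   => 'strm_admin.streams_queue');
END;
/
```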

The moral of the story is: Streams replication, as it is today, can be "simultaneous", but it can never be "synchronous".

Oracle's Streams replication process flow

  • A change is captured from the database redo stream via LogMiner, or by the Synchronous Capture mechanism

  • Any defined capture rules/transformations are applied to the change

  • The Captured Change is molded into a Logical Change Record (LCR)

  • The LCR is stored as a message in a specialized advanced queue to be sent to the target site

  • The propagation job for the target site consumes the message, applies any defined propagation rules/transformations to the message, and sends it to a specialized advanced queue at the target site

  • Oracle's advanced queuing guaranteed, fail-safe Propagation protocol ensures receipt of the message and coordinates the success/error result and handling of the advanced queue messaging

  • The Apply process at the target site consumes the message from the advanced queue

  • Any defined Apply rules/transformations are applied to the change

  • The Apply process then attempts to apply the change to the target site

  • All LCR transactions are validated at the target database by conflict detection to ensure the data is consistent between the source and target databases prior to applying the change

  • When data inconsistencies are found by conflict detection:

    • If conflict resolution is defined, it is applied to the LCR data inconsistency

    • If conflict resolution is not defined, or fails to resolve the LCR data inconsistency, the LCR is not applied at the target but retained in the erred transaction queue for manual resolution
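The "erred transaction queue for manual resolution" in the last step can be inspected and retried from SQL*Plus. The transaction ID below is a hypothetical placeholder; you would take real values from the first query:

```sql
-- List transactions that failed to apply and were retained for
-- manual resolution at the target database.
SELECT apply_name, local_transaction_id, error_message
FROM   dba_apply_error;

-- After fixing the underlying data conflict, retry one transaction.
BEGIN
  DBMS_APPLY_ADM.EXECUTE_ERROR(
    local_transaction_id => '5.4.312');  -- hypothetical transaction id
END;
/
```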

Streams components

The following Oracle components are used to support the Streams process flow:

  • LogMiner: Captures the changes at the originating site.

  • Advanced Queuing: Used to support transporting changes between sites.

  • Capture, Propagate, Apply database processes: Persistent database processes that accomplish the Capture, Propagation, and Apply tasks.

  • Capture, Propagate, Apply rules/transformation via PL/SQL: PL/SQL blocks that define how data should be manipulated by the various processes.

  • Logical change record types: Specialized record types used by Streams to store and manage database change message payloads.

  • Database links/Oracle Net: Provides an operating system independent connectivity between database sites.

  • User authentication/authorization: Provides security access at the database connection and object levels.

  • Guaranteed fail-safe propagation protocol: This ensures that a message is successfully delivered and enqueued at the destination site. If an error occurs, the propagation schedule is marked with an error at the originating site for manual resolution and the message is retained in the Capture queue until it can be propagated.

  • Conflict detection: Internal protocol that determines if the record to which the change is to be applied matches the record at the originating site before the change was made. This supports data change synchronization.

  • Conflict resolution via PL/SQL: Supplied or user defined PL/SQL blocks used to resolve data conflicts found via conflict detection.
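A quick health check of these components can be run from the data dictionary and dynamic performance views, one query per process type:

```sql
-- Capture: current state and throughput of each capture process.
SELECT capture_name, state, total_messages_captured
FROM   v$streams_capture;

-- Propagation: status and any error recorded at the originating site.
SELECT propagation_name, status, error_message
FROM   dba_propagation;

-- Apply: status and any error recorded at the destination site.
SELECT apply_name, status, error_message
FROM   dba_apply;
```

Monitoring views such as these are covered in depth in the later maintenance and troubleshooting chapters.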