The Oracle GoldenGate topology is a representation of the databases in a GoldenGate environment, the GoldenGate components configured on each server, and the flow of data between these components.
The data flowing through each trail is read, written, validated, and checkpointed at every stage. GoldenGate is written in C and runs as native code on the operating system, so it is very fast: sending, receiving, and validating data has very little impact on overall machine performance. If performance does become an issue because of the sheer volume of data being replicated, consider configuring parallel Extract and/or Replicat processes.
The following sections describe the process topology: first the rules that you must follow when implementing GoldenGate, then the order in which the processes must execute for end-to-end data replication.
When using parallel Extract and/or Replicat processes, keep related DDL and DML together in the same process group to preserve data integrity. The topology rules for configuring the processes are as follows:
- All objects that are related to a given object are processed by the same group as the parent object.
- All DDL and DML for any given database object are processed by the same Extract group and the same Replicat group.
- If a referential constraint exists between tables, the child table with the foreign key must be included in the same Extract and Replicat group as the parent table that holds the primary key.
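The grouping rules above can be sketched in code. The following Python example is only an illustration, not part of GoldenGate: the function name `group_related_tables`, the table names, and the foreign-key list are all hypothetical. It assumes you have already extracted the list of (child, parent) foreign-key pairs from the data dictionary, and it uses a union-find structure to collect tables that are linked, directly or transitively, by referential constraints. Each resulting set is a candidate for a single Extract and Replicat group.

```python
from collections import defaultdict


def group_related_tables(tables, foreign_keys):
    """Return sorted lists of tables that should share one process group.

    tables       -- iterable of table names (hypothetical example data)
    foreign_keys -- iterable of (child_table, parent_table) pairs
    """
    # Each table starts as the root of its own set.
    root_of = {t: t for t in tables}

    def find(table):
        # Follow links to the set's root, shortening the path as we go.
        while root_of[table] != table:
            root_of[table] = root_of[root_of[table]]
            table = root_of[table]
        return table

    for child, parent in foreign_keys:
        # Register tables that appear only in a constraint.
        root_of.setdefault(child, child)
        root_of.setdefault(parent, parent)
        child_root, parent_root = find(child), find(parent)
        if child_root != parent_root:
            # Merge the child's set into the parent's set.
            root_of[child_root] = parent_root

    groups = defaultdict(set)
    for table in root_of:
        groups[find(table)].add(table)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # Hypothetical schema: ORDER_ITEMS -> ORDERS -> CUSTOMERS, AUDIT_LOG standalone.
    tables = ["ORDERS", "ORDER_ITEMS", "CUSTOMERS", "AUDIT_LOG"]
    foreign_keys = [("ORDER_ITEMS", "ORDERS"), ("ORDERS", "CUSTOMERS")]
    for group in group_related_tables(tables, foreign_keys):
        print(group)
```

Tables that end up in different sets have no referential link between them, so under the rules above they could be assigned to separate parallel process groups; tables in the same set should stay together.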
The tables and associated diagrams below describe the GoldenGate replication dataflow and the position of each link in the process topology for two configuration options:
- CDC and data delivery with a data pump
- CDC and data delivery without a data pump
The following diagram illustrates the dataflow for the CDC and data delivery that includes a data pump process:
The following table describes the position of each process in the dataflow.
Start component | End component | Position |
---|---|---|
Extract process | Local trail file | 1 |
Local trail file | Data pump | 2 |
Data pump | Server collector | 3 |
Server collector | Remote trail file | 4 |
Remote trail file | Replicat process | 5 |
The following diagram illustrates the dataflow for the CDC and data delivery without a data pump. Here the Extract process communicates directly with the server collector.
The following table describes the position of each process in the dataflow.
Start component | End component | Position |
---|---|---|
Extract process | Server collector | 1 |
Server collector | Remote trail file | 2 |
Remote trail file | Replicat process | 3 |
The first configuration, which includes a data pump, is the preferred topology because the data pump adds the safeguard of an extra checkpoint in the process dataflow.
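To make the two dataflows easier to compare, they can be written down as ordered lists of links. The following Python sketch is purely illustrative: the names `WITH_DATA_PUMP`, `WITHOUT_DATA_PUMP`, and `dataflow_path` are hypothetical and are not GoldenGate objects. The link lists simply mirror the two tables above, and the function checks that each link starts at the component where the previous link ended.

```python
# Links in position order, mirroring the two dataflow tables.
WITH_DATA_PUMP = [
    ("Extract process", "Local trail file"),
    ("Local trail file", "Data pump"),
    ("Data pump", "Server collector"),
    ("Server collector", "Remote trail file"),
    ("Remote trail file", "Replicat process"),
]

WITHOUT_DATA_PUMP = [
    ("Extract process", "Server collector"),
    ("Server collector", "Remote trail file"),
    ("Remote trail file", "Replicat process"),
]


def dataflow_path(links):
    """Return the ordered list of components visited by the links.

    Raises ValueError if a link does not start where the previous one ended,
    which would indicate a gap in the topology.
    """
    if not links:
        raise ValueError("a topology needs at least one link")
    path = [links[0][0]]
    for position, (start, end) in enumerate(links, start=1):
        if start != path[-1]:
            raise ValueError(
                f"link {position} starts at {start!r}, expected {path[-1]!r}"
            )
        path.append(end)
    return path


if __name__ == "__main__":
    print(" -> ".join(dataflow_path(WITH_DATA_PUMP)))
    print(" -> ".join(dataflow_path(WITHOUT_DATA_PUMP)))
```

Comparing the two paths shows that the data pump configuration adds two components between the Extract process and the server collector: the local trail file and the data pump itself.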
For performance monitoring, the GGSCI tool provides real-time statistics as well as comprehensive reports for each process configured in the GoldenGate topology. In addition to reporting on demand, reports can be scheduled to run automatically. This is particularly useful when tuning the performance of a process for a given load and period.
The INFO ALL command provides a comprehensive overview of process status and lag, whereas the STATS command shows more detail. Both commands report in real time. The following example shows a summary of the available statistics: