What to replicate?
Another key decision in any GoldenGate implementation is what data to replicate. There is little point in replicating data that doesn't need to be replicated, as this causes unnecessary overhead. Furthermore, if you decide that you need to replicate everything, GoldenGate may not necessarily provide the best solution. Other products such as Oracle Active Data Guard may be more appropriate. The following sections discuss not only what to replicate, but also how to replicate, along with important functional and design considerations.
Object mapping and data selection
The power of GoldenGate comes into its own when you select what data you wish to replicate by using its inbuilt tools and functions. You may even wish to transform the data before it is applied to the target. There are numerous options at your disposal, but choosing the right combination is paramount.
The configuration of GoldenGate includes the mapping of source objects to target objects. Given the sheer number of parameters and functions available, it is easy to overcomplicate your GoldenGate Extract or Replicat process configuration through redundant operations. Try to keep your configuration as simple as possible, choosing the right parameter, option, or function for the job. Although it is possible to string these together to achieve a powerful solution, doing so may cause significant additional processing, and performance will suffer as a result.
GoldenGate provides the ability to select or filter out data based on a variety of levels and conditions. Typical data mapping and selection parameters are:
- TABLE/MAP: Specifies the source and target objects to replicate. TABLE is used in Extract and MAP in Replicat parameter files.
- WHERE: Similar to the SQL WHERE clause, the WHERE option included with a TABLE or MAP parameter enables basic data filtering.
- FILTER: Provides complex data filtering. The FILTER option can be used with a TABLE or MAP parameter.
- COLS/COLSEXCEPT: The COLS and COLSEXCEPT options allow columns to be mapped or excluded when included with a TABLE or MAP parameter.
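To make the difference between row-level and column-level selection concrete, here is a minimal Python sketch of the idea. It is not GoldenGate syntax and does not call any GoldenGate API: the function `select_rows`, its arguments, and the sample rows are all hypothetical, chosen only to mirror what WHERE/FILTER (choose rows) and COLS/COLSEXCEPT (choose columns) do conceptually.

```python
def select_rows(rows, where=None, cols=None, cols_except=None):
    """Conceptual model: a predicate picks rows, then columns are kept or dropped.

    where       -- callable returning True for rows to keep (like WHERE/FILTER)
    cols        -- set of column names to keep (like COLS)
    cols_except -- set of column names to drop (like COLSEXCEPT)
    """
    if cols is not None and cols_except is not None:
        raise ValueError("use either cols or cols_except, not both")

    selected = []
    for row in rows:
        # Row-level selection: skip rows the predicate rejects.
        if where is not None and not where(row):
            continue
        # Column-level selection: keep or exclude the named columns.
        if cols is not None:
            row = {k: v for k, v in row.items() if k in cols}
        elif cols_except is not None:
            row = {k: v for k, v in row.items() if k not in cols_except}
        else:
            row = dict(row)
        selected.append(row)
    return selected


# Hypothetical source rows, for illustration only.
source_rows = [
    {"order_id": 1, "region": "EU", "amount": 250, "notes": "gift"},
    {"order_id": 2, "region": "US", "amount": 900, "notes": ""},
    {"order_id": 3, "region": "EU", "amount": 1200, "notes": "rush"},
]

# Keep only EU orders and leave the free-text column behind.
eu_orders = select_rows(
    source_rows,
    where=lambda r: r["region"] == "EU",
    cols_except={"notes"},
)
```

The sketch applies the row predicate before the column projection, so rejected rows cost nothing further. The same principle of doing the cheapest, most selective step first supports the advice above about avoiding redundant operations.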
Before GoldenGate can extract data from the database's transaction logs, the relevant data needs to be included in its redo log files. For the Oracle source database, a number of prerequisites exist to ensure that the changed data can be replicated.
- Enable supplemental logging:
  - Set the FORCE LOGGING feature at database level to override any NOLOGGING operation, which ensures all changed data is written to the redo logs.
  - Supplemental logging forces the logging of the full before and after images. These images store the state of the data before and after an UPDATE transaction and are written to the database's transaction logs.
- Ensure each source and target table has a primary key:
  - GoldenGate requires a primary key to uniquely identify a row.
  - If the primary key does not exist on the source table, GoldenGate will create its own unique identifier by concatenating all the table columns together. This can be grossly inefficient given the volume of data that needs to be extracted from the redo logs. Ideally, only the primary key plus the changed data (before and after images in the case of an update statement) are required.
  - If the primary key does not exist on the target table, you may receive the following warning in the GoldenGate error log:
    WARNING OGG-00869 No unique key is defined for table 'TARGET_TABLE_NAME'. All viable columns will be used to represent the key, but may not guarantee uniqueness. KEYCOLS may be used to define the key.
  - It is also advisable to have a primary key defined on your target table(s) to ensure fast data lookup when the Replicat recreates and applies the DML statements against the target database. This is particularly important for UPDATE and DELETE operations.
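The cost of a missing primary key can be illustrated with a small Python sketch. It is a conceptual model only, not how GoldenGate builds its identifiers internally: the function `row_identifier` and the sample employee rows are hypothetical. It shows that a key-based identifier stays small, whereas an all-columns identifier grows with the row and, as the warning above notes, may not be unique.

```python
def row_identifier(row, key_columns=None):
    """Return the (column, value) pairs used to locate a row.

    With key columns defined, only those are used. Without them,
    every column has to take part in identifying the row.
    """
    columns = key_columns if key_columns else sorted(row)
    return tuple((column, row[column]) for column in columns)


# Hypothetical rows, for illustration only.
row = {"emp_id": 101, "name": "Ann", "dept": "SALES", "salary": 5000}

with_key = row_identifier(row, key_columns=["emp_id"])
without_key = row_identifier(row)

# Two rows that differ only in the key column the second table lacks.
keyless_a = {"name": "Bob", "dept": "IT", "salary": 4000}
keyless_b = {"name": "Bob", "dept": "IT", "salary": 4000}
identifiers_collide = row_identifier(keyless_a) == row_identifier(keyless_b)
```

Here `with_key` carries a single column, `without_key` carries all four, and the two keyless rows produce the same identifier, so an UPDATE or DELETE could not tell them apart.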
Initial load
The initial load is the process of instantiating the objects on the source database, synchronizing the target database objects with the source and providing the starting point for data replication. The process enables change synchronization, which keeps track of the ongoing transactional changes while the load is being applied. This allows users to continue to change data on the source during the initial load process.
The initial load can be successfully conducted using:
- A database load utility such as import/export or Oracle Data Pump.
- An Extract process to write data to files in ASCII format. Replicat then applies the files to the target tables.
- An Extract process to write data to files in ASCII format. SQL*Loader (direct load) can be used to load the data into the target tables.
- An Extract process that communicates directly with the Replicat process across a TCP/IP network without using a collector process or files.
If change synchronization is not required during the initial load, then the following best practices should be adopted:
- Data: Make certain that the target tables are empty to avoid duplicate row errors or conflicts between existing rows and rows that are being loaded.
- Constraints: Disable foreign key constraints and check constraints. Foreign key constraints can cause errors and check constraints can slow down the loading process. Reactivate the constraints after the load completes successfully.
- Indexes: Remove indexes from the target tables (apart from primary keys). Indexes are not necessary for inserts and slow down the loading process. For each row that is inserted into a table, the database will update every index on this table. Recreate the indexes after the load completes.
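The load sequence described above can be sketched end to end. The following Python example uses the standard library's sqlite3 module purely as a stand-in database, so it is an illustration of the ordering rather than an Oracle or GoldenGate procedure. The table name, index name, and generated rows are hypothetical, and constraint handling is left out for brevity.

```python
import sqlite3

# In-memory SQLite database standing in for the target (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
)
conn.execute("CREATE INDEX orders_region_idx ON orders (region)")

# 1. Data: confirm the target table is empty before loading.
existing = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
if existing != 0:
    raise RuntimeError("target table must be empty before the initial load")

# 2. Indexes: drop secondary indexes, keeping the primary key.
conn.execute("DROP INDEX orders_region_idx")

# 3. Load the rows in a single transaction.
rows = [(i, "EU" if i % 2 else "US", i * 10) for i in range(1, 1001)]
with conn:
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# 4. Recreate the index once, after all rows are in place.
conn.execute("CREATE INDEX orders_region_idx ON orders (region)")

loaded = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('orders')")]
```

Building the index once over the loaded rows avoids maintaining it for each of the 1,000 inserts, which is the saving the Indexes item describes.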
CSN coordination
An Oracle database uses the System Change Number (SCN) to keep track of transactions. For every commit, a new SCN is assigned. The data changes, including primary key and SCN, are written to the database's online redo logs. Oracle requires these logs for crash recovery, which allows the committed transactions to be recovered (uncommitted transactions are rolled back). GoldenGate leverages this mechanism by reading the online redo logs, extracting the data, and storing the SCN as a series of bytes. The Replicat process replays the data in SCN order while applying data changes to the target database. The Oracle GoldenGate manuals refer to the SCN as a CSN (Commit Sequence Number).
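Why the replay order matters can be shown with a short Python sketch. It is a simplified model, not the Replicat implementation: the `Transaction` class, the `replay_in_csn_order` function, the CSN values, and the dictionary used as a target table are all hypothetical. Transactions arrive out of order and are applied in ascending CSN order so that the target ends in the same state as the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    csn: int            # commit sequence number (hypothetical values)
    operations: tuple   # tuples of (operation, key, value)


def replay_in_csn_order(transactions, target):
    """Apply committed transactions to target in ascending CSN order."""
    applied = []
    for txn in sorted(transactions, key=lambda t: t.csn):
        for operation, key, value in txn.operations:
            if operation == "DELETE":
                target.pop(key, None)
            else:  # INSERT and UPDATE both set the row's current value
                target[key] = value
        applied.append(txn.csn)
    return applied


# Captured transactions, deliberately listed out of commit order.
captured = [
    Transaction(csn=1030, operations=(("UPDATE", 1, "shipped"),)),
    Transaction(csn=1010, operations=(("INSERT", 1, "new"),)),
    Transaction(csn=1020, operations=(("INSERT", 2, "new"),)),
    Transaction(csn=1040, operations=(("DELETE", 2, None),)),
]

target_table = {}
applied_order = replay_in_csn_order(captured, target_table)
```

Applying the list as given would run the UPDATE before the INSERT for key 1 and leave the row with the value "new". Ordering by CSN yields the correct final value, "shipped".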
Tip
As a prerequisite for enabling GoldenGate in your environment, the Oracle source database must be in ARCHIVELOG mode so that its archived redo logs can be mined following a fallback or outage in replication.
Trail file format
Oracle GoldenGate's trail files are in a canonical format. Backed by checkpoint files for persistence, they store the changed data in a hierarchical form, including metadata definitions. The GoldenGate software includes a comprehensive utility named Logdump that has a number of commands to search and view the internal file format.
Tip
Oracle database redo log and GoldenGate trail file formats differ between versions. A trail or Extract file must have a version that is equal to, or lower than, that of the process that reads it.
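The version rule in this tip reduces to a simple comparison, sketched below in Python. The helper names and the dotted version strings are hypothetical and serve only to illustrate the rule. The sketch does not read real trail file headers and does not reflect how GoldenGate itself performs the check.

```python
def parse_version(text):
    """Turn a dotted version string such as '11.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def reader_can_process(trail_version, reader_version):
    """A file is readable when its version is equal to or lower than the reader's."""
    return parse_version(trail_version) <= parse_version(reader_version)


# Hypothetical version pairs: (trail file version, reading process version).
same_version = reader_can_process("11.2", "11.2")
older_trail = reader_can_process("11.1", "11.2")
newer_trail = reader_can_process("12.1", "11.2")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "9.5" as higher than "11.2".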