
Data Pump export


Data Pump export (expdp) is the database utility used to export data; it generates a file in a proprietary format. The generated file format is not compatible with the one generated by the old export (exp) utility.

Data Pump export modes

Data Pump export modes define the different kinds of operation that can be performed with Data Pump. The mode is specified on the command line using the appropriate parameter (a sketch of how each mode is selected follows this list). Data Pump has the following modes:

  • Full export mode: This mode exports the whole database; it requires the user to have the exp_full_database role granted.

  • Schema mode: Only the specified schemas are selected; all objects belonging to the listed schemas are exported. Be careful when using this mode: if a table is exported and it has dependent objects, such as triggers, defined in a different schema, and that schema is not explicitly listed, then the objects belonging to that schema are not exported.

  • Table mode: Only the listed tables, partitions, and their dependent objects are exported. You may export tables belonging to different schemas, but in that case you must have the exp_full_database role explicitly granted.

  • Tablespace mode: This mode exports all tables belonging to the specified tablespace set, along with their dependent objects. You must have the exp_full_database role granted to be able to use this mode.

  • Transportable tablespace mode: This mode is used to transport a tablespace to another database; it exports only the metadata of the objects belonging to the specified set of tablespaces. Unlike tablespace mode, transportable tablespace mode requires that the specified tables be completely self-contained.
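
As a minimal sketch, each mode is selected with its corresponding command-line parameter. The credentials, object names, and dump file names below are hypothetical placeholders, and a valid default directory object is assumed to be configured:

$ expdp system/password full=y dumpfile=fullDb
$ expdp system/password schemas=HR,SH dumpfile=hrShSchemas
$ expdp hr/hr tables=EMPLOYEES,DEPARTMENTS dumpfile=hrTables
$ expdp system/password tablespaces=USERS dumpfile=usersTs
$ expdp system/password transport_tablespaces=USERS dumpfile=usersTsMetadata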

Data Pump export provides the following working interfaces:

  • Command line interface: The command line interface is the default and the most commonly used interface. Here the user must provide all required parameters from the OS command line.

  • Parameter file interface: In this mode the parameters are written in a plain text file. The user must specify the parameter file to be used for the session.

  • Interactive command line interface: This is not the same interactive command line most users know from the regular exp command. This interactive command line is used to manage and configure the running jobs.

A simple Data Pump export session

Now let's start with our first simple Data Pump export session. This will show us some initial and important features of this tool.

Here we will start with a basic Data Pump session to perform a simple logical backup. The command expdp has the following arguments:

Initially we define a Dumpfile (A). As it will be managed by means of a database directory object (B), it is not necessary to define the path where the dump file will be stored; remember that the directory objects were previously defined at the database level. This session will export a user's Schema (C). No other parameters are defined at the command prompt, and the session begins.
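
As a sketch, a command of this form produces that kind of session; the dump and log file names are hypothetical, and the default_dp_dest and default_log_dest directory objects are assumed to exist, as in the examples that follow:

$ expdp hr/hr dumpfile=default_dp_dest:hrSchema logfile=default_log_dest:hrSchema schemas=HR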

It can be seen from the command output that an estimation (D) takes place; this estimates the size the dump file will occupy on the file system, and as no estimation option was defined at the command line, the BLOCKS method is assumed. Estimation by means of the BLOCKS method isn't always accurate, as it depends on the blocks sampled; block density is a meaningful error factor for this estimation, so it is better to use STATISTICS as the estimation method.

In the output log, the Master table (F), where the running job's information is temporarily stored, can be seen. The job name takes a default value (E). It is good practice to define the job name rather than let Oracle define it at execution time; if the DBA names the job, it will be easier to reference it at a later time.

Data Pump export filtering operations

At the command line, filtering options can be specified. In this example, it is used to define the tables to export, but we can also specify whether the dump file will (or will not) include all other dependent objects.

The include (A) and exclude options are mutually exclusive. In this case, as include was declared at the command line and it requires special characters, those must be escaped so the OS doesn't try to interpret them. When a longer include or exclude option is required, it is better to use a parameter file, where the escape characters are not required.

All the filtered objects (C) to be exported were saved in the dump file along with their dependent objects (B). If you change the command line as follows, it will prevent all the indexes from being exported:

$ expdp hr/hr dumpfile=default_dp_dest:EmpDeptNoIndexes tables=EMPLOYEES,DEPARTMENTS exclude=INDEX:\"LIKE \'\%\'\" logfile=default_log_dest:EmpDeptNoIndexes

As can be seen, the exclude or include clause is actually a where predicate.
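
For comparison, here is a sketch of the same exclusion written in a parameter file, where no escape characters are needed (the parameter file name is hypothetical):

USERID=hr/hr
DUMPFILE=default_dp_dest:EmpDeptNoIndexes
TABLES=EMPLOYEES,DEPARTMENTS
EXCLUDE=INDEX:"LIKE '%'"
LOGFILE=default_log_dest:EmpDeptNoIndexes

$ expdp parfile=EmpDeptNoIndexes.dpp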

Use of parameter file

Using a parameter file simplifies an otherwise complex-to-write command line; it also allows the user to build a library of repeatable operations, even for simple exports. As previously seen, if a filtering (object or row) clause is used, some extra operating system escape characters are required. By writing the filtering clauses inside a parameter file, the command line can be greatly simplified.

Comparing this command line (A) against the previously shown command lines, it can be seen that it is more readable and manageable. The SHParFile.dpp file from the example contains these options:

USERID=sh/sh
DUMPFILE=shSales
DIRECTORY=default_dp_dest
JOB_NAME=shSalesCompleteDump
TABLES=SALES
LOGFILE=default_log_dest:shSales

The parameter file is a plain text file, and you may use your own naming conventions. Oracle regularly uses .par for parameter files; in this case .dpp was used to denote a Data Pump parameter file. The file name can be dynamically defined using environment variables, but this kind of file name handling is beyond the scope of Oracle and depends exclusively on the OS variable management.
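
As a small sketch of that last point, assuming a shell environment variable and path that are purely hypothetical:

$ export DP_PAR_DIR=/u01/app/dp_params
$ expdp parfile=${DP_PAR_DIR}/SHParFile.dpp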

JOB_NAME (C) is the option used to specify a non-default job name; otherwise Oracle generates a name for it. It is good practice to have the job name explicitly defined so the user can ATTACH to it at a later time, and so related objects such as the Master table (B) can be more easily identified.

Retrieve original data

In some circumstances, it may be useful to export the image of a table the way it existed before a change was committed. If the database is properly configured, the database flashback query facility, which is also integrated with expdp, may be used; it is useful for obtaining a consistent exported table image.

In this example a copy of the original HR.EMPLOYEES table is made (HR.BAK_EMPLOYEES), and all the tasks will update the BAK_EMPLOYEES table contents. A Restore Point is created so that you can easily find out the exact time stamp when this change took place:

SQL> CREATE RESTORE POINT ORIGINAL_EMPLOYEES;
Restore point created.
SQL> SELECT SCN, NAME FROM V$RESTORE_POINT;
       SCN NAME
---------- --------------------------------
    621254 ORIGINAL_EMPLOYEES
SQL> SELECT SUM(SALARY) FROM BAK_EMPLOYEES;
SUM(SALARY)
-----------
     691400

This is the way the data was at the referred SCN. This number will be used later to perform the expdp operation and retrieve the data as it was at this point in time.

Next a non-reversible update on the data takes place.

SQL> UPDATE BAK_EMPLOYEES SET SALARY=SALARY*1.1;
107 rows updated.
SQL> COMMIT;
Commit complete.
SQL> SELECT SUM(SALARY) FROM BAK_EMPLOYEES;
SUM(SALARY)
-----------
    760540

Here we have a time reference and the goal is to restore data as it was.

Below are the contents of the data pump parameter file used to retrieve data.

USERID=hr/hr
DIRECTORY=default_dp_dest
DUMPFILE=hrExpAtRestorePoint
JOB_NAME=hrExpAtRestorePoint
TABLES=BAK_EMPLOYEES
LOGFILE=default_log_dest:hrExpAtRestorePoint
FLASHBACK_SCN=621254

The FLASHBACK_SCN parameter states the point in time as of which the table data is to be retrieved.

Once the backup is taken, the current table is dropped. When the import takes place, it rebuilds the table with the data as it was before. The import parameter file has been slightly modified so that it defines its own log file name and includes only the minimum parameters required for the impdp task (C).

USERID=hr/hr
DIRECTORY=default_dp_dest
DUMPFILE=hrExpAtRestorePoint
JOB_NAME=ImpAtRestorePoint
TABLES=BAK_EMPLOYEES
LOGFILE=default_log_dest:hrImpAtRestorePoint
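
Assuming the options above are saved in a file named, for example, hrImpAtRestorePoint.dpp (a hypothetical name), the table is dropped and the import is launched as follows:

SQL> DROP TABLE BAK_EMPLOYEES;

$ impdp parfile=hrImpAtRestorePoint.dpp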

Once the import job is finished, a query to the current table shows the data 'as it was', prior to the update command.

SQL> select sum(salary) from bak_employees;
SUM(SALARY)
-----------
  691400

Data Pump export space estimation

Proactively estimating the amount of space required by an export file prevents physical disk space shortages. Data Pump has two methods to estimate the space requirements: estimation by block sampling (BLOCKS) and estimation by object statistics (STATISTICS).

ESTIMATE={BLOCKS | STATISTICS}

  • BLOCKS—The estimate is calculated by multiplying the number of database blocks used by the target objects times the appropriate block sizes.

  • STATISTICS—The estimate is calculated using statistics for each table. For this method to be as accurate as possible, all tables should have been analyzed recently.

The second method leads to more accurate results and can be performed in a more efficient way than the BLOCKS method; this method requires reliable table statistics.

It can be seen from an export execution that space estimation is always carried out, and the default estimation method is BLOCKS. BLOCKS is used by default because data blocks will always be present for a table, while the presence of reliable statistics cannot be taken for granted. From performance and accuracy perspectives it is not the best choice: it takes longer, since the whole table is read and the data blocks are scanned to estimate the space required by the dump file, and it may not be accurate because it depends on the block data distribution; the method assumes data is evenly distributed throughout all the blocks, which may not be true in every case, leading to inaccurate results. If the STATISTICS keyword is used, the estimation is faster; the file size only has to be estimated from the information already gathered by the statistics gathering processes.
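
A minimal sketch of how an estimation-only run can be requested with each method (credentials and schema are placeholders):

$ expdp sh/sh schemas=SH estimate_only=y estimate=blocks nologfile=y
$ expdp sh/sh schemas=SH estimate_only=y estimate=statistics nologfile=y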

Taking the export of the SH schema with the ESTIMATE_ONLY option, the BLOCKS estimation may not be as accurate as the STATISTICS method, as these test results show:

ESTIMATE_ONLY         Reported Estimated Dump File Size
BLOCKS                15.12 MB
STATISTICS            25.52 MB
ACTUAL FILE SIZE      29.98 MB

From the above results, it can be seen how important it is to have reliable statistics on the database tables, so that any estimation performed by Data Pump can be as accurate as possible.

Dump file multiplexing

Data Pump export is a faster exporting method than the old exp utility; export speed can be between 15 and 45 times faster than the conventional export utility. This is because the original export utility uses only the conventional path, whereas Data Pump export uses the direct path method of loading. In order for it to reach the maximum possible speed, it is important to perform the parallel operations on spindles other than those where the database files are located, and there should be enough I/O bandwidth for the export operation to take advantage of the dump file multiplexing feature.

The options used to generate an export dump in parallel with multiplexed dump files are:

USERID=sh/sh
DUMPFILE=shParallelExp01%u,shParallelExp02%u
DIRECTORY=default_dp_dest
JOB_NAME=shParallelExp
TABLES=SALES
LOGFILE=default_log_dest:shParallelExp
ESTIMATE=statistics
PARALLEL=4

Notice the %u flag, which appends a two-digit suffix to each Data Pump file name. These options direct Data Pump export to generate four dump files, which are written in a round-robin fashion so they get uniformly filled.
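
Assuming the options above are saved in a parameter file named shParallelExp.dpp (a hypothetical name), the parallel export is launched with:

$ expdp parfile=shParallelExp.dpp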

The resulting export dump files are:

shParallelExp0101.dmp
shParallelExp0102.dmp
shParallelExp0201.dmp
shParallelExp0202.dmp

Transporting data among different versions

Data Pump allows data transfers among the different Oracle versions that support the feature (the feature was introduced in Oracle Database 10g Release 1); the database must be configured with a compatibility of 9.2.0 or higher. This feature simplifies data transfer tasks, but in order for it to work it is important to consider the source version versus the destination version. It works in an ascending compatible mode: a Data Pump export taken from a lower release can always be read by a higher release, but an export taken from a higher release must be taken with the VERSION parameter declaring the compatibility level. This parameter can take the value COMPATIBLE (the default), which equals the compatible instance parameter value; LATEST, which equals the metadata version; or any valid database version greater than 9.2.0. This last statement doesn't mean a Data Pump file can be imported into a 9.2.0 database; rather, it refers to recently migrated 10g databases which still have the compatible instance parameter set to 9.2.0.

If the VERSION parameter is not declared, an export taken from a higher release cannot be read by a lower release, and a run-time error is displayed.
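
As a sketch, an export taken on an 11.1 database and intended to be imported into a 10.2 database could declare the target version explicitly (schema, file names, and directory objects are placeholders):

$ expdp hr/hr schemas=HR version=10.2 dumpfile=default_dp_dest:hrFor102 logfile=default_log_dest:hrFor102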

When performing data transfers among different database versions, you should be aware of the Data Pump compatibility matrix:

Data Pump client and server compatibility:

                                 Database (server) version
expdp and impdp client version   10.1.0.X     10.2.0.X     11.1.0.X
10.1.0.X                         Supported    Supported    Supported
10.2.0.X                         No           Supported    Supported
11.1.0.X                         No           No           Supported

Each Oracle version produces a different Data Pump file version; when performing expdp/impdp operations using different Data Pump file versions, you should be aware of the file version compatibility.

Data Pump dumpfile    Written by database    Can be imported into target
set version           with compatibility     10.1.0.X     10.2.0.X     11.1.0.X
0.1                   10.1.X                 Supported    Supported    Supported
1.1                   10.2.X                 No           Supported    Supported
2.1                   11.1.X                 No           No           Supported

Data Pump export interactive mode

Data Pump is meant to work as a batch utility, but it also has a prompt mode, known as the interactive mode. It should be emphasized that the Data Pump interactive mode is conceptually different from the old interactive export/import mode. In this release, the interactive mode doesn't interfere with the currently running job; it is used to control some parameters of the running job, such as the degree of parallelism, to kill the running job, or to resume job execution after a temporary stop due to, for example, a lack of disk space.

In order to attach to a running job in interactive mode, the user must issue the Ctrl-C keystroke sequence from an attached client. If the user is running on a terminal different from the one where the job was started, it is still possible to attach to the running job by means of the explicit ATTACH parameter. It is because of this feature that it is useful not to let Oracle define the job name.

Once attached there are several commands that can be issued from the open Data Pump prompt:

Command            Description (Default)
CONTINUE_CLIENT    Return to logging mode. Job will be restarted if idle.
EXIT_CLIENT        Quit the client session and leave the job running.
HELP               Summarize interactive commands.
KILL_JOB           Detach from and delete the job.
PARALLEL           Change the number of active workers for the current job. PARALLEL=<number of workers>
START_JOB          Start/resume the current job. START_JOB=SKIP_CURRENT starts the job after skipping any action which was in progress when the job was stopped.
STATUS             Frequency (seconds) at which job status is monitored; the default (0) shows new status when available. STATUS=[interval]
STOP_JOB           Orderly shutdown of job execution; exits the client. STOP_JOB=IMMEDIATE performs an immediate shutdown of the Data Pump job.

In this scenario the expdp Command Line Interface (CLI) is accessed to manage a running job. First a simple session is started using the command:

expdp system/oracle dumpfile=alphaFull directory=default_dp_dest full=y job_name=alphaFull

The JOB_NAME parameter provides a means to quickly identify the running job.

Once the job is running, a new expdp command instance is started in a second OS session, this time using the ATTACH parameter. This will open a prompt that allows the user to manage the running job.

expdp system/oracle attach=alphaFull

After showing the job status it enters the prompt mode where the user can issue the previously listed commands.

In this case a STOP_JOB command has been issued. This notifies the running session that execution has been stopped; the job output stops and the OS prompt is displayed. After a while the user reattaches to the running job, this time issuing the START_JOB command; this resumes the job activity, but as the expdp session was exited, no more command output is displayed. The only way the user can tell the job is running is by querying the DBA_DATAPUMP_JOBS view or by browsing the log file contents.
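
A quick check of that view might look like this; the columns shown are only a subset of what DBA_DATAPUMP_JOBS provides:

SQL> SELECT OWNER_NAME, JOB_NAME, OPERATION, STATE FROM DBA_DATAPUMP_JOBS;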

Note

The ATTACH parameter does not require the job name if there is only a single job running. If there is more than one concurrent job running, then the user must specify the job name.