Coordinators allow us to run interdependent Workflows as data pipelines based on some starting criteria. They decide the "when" part of an Oozie job's execution. Most Oozie jobs are triggered either at a scheduled time interval or when an input Dataset becomes available. Here are a few important definitions related to Coordinators:
Nominal time: This is the scheduled time at which the job should execute. For example, we process press releases every day at 8:00 P.M.
Actual time: This is the real time at which the job runs. In some cases, if the input data does not arrive, the job might start late. This type of data-dependent job triggering is indicated by the <done-flag> tag (more on this later). The done-flag gives a signal to start the job execution.
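To make the done-flag idea concrete, here is a minimal sketch of a Dataset definition that uses it. The dataset name, HDFS path, dates, and flag file name are placeholder assumptions, not values taken from this chapter:

```xml
<!-- Hypothetical dataset: one directory of press releases per day. -->
<datasets>
  <dataset name="press_releases" frequency="${coord:days(1)}"
           initial-instance="2015-01-01T20:00Z" timezone="UTC">
    <!-- Placeholder path; YEAR, MONTH, and DAY are resolved per dataset instance. -->
    <uri-template>hdfs://namenode:8020/data/press_releases/${YEAR}/${MONTH}/${DAY}</uri-template>
    <!-- The instance counts as available only once this file exists in the directory. -->
    <done-flag>_SUCCESS</done-flag>
  </dataset>
</datasets>
```

If the <done-flag> element is omitted, Oozie waits for a _SUCCESS file by default; if it is present but empty, the existence of the directory itself is treated as the signal.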
The general skeleton template of a Coordinator is shown in the following figure, named Coordinator template XML:
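As a rough textual sketch of such a skeleton (based on the general Oozie coordinator schema, not a reproduction of the figure), a Coordinator definition might look like the following. All names, dates, paths, and property values are placeholder assumptions:

```xml
<coordinator-app name="press-release-coord" frequency="${coord:days(1)}" start="${startTime}" end="${endTime}" timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <parameters>
    <!-- Variables the job expects; a property without a value must be supplied at submission. -->
    <property>
      <name>startTime</name>
    </property>
    <property>
      <name>endTime</name>
    </property>
    <property>
      <name>workflowAppPath</name>
    </property>
  </parameters>
  <controls>
    <!-- How long (in minutes) an action waits for its input before timing out. -->
    <timeout>60</timeout>
    <concurrency>1</concurrency>
  </controls>
  <datasets>
    <dataset name="press_releases" frequency="${coord:days(1)}"
             initial-instance="2015-01-01T20:00Z" timezone="UTC">
      <uri-template>hdfs://namenode:8020/data/press_releases/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <!-- The action for a given nominal time waits for that day's dataset instance. -->
    <data-in name="input" dataset="press_releases">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <property>
          <name>inputDir</name>
          <value>${coord:dataIn('input')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

The attributes on the root element express the "when" (frequency, start, and end), while the <action> element names the Workflow to launch each time the criteria are met.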
The <parameters> tag on line 2 in the preceding screenshot holds any variables defined...