Index
A
- action nodes / Action nodes
B
- Beeline and Hive Server 2
- working, URL / Hive 2 action
- Bundle rerun
- URL / Rerun Bundle
- bundles
- defining / Bundles
C
- case study, Hive Jobs
- defining / Chapter case study
- case study, MapReduce Jobs
- defining / Chapter case study
- case study, Oozie
- defining / Book case study, Chapter case study
- case study, Pig Jobs
- defining / Chapter case study
- case study, Sqoop Jobs
- defining / Chapter case study
- command line
- job, submitting from / Submission from the command line
- Hive job, running from / Running a Hive job from the command line
- config-default.xml file
- defining / The config-default.xml file
- continuous delivery
- defining / Packaging and continuous delivery
- control flow nodes / Control flow nodes
- Coordinator
- declaring / My first Coordinator
- v1 definition / Coordinator v1 definition
- v2 definition / Coordinator v2 definition
- Coordinator controls
- defining / Coordinator controls
- timeout / Coordinator controls
- concurrency / Coordinator controls
- execution / Coordinator controls
- throttle / Coordinator controls
- Coordinator jobs
- parameterization / Parameterization of Coordinator jobs
- dateOffset(String baseDate, int instance, String timeUnit) / dateOffset(String baseDate, int instance, String timeUnit)
- dateTzOffset(String baseDate, String timezone) / dateTzOffset(String baseDate, String timezone)
- formatTime(String timeStamp, String format) / formatTime(String timeStamp, String format)
- Coordinator rerun
- URL / Rerun Bundle
- coordinators
- about / Coordinators
- nominal time / Coordinators
- actual time / Coordinators
- datasets / Datasets
- initial instance / Initial instance
- Coordinator v1 definition
- defining / Coordinator v1 definition
- job.properties v1 definition / job.properties v1 definition
- Coordinator v2 definition
- defining / Coordinator v2 definition
- job.properties v2 definition / job.properties v2 definition
- job log, checking / Checking the job log
D
- DAG
- URL / Workflows
- data pipelines
- defining / Data pipelines
- datasets
- defining / Datasets
- frequency and time / Frequency and time
- cron syntax, for frequency / Cron syntax for frequency
- timezone / Timezone
- <done-flag> tag / The <done-flag> tag
- daylight saving time
- URL / Timezone
- Decision node
- defining / The Decision node
E
- EL functions
- URL / HDFS EL functions
- EL functions, for datasets
- URL / latest(int n)
- Email action
- defining / The Email action
- URL / The Email action
- email action configuration
- defining / Email action configuration
- Expression Language functions
- defining / Expression Language functions
- types / Expression Language functions
- basic EL constants / Basic EL constants, Basic EL functions
- workflow EL functions / Workflow EL functions
- Hadoop EL constants / Hadoop EL constants
- HDFS EL functions / HDFS EL functions
H
- --hivevar flag
- URL / Hive 2 action
- Hadoop
- use cases / Book case study
- HCatalog
- defining / HCatalog
- datasets / HCatalog datasets
- EL functions / HCatalog EL functions
- Coordinator functions / HCatalog Coordinator functions
- Pig script / Pig script
- job.properties file / The job.properties file
- Sqoop action Coordinator / The Sqoop action Coordinator
- HCatalog Coordinator functions
- defining / HCatalog Coordinator functions
- HCatalog datasets / HCatalog datasets
- HCatalog EL functions / HCatalog EL functions
- Hive 2 action
- defining / Hive 2 action
- Hive action
- about / Hive action
- Hive job
- running, from command line / Running a Hive job from the command line
- Hortonworks distribution
- Oozie, configuring in / Configuring Oozie in Hortonworks distribution
- Hue
- installing / Installing and configuring Hue
- configuring / Installing and configuring Hue
- URL / Installing and configuring Hue
- Hue 3.9.0
- URL / Chapter case study
I
- input and output events, Dataset
- parameters / Parameters in the Dataset's input and output events
- current(int n) / current(int n)
- hoursInDay(int n) / hoursInDay(int n)
- daysInMonth(int n) / daysInMonth(int n)
- latest(int n) / latest(int n)
J
- job.properties file / The job.properties file
- job property file
- defining / Job property file
K
- Kerberos principal
- URL / Oozie in secured cluster
L
- Lambda architecture
- URL / Datasets
M
- <master> element
- about / Spark action
- <mode> element
- about / Spark action
- MapReduce jobs
- running, from Oozie / Running MapReduce jobs from Oozie
- job.properties file / The job.properties file
- running / Running the job
- MapReduce streaming job
- running / Running a MapReduce streaming job
N
- node types
- defining / Types of nodes
- control flow nodes / Control flow nodes
- action nodes / Action nodes
- decision node / Chapter case study
- email node / Chapter case study
- Pig Processing node / Chapter case study
O
- Oozie
- configuring, in Hortonworks distribution / Configuring Oozie in Hortonworks distribution
- defining / Oozie concepts
- workflows / Workflows
- coordinator / Coordinator
- bundles / Bundles
- running / Running our first Oozie job
- web console / Oozie web console
- command line / The Oozie command line
- MapReduce jobs, running from / Running MapReduce jobs from Oozie
- defining, in secured cluster / Oozie in secured cluster
- Oozie documentation
- Oozie Fs Action documentation
- Oozie installation, with tar ball
- performing / Installing Oozie using tar ball
- test virtual machine, creating / Creating a test virtual machine
- source code, building / Building Oozie source code
- build script / Summary of the build script
- Codehaus Maven move / Codehaus Maven move
- dependency jars, downloading / Download dependency jars
- preparing, for creating WAR file / Preparing to create a WAR file
- WAR file, creating / Create a WAR file
- Oozie MySQL database, configuring / Configure Oozie MySQL database
- shared library, configuring / Configure the shared library
- server test, starting / Start server testing and verification
- Oozie MapReduce job
- running / Running Oozie MapReduce job
- Oozie Workflow
- validating / Validating Oozie Workflow
P
- packaging
- defining / Packaging and continuous delivery
- pictorial representation, of Bundles job flow
- URL / Bundles
- Pig action
- defining / Pig action
- Pig code
- URL / Chapter case study
- Pig command line
- defining / The Pig command line
- Pig Coordinator job v2
- defining / Pig Coordinator job v2
- Pig Coordinator job v3
- defining / Pig Coordinator job v3
- Pig script / Pig script
- Python mapper and reducer code
Q
- Quartz scheduler
R
- rerun
- defining / Rerun
- Workflow, rerunning / Rerun Workflow
- Coordinator, rerunning / Rerun Coordinator
- Bundle, rerunning / Rerun Bundle
S
- <spark-opts> element
- about / Spark action
- SimpleDateFormat
- Spark action
- defining / Spark action
- Sqoop action
- defining / Sqoop action
- Sqoop action Coordinator
- job, running / Running the job
- data, checking in Hive table / Checking data in the Hive table
- Sqoop command line
- running / Running Sqoop command line
- states, workflow job
- PREP / Workflow states
- RUNNING / Workflow states
- SUSPENDED / Workflow states
- SUCCEEDED / Workflow states
- KILLED / Workflow states
- FAILED / Workflow states
T
- TZ reference
- URL / Timezone
W
- Workflow rerun
- URL / Rerun Bundle
- workflow states
- defining / Workflow states
X
- XSD
- references / Installing and configuring Hue
- XSD schema
- URL / Action nodes, Job property file