Index
A
- access control lists, Hadoop
- configuring / Configuring access control lists in Hadoop
- Add Peer screen / Understanding the Administration menu
- administration commands, HDFS
- balancer / Commands to administer HDFS
- dfsadmin / Commands to administer HDFS
- Administration menu, Cloudera Manager Web console
- Settings screen / Understanding the Administration menu
- Alerts screen / Understanding the Administration menu
- Users screen / Understanding the Administration menu
- Kerberos screen / Understanding the Administration menu
- License screen / Understanding the Administration menu
- Language screen / Understanding the Administration menu
- Peers screen / Understanding the Administration menu
- AES256-CTS type / Configuring the KDC Server
- agent / Apache Flume NG
- alert
- about / Configuring events and alerts
- Health type, configuring / Configuring events and alerts
- Log type, configuring / Configuring events and alerts
- Activity type, configuring / Configuring events and alerts
- delivery configuration, by e-mail / Configuring the alert delivery by an e-mail
- alert delivery
- configuring, by e-mail / Configuring the alert delivery by an e-mail
- Alerts screen, Administration menu / Understanding the Administration menu
- All Configuration Issues tab, Home screen / Navigating the Home screen
- All Health Issues tab, Home screen / Navigating the Home screen
- All Recent Commands tab, Home screen / Navigating the Home screen
- Apache Avro / Apache Avro
- Apache Flume
- installing / Installing Apache Flume
- Apache Flume NG
- about / Apache Flume NG
- event/log data flow, to HDFS via agent / Apache Flume NG
- event/log data flow, to HDFS via multiple agents / Apache Flume NG
- Apache Hadoop
- history / History of Apache Hadoop and its trends
- about / History of Apache Hadoop and its trends, Apache Hadoop
- components / Components of Apache Hadoop
- daemons / Understanding the Apache Hadoop daemons
- Kerberos, configuring for / Configuring Kerberos for Apache Hadoop
- Apache Hadoop, components
- HDFS / Components of Apache Hadoop
- MapReduce / Components of Apache Hadoop
- Apache HBase
- about / Apache HBase
- working / Apache HBase
- Apache Hive
- about / Apache Hive
- high-level working / Apache Hive
- installing / Installing Apache Hive
- Apache Mahout / Apache Mahout
- Apache Oozie
- about / Apache Oozie
- installing / Installing Apache Oozie
- Apache Pig
- about / Apache Pig
- installing / Installing Apache Pig
- Apache Sqoop
- about / Apache Sqoop
- RDBMS to HDFS two-way flow / Apache Sqoop
- installing / Installing Apache Sqoop
- Apache Sqoop 2
- installing / Installing Apache Sqoop 2
- Apache Whirr
- about / Apache Whirr
- Apache ZooKeeper
- about / Apache ZooKeeper
- name service / Apache ZooKeeper
- locking / Apache ZooKeeper
- synchronization / Apache ZooKeeper
- configuration management / Apache ZooKeeper
- leader election / Apache ZooKeeper
- high-level working / Apache ZooKeeper
- installing / Installing Apache ZooKeeper
- ApplicationMaster / ResourceManager
- Audits tab, Clusters menu / Navigating the Clusters menu
- authentication / Understanding authentication and authorization
- Authentication Service / Understanding the Kerberos Architecture
- authorization
- about / Understanding authentication and authorization
- in Apache Hadoop / Authorization in Apache Hadoop
- authorization, in Apache Hadoop
- access control lists, configuring / Configuring access control lists in Hadoop
- automatic failover
- configuring, for HDFS high availability / Configuring automatic failover for HDFS high availability
- configuring, for Jobtracker HA / Configuring automatic failover for jobtracker high availability
B
- backup configuration
- Cloudera Manager, using / Configuring backups using Cloudera Manager
- HDFS replication, configuring / Configuring HDFS replication
- Hive replication, configuring / Configuring Hive replication
- snapshots, configuring / Configuring snapshots
- Backup menu, Cloudera Manager Web console / Understanding the Backup menu
- backups
- about / Understanding backups
- types / Types of backups
- storage media, types / Types of storage media for backups
- cloud service, using / Using cloud services for backups
- HDFS backups / Understanding HDFS backups
- balancer command / Commands to administer HDFS
- Balancer role, Cloudera Manager / Role management in Cloudera Manager
- Beeswax Hive UI
- about / Beeswax – Hive UI
- Query Editor section / Beeswax – Hive UI
- My Queries section / Beeswax – Hive UI
- Saved Queries section / Beeswax – Hive UI
- History section / Beeswax – Hive UI
- Settings section / Beeswax – Hive UI
- big data / History of Apache Hadoop and its trends
- block pool / Understanding the namenode UI
- Block Pool Id parameter / Understanding the namenode UI
- blocks / Essentials of HDFS
C
- Cache Statistics tab, Clusters menu / Navigating the Clusters menu
- cat command / Commonly used HDFS commands
- CDH
- about / Introducing CDH
- features / Introducing CDH
- starting with / Getting started with CDH
- components / Understanding the CDH components
- installing / Installing CDH
- installing, methods / Installing CDH
- CDH, components
- Apache Hadoop / Apache Hadoop
- Apache Flume NG / Apache Flume NG
- Apache Sqoop / Apache Sqoop
- Apache Pig / Apache Pig
- Apache Hive / Apache Hive
- Apache ZooKeeper / Apache ZooKeeper
- Apache HBase / Apache HBase
- Apache Whirr / Apache Whirr
- Apache Mahout / Apache Mahout
- Apache Avro / Apache Avro
- Apache Oozie / Apache Oozie
- Cloudera Search / Cloudera Search
- Cloudera Impala / Cloudera Impala
- Cloudera Hue / Cloudera Hue
- installing / Installing the CDH components
- CDH 4.5 / Introducing CDH
- CDH component installation
- Apache Flume / Installing Apache Flume
- Apache Sqoop / Installing Apache Sqoop
- Apache Sqoop 2 / Installing Apache Sqoop 2
- Apache Pig / Installing Apache Pig
- Apache Hive / Installing Apache Hive
- Apache Oozie / Installing Apache Oozie
- Apache ZooKeeper / Installing Apache ZooKeeper
- CDH installation
- about / Installing CDH
- Hadoop services, stopping / Stopping Hadoop services
- YARN cluster / Understanding a YARN cluster
- cell / Apache HBase
- channel / Apache Flume NG
- Charts Library tab, Clusters menu / Navigating the Clusters menu
- Charts menu, Cloudera Manager Web console / Understanding the Charts menu
- checkHealth option / Configuring HDFS high availability by shared storage using NFS
- Cloudera
- about / Introducing Cloudera
- Cloudera Hue
- about / Cloudera Hue
- home / Cloudera Hue
- Query section / Cloudera Hue
- Hadoop section / Cloudera Hue
- Workflow section / Cloudera Hue
- Beeswax Hive UI / Beeswax – Hive UI
- Cloudera Impala UI / Cloudera Impala UI
- Pig UI / Pig UI
- File Browser / File Browser
- Metastore Manager / Metastore Manager
- Sqoop Jobs / Sqoop Jobs
- Job Browser / Job Browser
- Job Designs / Job Designs
- Dashboard / Dashboard
- Collection Manager / Collection Manager
- Hue Shell / Hue Shell
- HBase Browser / HBase Browser
- Cloudera Impala / Cloudera Impala
- Cloudera Manager / Introducing Cloudera
- about / Introducing Cloudera Manager
- features / Introducing Cloudera Manager
- editions / Introducing Cloudera Manager
- Standard Edition / Introducing Cloudera Manager
- architecture / Understanding the Cloudera Manager architecture
- installing, methods / Installing Cloudera Manager
- installing, machine configuration / Installing Cloudera Manager
- installing, configuration / Installing Cloudera Manager
- URL / Installing Cloudera Manager
- used, for HDFS HA configuration / Configuring High Availability using Cloudera Manager
- used, for Hadoop services configuration / Configuring Hadoop services using Cloudera Manager, Adding a service to the cluster, Removing a service from the cluster
- used, for host management / Managing hosts using Cloudera Manager, Adding a new host, Removing an existing host
- used, for managing multiple clusters / Managing multiple clusters with Cloudera Manager
- Hadoop cluster, rebalancing from / Rebalancing a Hadoop cluster from Cloudera Manager, Adding the Balancer service to the cluster, Rebalancing the cluster
- Hadoop services, monitoring from / Monitoring Hadoop services from Cloudera Manager
- used, for backup configuration / Configuring backups using Cloudera Manager
- Cloudera Manager Agent
- about / Understanding the Cloudera Manager architecture
- installing, on cluster / Installing Cloudera Manager
- Cloudera Manager architecture
- Cloudera Manager server / Understanding the Cloudera Manager architecture
- Cloudera Manager Agent / Understanding the Cloudera Manager architecture
- Cloudera Manager Server
- installing, on cluster / Installing Cloudera Manager
- Kerberos principal, configuring for / Configuring Kerberos principal for Cloudera Manager Server
- configuring, for Kerberos / Configuring the Cloudera Manager Server for Kerberos
- Cloudera Manager server
- Cloudera Manager Standard Edition / Introducing Cloudera Manager
- Cloudera Manager Web console
- about / Navigating the Cloudera Manager Web console
- Home screen, navigating / Navigating the Home screen
- Clusters menu, navigating / Navigating the Clusters menu
- Hosts menu / Exploring the Hosts menu
- Diagnostics menu / Understanding the Diagnostics menu
- Audits screen / Understanding the Audits screen
- Charts menu / Understanding the Charts menu
- Backup menu / Understanding the Backup menu
- Administration menu / Understanding the Administration menu
- Cloudera Search / Cloudera Search
- cloud service
- using, for backup / Using cloud services for backups
- cluster / History of Apache Hadoop and its trends
- host, adding to / Adding a new host
- existing host, removing from / Removing an existing host
- Balancer service, adding to / Adding the Balancer service to the cluster
- rebalancing / Rebalancing the cluster
- Clusters menu, Cloudera Manager Web console
- navigating / Navigating the Clusters menu
- Status tab / Navigating the Clusters menu
- Instances tab / Navigating the Clusters menu
- Commands tab / Navigating the Clusters menu
- Configuration tab / Navigating the Clusters menu
- Audits tab / Navigating the Clusters menu
- Charts Library tab / Navigating the Clusters menu
- File Browser tab / Navigating the Clusters menu
- Cache Statistics tab / Navigating the Clusters menu
- Replications tab / Navigating the Clusters menu
- NameNode Web UI tab / Navigating the Clusters menu
- Cluster Summary section, jobtracker UI / Understanding the jobtracker UI
- collection / Apache Flume NG
- Collection Manager, Cloudera Hue / Collection Manager
- Commands tab, Clusters menu / Navigating the Clusters menu
- Completed Jobs section, jobtracker UI / Understanding the jobtracker UI
- Concerning Health status, for HDFS service
- reason, identifying / Monitoring Hadoop services from Cloudera Manager
- configuration, HDFS
- about / Configuring HDFS
- hdfs-site.xml file properties / Configuring HDFS
- configuration management service / Apache ZooKeeper
- Configuration tab, Clusters menu / Navigating the Clusters menu
- configuring
- MapReduce / Configuring MapReduce
- configuring, Jobtracker HA / Configuring jobtracker high availability
- copyFromLocal command / Commonly used HDFS commands
- copyToLocal command / Commonly used HDFS commands
- cp command / Commonly used HDFS commands
- Custom database, Hive Metastore / Adding a service to the cluster
D
- daemons, Apache Hadoop
- namenode / Namenode
- secondary namenode / Secondary namenode
- jobtracker / Jobtracker
- tasktracker / Tasktracker
- ResourceManager / ResourceManager
- NodeManager / NodeManager
- Dashboard, Cloudera Hue / Dashboard
- data
- storing / History of Apache Hadoop and its trends
- database / Understanding the Kerberos Architecture
- data locality / Jobtracker
- datanode daemon / Secondary namenode
- DataNode role, Cloudera Manager / Role management in Cloudera Manager
- adding, to host / Adding a DataNode role to a host
- dead state / Understanding the namenode UI
- deserialization / Apache Avro
- dfs.blocksize property / Configuring HDFS
- dfs.client.failover.proxy.provider.[NameserviceID] property / Configuring HDFS high availability by the Quorum-based storage
- dfs.ha.namenodes.[NameserviceID] property / Configuring HDFS high availability by the Quorum-based storage
- dfs.journalnode.edits.dir property / Configuring HDFS high availability by the Quorum-based storage
- dfs.namenode.checkpoint.dir property / Configuring HDFS Federation
- dfs.namenode.checkpoint.edits.dir property / Configuring HDFS Federation
- dfs.namenode.edits.dir property / Configuring HDFS Federation
- dfs.namenode.http-address.[NameserviceID].[name node ID] property / Configuring HDFS high availability by the Quorum-based storage
- dfs.namenode.http-address property / Configuring HDFS
- dfs.namenode.https-address property / Configuring HDFS Federation
- dfs.namenode.keytab.file property / Configuring HDFS Federation
- dfs.namenode.name.dir property / Configuring HDFS Federation
- dfs.namenode.rpc-address.[NameserviceID].[name node ID] property / Configuring HDFS high availability by the Quorum-based storage
- dfs.namenode.rpc-address property / Configuring HDFS Federation
- dfs.namenode.servicerpc-address property / Configuring HDFS
- dfs.namenode.shared.edits.dir property / Configuring HDFS high availability by the Quorum-based storage
- dfs.nameservices property / Configuring HDFS high availability by the Quorum-based storage
- dfs.nameservices property / Configuring HDFS Federation
- dfs.replication property / Configuring HDFS
- dfsadmin command / Commands to administer HDFS
- Diagnostics menu, Cloudera Manager Web console
- events / Understanding the Diagnostics menu
- logs / Understanding the Diagnostics menu
- Server Log / Understanding the Diagnostics menu
- differential backup / Types of backups
- DistCp
E
- Embedded database, Hive Metastore / Adding a service to the cluster
- events
- about / Understanding events and alerts
- viewing / Understanding events and alerts
- configuring, steps / Configuring events and alerts
- Events screen, Diagnostics menu / Understanding the Diagnostics menu
F
- Failed Jobs section, jobtracker UI / Understanding the jobtracker UI
- failover option / Configuring HDFS high availability by shared storage using NFS
- features, HDFS
- fault tolerance / Essentials of HDFS
- data streaming / Essentials of HDFS
- large data store / Essentials of HDFS
- portability / Essentials of HDFS
- easy interface / Essentials of HDFS
- file
- writing, in HDFS / Writing files in HDFS
- reading, in HDFS / Reading files in HDFS
- File Browser tab, Clusters menu / Navigating the Clusters menu
- fs.defaultFS property / Configuring HDFS high availability by the Quorum-based storage
- fs.permissions.umask-mode property / Configuring HDFS
- full backup / Types of backups
G
- getServiceState option / Configuring HDFS high availability by shared storage using NFS
- Google File System (GFS) / History of Apache Hadoop and its trends
H
- Hadoop administrator
- responsibilities / Responsibilities of a Hadoop administrator
- related operations / Responsibilities of a Hadoop administrator
- Hadoop cluster
- rebalancing, from Cloudera Manager / Rebalancing a Hadoop cluster from Cloudera Manager, Adding the Balancer service to the cluster, Rebalancing the cluster
- Hadoop clusters
- Hadoop daemons
- ports / Tasktracker
- Hadoop Distributed File System (HDFS) / History of Apache Hadoop and its trends
- Hadoop Process Definition Language (hPDL) / Apache Oozie
- Hadoop services
- monitoring, from Cloudera Manager / Monitoring Hadoop services from Cloudera Manager
- Hadoop services configuration, Cloudera Manager used
- service, adding to cluster / Adding a service to the cluster
- service, removing, from cluster / Removing a service from the cluster
- hard disks, storage media / Types of storage media for backups
- HBase Browser, Cloudera Hue / HBase Browser
- HDFS / Components of Apache Hadoop
- about / Essentials of HDFS
- features / Essentials of HDFS
- operation daemons / Essentials of HDFS
- configuring / Configuring HDFS
- file, writing / Writing files in HDFS
- file, reading / Reading files in HDFS
- rm -r command / Commonly used HDFS commands
- snapshot paths, enabling / Enabling snapshot paths in HDFS
- HDFS backups
- about / Understanding HDFS backups
- data sources, protecting / Understanding HDFS backups
- namenode metadata / Understanding HDFS backups
- Hive metastore / Understanding HDFS backups
- HBase RegionServer data / Understanding HDFS backups
- application configuration files / Understanding HDFS backups
- HDFS commands
- ls / Commonly used HDFS commands
- cat / Commonly used HDFS commands
- copyFromLocal / Commonly used HDFS commands
- copyToLocal / Commonly used HDFS commands
- cp / Commonly used HDFS commands
- mkdir / Commonly used HDFS commands
- mv / Commonly used HDFS commands
- rm / Commonly used HDFS commands
- setrep / Commonly used HDFS commands
- tail / Commonly used HDFS commands
- administration commands / Commands to administer HDFS
- HDFS Federation
- about / Implementing HDFS Federation
- configuring / Configuring HDFS Federation
- ViewFS, configuring for federated HDFS / Configuring ViewFS for a federated HDFS
- HDFS Federation configuration
- properties, using / Configuring HDFS Federation
- ViewFS, configuration / Configuring ViewFS for a federated HDFS
- HDFS HA
- about / Implementing HDFS High Availability
- configuring, CDH5 used / Implementing HDFS High Availability
- Quorum-based storage / The Quorum-based storage
- configuring, by Quorum-based storage / Configuring HDFS high availability by the Quorum-based storage
- setting up, hdfs-site.xml configuration file properties / Configuring HDFS high availability by the Quorum-based storage
- shared storage, using NFS / Shared storage using NFS
- automatic failover, configuring for / Configuring automatic failover for HDFS high availability
- configuring, Cloudera Manager used / Configuring High Availability using Cloudera Manager
- HDFS replication
- configuring / Configuring HDFS replication
- Hello, World program / Understanding the map phase
- High Availability (HA) / Namenode
- high availability (HA) / Configuring automatic failover for HDFS high availability
- Hive Gateway / Adding a service to the cluster
- Hive Metastore
- about / Adding a service to the cluster
- database types, Embedded / Adding a service to the cluster
- database types, Custom / Adding a service to the cluster
- HiveQL / Apache Hive
- Hive replication
- configuring / Configuring Hive replication
- Home screen, Cloudera Manager Web console
- Status tab / Navigating the Home screen
- All Health Issues tab / Navigating the Home screen
- All Configuration Issues tab / Navigating the Home screen
- All Recent Commands tab / Navigating the Home screen
- host
- DataNode role, adding to / Adding a DataNode role to a host
- TaskTracker role, adding to / Adding a TaskTracker role to a host
- managing, Cloudera Manager used / Managing hosts using Cloudera Manager
- adding, to cluster / Adding a new host
- removing, from cluster / Removing an existing host
- host management, Cloudera Manager
- host, adding / Adding a new host
- host, removing / Removing an existing host
- Hosts menu, Cloudera Manager Web console
- templates tab / Exploring the Hosts menu
- parcels tab / Exploring the Hosts menu
- hue user / Configuring Kerberos for Apache Hadoop
I
- incremental backup / Types of backups
- installation
- CDH components / Installing the CDH components, Installing Apache Hive, Installing Apache ZooKeeper
- Apache Flume / Installing Apache Flume
- Apache Sqoop 2 / Installing Apache Sqoop 2
- Apache Pig / Installing Apache Pig
- Apache Hive / Installing Apache Hive
- Apache Oozie / Installing Apache Oozie
- Apache ZooKeeper / Installing Apache ZooKeeper
- Kerberos / Introducing Kerberos, Installing Kerberos
- instance, principal / Understanding important Kerberos terms
- Instances tab, Clusters menu / Navigating the Clusters menu
- io.sort.factor property / Configuring MapReduce
- io.sort.mb property / Configuring MapReduce
J
- jar / Learning all about the MapReduce job flow
- Java Runtime Environment (JRE) / Configuring the KDC Server
- Job Browser, Cloudera Hue / Job Browser
- Job Designs, Cloudera Hue / Job Designs
- jobtracker daemon
- about / Jobtracker
- Jobtracker HA
- about / Jobtracker high availability
- configuring / Configuring jobtracker high availability
- automatic failover, configuring / Configuring automatic failover for jobtracker high availability
- JobTracker role, Cloudera Manager / Role management in Cloudera Manager
- jobtracker UI
- General Information section / Understanding the jobtracker UI
- Cluster Summary section / Understanding the jobtracker UI
- Scheduling Information section / Understanding the jobtracker UI
- Running Jobs section / Understanding the jobtracker UI
- Completed Jobs section / Understanding the jobtracker UI
- Failed Jobs section / Understanding the jobtracker UI
- Retired Jobs section / Understanding the jobtracker UI
- Local Logs section / Understanding the jobtracker UI
- JournalNode (JN) daemons / The Quorum-based storage
- JournalNodes / The Quorum-based storage
K
- KDC / Understanding the Kerberos Architecture
- KDC installation
- testing / Testing the KDC installation
- KDC Server
- configuring / Configuring the KDC Server
- kerberized / Understanding important Kerberos terms
- Kerberos
- about / Introducing Kerberos
- installing / Introducing Kerberos
- requirements / Introducing Kerberos
- architecture / Understanding the Kerberos Architecture
- Kerberos architecture
- about / Understanding the Kerberos Architecture
- authentication service component / Understanding the Kerberos Architecture
- database component / Understanding the Kerberos Architecture
- Ticket Granting Server component / Understanding the Kerberos Architecture
- user, authenticating / Authenticating a user
- secure file server, accessing / Accessing a secure file server
- kerberized / Understanding important Kerberos terms
- realm / Understanding important Kerberos terms
- principal / Understanding important Kerberos terms
- keys / Understanding important Kerberos terms
- keytab / Understanding important Kerberos terms
- Kerberos clients
- configuring / Configuring the Kerberos clients
- Kerberos configuration, for Apache Hadoop
- hdfs user / Configuring Kerberos for Apache Hadoop
- mapred user / Configuring Kerberos for Apache Hadoop
- yarn user / Configuring Kerberos for Apache Hadoop
- oozie user / Configuring Kerberos for Apache Hadoop
- hue user / Configuring Kerberos for Apache Hadoop
- Kerberos principal, configuring for Cloudera Manager Server / Configuring Kerberos principal for Cloudera Manager Server
- Cloudera Manager Server, configuring for Kerberos / Configuring the Cloudera Manager Server for Kerberos
- Kerberos installation
- KDC Server, configuring / Configuring the KDC Server
- KDC installation, testing / Testing the KDC installation
- clients, installing / Configuring the Kerberos clients
- Kerberos principal
- configuring, for Cloudera Manager Server / Configuring Kerberos principal for Cloudera Manager Server
- Kerberos screen, Administration menu / Understanding the Administration menu
- key pair / Understanding the map phase
- keys, Kerberos / Understanding important Kerberos terms
- keytab, Kerberos / Understanding important Kerberos terms
L
- Language screen, Administration menu / Understanding the Administration menu
- Leader election service / Apache ZooKeeper
- License screen, Administration menu / Understanding the Administration menu
- Lightweight Directory Access Protocol (LDAP) / Introducing Cloudera Manager, Understanding the Administration menu
- Local Logs section, jobtracker UI
- about / Understanding the jobtracker UI
- Job Tracker History / Understanding the jobtracker UI
- locking service / Apache ZooKeeper
- Logs screen, Diagnostics menu / Understanding the Diagnostics menu
- ls command / Commonly used HDFS commands
M
- mapred.child.java.opts property / Configuring MapReduce
- mapred.compress.map.output property / Configuring MapReduce
- mapred.job.reuse.jvm.num.tasks property / Configuring MapReduce
- mapred.job.tracker property / Configuring MapReduce
- mapred.map.output.compression.codec property / Configuring MapReduce
- mapred.map.tasks.speculative.execution property / Configuring MapReduce
- mapred.output.compression.codec property / Configuring MapReduce
- mapred.output.compression.type property / Configuring MapReduce
- mapred.output.compress property / Configuring MapReduce
- mapred.reduce.parallel.copies property / Configuring MapReduce
- mapred.reduce.slowstart.completed.maps property / Configuring MapReduce
- mapred.reduce.tasks.speculative.execution property / Configuring MapReduce
- mapred.reduce.tasks property / Configuring MapReduce
- mapred.submit.replication property / Configuring MapReduce
- mapred.userlog.retain.hours property / Configuring MapReduce
- MapReduce / History of Apache Hadoop and its trends, Components of Apache Hadoop
- about / Getting acquainted with MapReduce
- in Hadoop / Getting acquainted with MapReduce
- processing functions / Getting acquainted with MapReduce
- map phase / Understanding the map phase
- reduce phase / Understanding the reduce phase
- job flow / Learning all about the MapReduce job flow
- configuring / Configuring MapReduce
- jobtracker UI / Understanding the jobtracker UI
- MapReduce, Hadoop Version 1.x
- MapReduce user framework / Tasktracker
- MapReduce system / Tasktracker
- mapreduce.job.counters.max property / Configuring MapReduce
- MapReduce job flow
- MapReduce Version 1 (MRv1) / Installing CDH
- mapred user / Configuring Kerberos for Apache Hadoop
- Massachusetts Institute of Technology (MIT) / Introducing Kerberos
- Metastore Manager, Cloudera Hue
- using / Metastore Manager
- about / Metastore Manager
- mirror backup / Types of backups
- mkdir command / Commonly used HDFS commands
- multi-hop flows / Apache Flume NG
- multiple clusters
- managing, with Cloudera Manager / Managing multiple clusters with Cloudera Manager
- multiple servers (ensemble) / Apache ZooKeeper
- mv command / Commonly used HDFS commands
N
- Namenode-1 (NN1) / Implementing HDFS Federation
- Namenode-2 (NN2) / Implementing HDFS Federation
- namenode daemon
- Namenode Journal Status section, namenode UI
- NameNode Storage / Understanding the namenode UI
- NameNode role, Cloudera Manager / Role management in Cloudera Manager
- namenode UI
- about / Understanding the namenode UI
- overview section / Understanding the namenode UI
- summary section / Understanding the namenode UI
- Namenode Journal Status section / Understanding the namenode UI
- NameNode Storage section / Understanding the namenode UI
- NameNode Web UI tab, Clusters menu / Navigating the Clusters menu
- name service / Apache ZooKeeper
- NameserviceID / Configuring HDFS Federation
- NFS
- used, for shared storage / Shared storage using NFS
- used, for HDFS HA configuration by shared storage / Configuring HDFS high availability by shared storage using NFS
- NameNode Journal Status, for Quorum-based storage approach / NameNode Journal Status for Quorum-based storage approach
- NameNode Journal Status, for shared storage-based approach / NameNode Journal Status for the Shared Storage-based approach
- NodeManager daemon
- about / NodeManager
- Nutch Distributed File System (NDFS) / History of Apache Hadoop and its trends
O
- oozie user / Configuring Kerberos for Apache Hadoop
- optical storage, storage media
- compact discs (CD) / Types of storage media for backups
- digital video discs (DVD) / Types of storage media for backups
- Blu-ray Discs (BD) / Types of storage media for backups
P
- parcels tab, Hosts menu / Exploring the Hosts menu
- Peers screen, Administration menu / Understanding the Administration menu
- Pig Latin / Apache Pig
- Pig UI, Cloudera Hue / Pig UI
- primary, principal / Understanding important Kerberos terms
- principal, Kerberos
- primary / Understanding important Kerberos terms
- instance / Understanding important Kerberos terms
- realm / Understanding important Kerberos terms
Q
- Quorum-based storage
- about / The Quorum-based storage
- HDFS HA, configuring by / Configuring HDFS high availability by the Quorum-based storage
- Quorum Journal Manager (QJM) / The Quorum-based storage
R
- realm, Kerberos / Understanding important Kerberos terms
- realm, principal / Understanding important Kerberos terms
- reduce phase / Understanding the reduce phase
- RegionServer / Apache HBase
- replication / Configuring HDFS replication
- Replications tab, Clusters menu / Navigating the Clusters menu
- ResourceManager daemon / ResourceManager
- Retired Jobs section, jobtracker UI / Understanding the jobtracker UI
- rm -r command / Commonly used HDFS commands
- rm command / Commonly used HDFS commands
- role instance
- adding, to host / Adding a role instance to a host
- roles, Cloudera Manager
- about / Role management in Cloudera Manager
- balancer / Role management in Cloudera Manager
- DataNode / Role management in Cloudera Manager
- NameNode / Role management in Cloudera Manager
- Secondary NameNode / Role management in Cloudera Manager
- job tracker / Role management in Cloudera Manager
- Task Tracker / Role management in Cloudera Manager
- Running Jobs section
- general information / Getting MapReduce job information
- Map and reduce progress information / Getting MapReduce job information
- Counter information / Getting MapReduce job information
- Map and reduce completion graphs / Getting MapReduce job information
- Running Jobs section, jobtracker UI / Understanding the jobtracker UI
S
- Scheduling Information section, jobtracker UI / Understanding the jobtracker UI
- secondary namenode
- about / Secondary namenode
- steps / Secondary namenode
- SecondaryNameNode role, Cloudera Manager / Role management in Cloudera Manager
- secondary namenode UI
- about / Understanding the secondary namenode UI
- Name Node Address / Understanding the secondary namenode UI
- Start Time / Understanding the secondary namenode UI
- Last Checkpoint Time / Understanding the secondary namenode UI
- Checkpoint Period / Understanding the secondary namenode UI
- Checkpoint Size / Understanding the secondary namenode UI
- Checkpoint Dirs / Understanding the secondary namenode UI
- Checkpoint Edit Dirs / Understanding the secondary namenode UI
- secure file server
- accessing / Accessing a secure file server
- Security-Enhanced Linux (SELinux) / Installing Cloudera Manager
- security.client.datanode.protocol.acl property / Configuring access control lists in Hadoop
- security.client.protocol.acl property / Configuring access control lists in Hadoop
- security.datanode.protocol.acl property / Configuring access control lists in Hadoop
- security.ha.service.protocol.acl property / Configuring access control lists in Hadoop
- security.namenode.protocol.acl property / Configuring access control lists in Hadoop
- security.refresh.policy.protocol.acl property / Configuring access control lists in Hadoop
- Security Assertion Markup Language (SAML) / Understanding the Administration menu
- serialization / Apache Avro
- Server Log screen, Diagnostics menu / Understanding the Diagnostics menu
- Service Level Agreement (SLA) / Implementing HDFS High Availability
- setrep command / Commonly used HDFS commands
- Settings screen
- Performance / Understanding the Administration menu
- Advanced / Understanding the Administration menu
- Thresholds / Understanding the Administration menu
- Security / Understanding the Administration menu
- Ports and Addresses / Understanding the Administration menu
- Other / Understanding the Administration menu
- Support / Understanding the Administration menu
- External Authentication / Understanding the Administration menu
- Parcels / Understanding the Administration menu
- Network / Understanding the Administration menu
- Custom Service Descriptors / Understanding the Administration menu
- settings screen, Administration menu / Understanding the Administration menu
- shuffle and sort phase / Understanding the reduce phase
- single point of failure (SPOF) / Implementing HDFS High Availability
- sink / Apache Flume NG
- slots / Learning all about the MapReduce job flow
- Snappy
- snapshot policy
- configuring / Configuring a snapshot policy
- snapshots configuration
- snapshot paths, enabling in HDFS / Enabling snapshot paths in HDFS
- snapshot policy, configuring / Configuring a snapshot policy
- SNMP (Simple Network Management Protocol) / Introducing Cloudera Manager
- Sqoop Jobs, Cloudera Hue / Sqoop Jobs
- Status tab, Clusters menu / Navigating the Clusters menu
- Status tab, Home screen / Navigating the Home screen
- storage media, types
- hard disks / Types of storage media for backups
- optical storage / Types of storage media for backups
- tape drives / Types of storage media for backups
- summary section, namenode UI
- Configured Capacity / Understanding the namenode UI
- DFS Used / Understanding the namenode UI
- Non DFS Used / Understanding the namenode UI
- DFS Remaining / Understanding the namenode UI
- DFS Used% / Understanding the namenode UI
- DFS Remaining% / Understanding the namenode UI
- Block Pool Used / Understanding the namenode UI
- Block Pool Used% / Understanding the namenode UI
- DataNodes usages% (Min, Median, Max, stdDev) / Understanding the namenode UI
- Live Nodes / Understanding the namenode UI
- Dead Nodes / Understanding the namenode UI
- Decommissioning Nodes / Understanding the namenode UI
- Number of Under-Replicated Blocks / Understanding the namenode UI
- synchronization service / Apache ZooKeeper
T
- $target_address variable / Configuring HDFS high availability by the Quorum-based storage
- $target_host variable / Configuring HDFS high availability by the Quorum-based storage
- $target_namenodeid variable / Configuring HDFS high availability by the Quorum-based storage
- $target_nameserviceid variable / Configuring HDFS high availability by the Quorum-based storage
- $target_port variable / Configuring HDFS high availability by the Quorum-based storage
- tail command / Commonly used HDFS commands
- tape drives, storage media / Types of storage media for backups
- tasktracker daemon
- about / Tasktracker
- TaskTracker role, Cloudera Manager
- about / Role management in Cloudera Manager
- adding, to host / Adding a TaskTracker role to a host
- templates tab, Hosts menu / Exploring the Hosts menu
- Ticket Granting Server / Understanding the Kerberos Architecture
- Ticket Granting Ticket (TGT) / Authenticating a user
- transitionToActive option / Configuring HDFS high availability by shared storage using NFS
- transitionToStandby option / Configuring HDFS high availability by shared storage using NFS
- types, backup
- full backup / Types of backups
- incremental backup / Types of backups
- differential backup / Types of backups
- mirror backup / Types of backups
U
- URI / Exploring HDFS commands
- used, for HDFS HA configuration by shared storage
- NameNode Journal Status, for shared storage-based approach / NameNode Journal Status for the Shared Storage-based approach
- user
- authenticating / Authenticating a user
- Users screen, Administration menu / Understanding the Administration menu
V
- value pair / Understanding the map phase
- ViewFS
- about / Configuring ViewFS for a federated HDFS
- configuring, for federated HDFS / Configuring ViewFS for a federated HDFS
W
- workflow / Apache Oozie
Y
- YARN
- about / Understanding the Apache Hadoop daemons, Tasktracker
- job submission / Job submission in YARN
- YARN cluster
- daemons / Understanding a YARN cluster
- about / Understanding a YARN cluster
- yarn user / Configuring Kerberos for Apache Hadoop
Z
- ZKFC component / Configuring automatic failover for HDFS high availability
- ZKFC service
- operations, health monitoring / Configuring automatic failover for HDFS high availability
- operations, ZooKeeper session management / Configuring automatic failover for HDFS high availability
- operations, ZooKeeper-based election / Configuring automatic failover for HDFS high availability
- znode (ZooKeeper node) / Apache ZooKeeper
- ZooKeeper
- configuring, for automatic failover / Configuring automatic failover for HDFS high availability
- ZooKeeper service
- operations, failure detection / Configuring automatic failover for HDFS high availability
- operations, Active NameNode elections / Configuring automatic failover for HDFS high availability