Index

A

Abstract syntax tree (AST)
- about / The EXPLAIN statement
ACLs
- on HDFS, URL / Storage-based mode
Advanced Encryption Standard (AES)
- URL / Encryption
aggregate functions / Operators and functions
aggregation
- data aggregation / Basic aggregation – GROUP BY
- without GROUP BY columns / Basic aggregation – GROUP BY
- with GROUP BY columns / Basic aggregation – GROUP BY
- advanced / Advanced aggregation – GROUPING SETS, Advanced aggregation – ROLLUP and CUBE
- ROLLUP statement / Advanced aggregation – ROLLUP and CUBE
- CUBE statement / Advanced aggregation – ROLLUP and CUBE
- condition, HAVING statement / Aggregation condition – HAVING
Amazon EMR
- URL / Starting Hive in the cloud
analytic functions
- about / Analytic functions
- Function(arg1, ..., argn) / Analytic functions
- Standard aggregations / Analytic functions
- RANK / Analytic functions
- DENSE_RANK / Analytic functions
- ROW_NUMBER / Analytic functions
- CUME_DIST / Analytic functions
- PERCENT_RANK / Analytic functions
- NTILE / Analytic functions
- LEAD function / Analytic functions
- LAG function / Analytic functions
- FIRST_VALUE / Analytic functions
- LAST_VALUE / Analytic functions
- window expressions / Analytic functions
ANALYZE statement
- about / The ANALYZE statement
ANTLR
- URL / The EXPLAIN statement
Apache
- used, for installing Hive / Installing Hive from Apache
Apache Hive
- Wiki, URL / Using the Hive command line and Beeline
Apache Hive Wiki
- URL / HBase
Apache JIRA Hive-365
- URL / Understanding Hive data types
Atomicity, Consistency, Isolation, and Durability (ACID)
- about / Transactions
authentication
- about / Authentication
- Metastore server authentication / Metastore server authentication
- HiveServer2 authentication / HiveServer2 authentication
authorization
- about / Authorization
- legacy mode / Legacy mode
- storage-based mode / Storage-based mode
- SQL standard-based mode / SQL standard-based mode
Avro
- URL / SerDe
AvroSerDe / SerDe
Azure HDInsight Service
- URL / Starting Hive in the cloud

B

batch processing
- about / Batch, real-time, and stream processing
Beeline
- using / Using the Hive command line and Beeline
- URL / Using the Hive command line and Beeline
- command-line syntax / Using the Hive command line and Beeline
big data
- about / Introducing big data
- volume / Introducing big data
- velocity / Introducing big data
- variety / Introducing big data
- veracity / Introducing big data
- variability / Introducing big data
- volatility / Introducing big data
- visualization / Introducing big data
- value / Introducing big data
block sampling / Sampling
bucket map join / Bucket map join
buckets
- about / Hive buckets
- number / Hive buckets
bucket tables
- about / Bucket tables
bucket table sampling / Sampling

C

cloud
- Hive, starting / Starting Hive in the cloud
Cloudera
- URL / Starting Hive in the cloud
- about / JDBC / ODBC connector
Cloudera Distributed Hadoop (CDH)
- URL / Installing Hive from vendor packages
CLUSTER BY / ORDER and SORT
collection functions / Operators and functions
collection item delimiter / Understanding Hive data types
ColumnarSerDe / SerDe
CombineFileInputFormat / Storage optimization
common join, join optimization / Common join
Common Table Expression (CTE) / Hive internal and external tables
compression / Compression
conditional functions / Operators and functions
Cost-Based Optimizer (CBO)
- about / The ANALYZE statement
Cost-Based Optimizer (CBO) / Hive roadmap
CREATE TABLE / Hive internal and external tables
Create the table as select (CTAS) / Hive internal and external tables
CROSS JOIN statement / The OUTER JOIN and CROSS JOIN statements
CUBE statement
- about / Advanced aggregation – ROLLUP and CUBE

D

data aggregation
- about / Basic aggregation – GROUP BY
database, Hive
- about / Hive database
data exchange
- LOAD keyword / Data exchange – LOAD
- INSERT keyword / Data exchange – INSERT
- EXPORT statement / Data exchange – EXPORT and IMPORT
- IMPORT statement / Data exchange – EXPORT and IMPORT
data file optimization
- about / Data file optimization
- file format / File format
- compression / Compression
- storage optimization / Storage optimization
data type conversions
- about / Data type conversions
- primitive type conversion / Data type conversions
- explicit type conversion / Data type conversions
data type functions tips, complex / Operators and functions
data types, Hive
- about / Understanding Hive data types
- TINYINT / Understanding Hive data types
- SMALLINT / Understanding Hive data types
- INT / Understanding Hive data types
- BIGINT / Understanding Hive data types
- FLOAT / Understanding Hive data types
- DOUBLE / Understanding Hive data types
- DECIMAL / Understanding Hive data types
- BINARY / Understanding Hive data types
- BOOLEAN / Understanding Hive data types
- STRING / Understanding Hive data types
- CHAR / Understanding Hive data types
- VARCHAR / Understanding Hive data types
- DATE / Understanding Hive data types
- TIMESTAMP / Understanding Hive data types
date functions / Operators and functions
date function tips / Operators and functions
delimiters
- row delimiter / Understanding Hive data types
- collection item delimiter / Understanding Hive data types
- map key delimiter / Understanding Hive data types
deployment / Development and deployment
Derby
- URL / Installing Hive from Apache
design optimization
- about / Design optimization
- partition tables / Partition tables
- bucket tables / Bucket tables
- index / Index
development / Development and deployment
Directed Acyclic Graph (DAG) / Oozie
directed acyclic graphs (DAGs) / Index
DISTRIBUTE BY / ORDER and SORT

E

encryption
- about / Encryption
EXPLAIN statement
- about / The EXPLAIN statement
- EXTENDED keyword / The EXPLAIN statement
- DEPENDENCY keyword / The EXPLAIN statement
- AUTHORIZATION keyword / The EXPLAIN statement
explicit type conversion / Data type conversions
EXPORT statement / Data exchange – EXPORT and IMPORT
external tables
- about / Hive internal and external tables

F

file format, data file optimization
- about / File format
- TEXTFILE / File format
- SEQUENCEFILE / File format
- RCFILE / File format
- Optimized Row Columnar (ORC) / File format
- PARQUET / File format
Flume / Overview of the Hadoop ecosystem
functions
- about / Operators and functions
- mathematical functions / Operators and functions
- collection functions / Operators and functions
- type conversion functions / Operators and functions
- date functions / Operators and functions
- conditional functions / Operators and functions
- string functions / Operators and functions
- aggregate functions / Operators and functions
- table-generating functions / Operators and functions
- customized / Operators and functions
- complex data type functions tips / Operators and functions
- date function tips / Operators and functions
- CASE, for datatypes / Operators and functions
- parser and search tips / Operators and functions
- virtual columns / Operators and functions

G

GenericUDAF
- URL / The UDAF code template
GROUPING SETS keyword
- about / Advanced aggregation – GROUPING SETS

H

Hadoop
- versus relational database / Relational and NoSQL database versus Hadoop
- versus NoSQL database / Relational and NoSQL database versus Hadoop
Hadoop Archive
- and HAR / Storage optimization
Hadoop Archive File (HAR) / File format
Hadoop ecosystem
- about / Overview of the Hadoop ecosystem
HAVING statement
- about / Aggregation condition – HAVING
HBase
- about / HBase
- URL / HBase
- table, creating in HQL / HBase
HBaseSerDe / SerDe
HCatalog
- about / HCatalog
- URL / HCatalog
HDFS
- about / Batch, real-time, and stream processing, Overview of the Hadoop ecosystem
HDFS federation / Storage optimization
Hive
- about / Hive overview
- installing, from Apache / Installing Hive from Apache
- URL / Installing Hive from Apache
- installing, from vendor packages / Installing Hive from vendor packages
- starting, in cloud / Starting Hive in the cloud
- data types / Understanding Hive data types
- complex types / Understanding Hive data types
- types / Understanding Hive data types
- database / Hive database
- internal tables / Hive internal and external tables
- external tables / Hive internal and external tables
- partitions / Hive partitions
- buckets / Hive buckets
- views / Hive views
- performance utilities / Performance utilities
Hive, complex types
- ARRAY / Understanding Hive data types
- MAP / Understanding Hive data types
- STRUCT / Understanding Hive data types
- NAMED STRUCT / Understanding Hive data types
- UNION / Understanding Hive data types
Hive-integrated development environment (IDE)
- about / The Hive-integrated development environment
hive.map.aggr property / Basic aggregation – GROUP BY
Hive CLI
- command-line syntax / Using the Hive command line and Beeline
- URL / Using the Hive command line and Beeline
Hive command line
- using / Using the Hive command line and Beeline
Hive Data Definition Language (DDL)
- about / Hive Data Definition Language
Hive join optimization
- URL / Skew join
Hive roadmap
- about / Hive roadmap
HiveServer2
- URL / Using the Hive command line and Beeline
HiveServer2 authentication
- none authentication / HiveServer2 authentication
- Kerberos authentication / HiveServer2 authentication
- LDAP authentication / HiveServer2 authentication
- pluggable custom authentication / HiveServer2 authentication
- Pluggable Authentication Modules (PAM) authentication / HiveServer2 authentication
Hive Wiki
- URL / Operators and functions
Hortonworks
- URL / JDBC / ODBC connector
HQL
- about / Hive overview
Hue
- URL / The Hive-integrated development environment, Hue
- about / Hue

I

Impala
- URL / A short history
IMPORT statement / Data exchange – EXPORT and IMPORT
index
- about / Index
INNER JOIN statement / The INNER JOIN statement
INSERT keyword / Data exchange – INSERT
internal tables
- about / Hive internal and external tables

J

Java IDE
- URL / Development and deployment
Java Virtual Machine (JVM) / Batch, real-time, and stream processing
javax.script API
- URL / User-defined functions
JDBC/ODBC connector
- about / JDBC / ODBC connector
job and query optimization
- about / Job and query optimization
- local mode / Local mode
- JVM reuse / JVM reuse
- parallel execution / Parallel execution
join optimization
- about / Join optimization
- common join / Common join
- map join / Map join
- bucket map join / Bucket map join
- sort merge bucket (SMB) join / Sort merge bucket (SMB) join
- sort merge bucket map (SMBM) join / Sort merge bucket map (SMBM) join
- skew join / Skew join
JSONSerDe
- URL / SerDe
- about / SerDe
JVM reuse, job and query optimization / JVM reuse

K

Kerberos
- about / Authentication
Kerberos authentication / HiveServer2 authentication
Key Distribution Center (KDC) / Authentication

L

LazySimpleSerDe / SerDe
LDAP authentication / HiveServer2 authentication
legacy mode, authorization
- about / Legacy mode
Live Long And Process (LLAP) / Hive roadmap
LOAD keyword / Data exchange – LOAD
local mode, job and query optimization / Local mode

M

map join, join optimization / Map join
MAPJOIN statement / Special JOIN – MAPJOIN
map key delimiter / Understanding Hive data types
mathematical functions / Operators and functions
Maven
- URL / Development and deployment
metastore / Hive overview
Metastore server authentication
- about / Metastore server authentication
MIT Kerberos
- URL / Authentication
MySQL
- URL / Installing Hive from Apache

N

none authentication / HiveServer2 authentication
NoSQL database
- versus Hadoop / Relational and NoSQL database versus Hadoop

O

Oozie
- about / Oozie
- URL / Oozie
- control flow node / Oozie
- action node / Oozie
OpenCSVSerDe / SerDe
operators
- about / Operators and functions
Optimized Row Columnar (ORC) / Index, File format
Optimized Row Columnar (ORC) file
- about / Transactions
ORDER BY (ASC|DESC) keyword / ORDER and SORT
ORDER keyword / ORDER and SORT
OUTER JOIN statement / The OUTER JOIN and CROSS JOIN statements
Out Of Memory (OOM) exceptions / The INNER JOIN statement

P

parallel execution, job and query optimization / Parallel execution
ParquetHiveSerDe / SerDe
parser and search tips / Operators and functions
PARTITION BY statement / Analytic functions
partitions
- about / Hive partitions
partition tables
- by date and time / Partition tables
- by locations / Partition tables
- by business logic / Partition tables
personal identity information (PII)
- about / Encryption
Phoenix
- URL / HBase
Pluggable Authentication Modules (PAM) authentication / HiveServer2 authentication
pluggable custom authentication / HiveServer2 authentication
PostgreSQL
- URL / Installing Hive from Apache
Presto
- URL / A short history
primitive type conversion / Data type conversions
Processing Elements (PE) / Batch, real-time, and stream processing

R

random sampling
- URL / Sampling
real-time processing
- about / Batch, real-time, and stream processing
Record Columnar File (RCFILE) / File format
RegexSerDe / SerDe
relational database
- versus Hadoop / Relational and NoSQL database versus Hadoop
ROLLUP statement
- about / Advanced aggregation – ROLLUP and CUBE
row delimiter / Understanding Hive data types

S

sampling
- about / Sampling
- random sampling / Sampling
- bucket table sampling / Sampling
- block sampling / Sampling
SELECT * statement / The SELECT statement
SELECT statement / The SELECT statement
Sentry
- URL / SQL standard-based mode
SequenceFile format / Storage optimization
SerDe
- about / SerDe
- data, reading / SerDe
- data, writing / SerDe
- LazySimpleSerDe / SerDe
- ColumnarSerDe / SerDe
- RegexSerDe / SerDe
- HBaseSerDe / SerDe
- AvroSerDe / SerDe
- ParquetHiveSerDe / SerDe
- OpenCSVSerDe / SerDe
- JSONSerDe / SerDe
SHOW TRANSACTIONS command / Transactions
Simple Authentication and Security Layer (SASL) framework / Metastore server authentication
skew join / Skew join
SORT BY (ASC|DESC) keyword / ORDER and SORT
SORT keyword / ORDER and SORT
sort merge bucket (SMB) join / Sort merge bucket (SMB) join
sort merge bucket map (SMBM) join / Sort merge bucket map (SMBM) join
Spark / Overview of the Hadoop ecosystem
SQLLine
- URL / Using the Hive command line and Beeline
SQL standard-based mode, authorization
- about / SQL standard-based mode
Sqoop / Overview of the Hadoop ecosystem
stage dependencies
- about / The EXPLAIN statement
stage plans
- about / The EXPLAIN statement
storage-based mode, authorization
- about / Storage-based mode
storage optimization / Storage optimization
Storm
- URL / A short history, Batch, real-time, and stream processing
streaming
- about / Streaming
stream processing
- about / Batch, real-time, and stream processing
string functions / Operators and functions
Structured Query Language (SQL)
- about / A short history

T

table-generating functions / Operators and functions
Tez / Overview of the Hadoop ecosystem
- about / Index
- URL / Index
transactions
- about / Transactions
type conversion functions / Operators and functions

U

UDAF
- code, template / The UDAF code template
UDAFs
- about / User-defined functions
UDF
- code, template / The UDF code template
UDFs
- about / User-defined functions
UDTF
- code, template / The UDTF code template
UDTFs
- about / User-defined functions
Uniform Resource Identifier (URI) / Data exchange – LOAD
UNION ALL statement / Set operation – UNION ALL

V

value / Introducing big data
variability / Introducing big data
variety / Introducing big data
Vectorization optimization
- about / Index
- URL / Index
velocity / Introducing big data
vendor packages
- used, for installing Hive / Installing Hive from vendor packages
veracity / Introducing big data
views
- about / Hive views
- altering / Hive views
- redefining / Hive views
- dropping / Hive views
virtual columns / Operators and functions
visualization / Introducing big data
volatility / Introducing big data
volume / Introducing big data

W

WHERE clauses
- subqueries, restrictions / The SELECT statement
window expressions
- BETWEEN … AND clause / Analytic functions
- N PRECEDING or FOLLOWING / Analytic functions
- UNBOUNDED PRECEDING / Analytic functions
- UNBOUNDED FOLLOWING / Analytic functions
- UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING / Analytic functions
- CURRENT ROW / Analytic functions
- URL / Analytic functions

Y

YARN / Overview of the Hadoop ecosystem

Z

ZooKeeper
- about / ZooKeeper
- URL / ZooKeeper
- shared lock / ZooKeeper
- exclusive lock / ZooKeeper
- for Hive locks, URL / ZooKeeper

Apache Hive Essentials

By: Dayong Du
