Teradata Cookbook

By: Abhinav Khandelwal, Viswanath Kasi, Rajsekhar Bhamidipati

Overview of this book

Teradata is an enterprise software company that develops and sells its eponymous relational database management system (RDBMS), which is considered a leading data warehousing solution and provides data management for analytics. This book gives you the practical information you need to create and implement a data warehousing solution using Teradata. It begins with recipes for quickly setting up a development environment so you can work with different types of data structuring and manipulation functions. You will tackle problems related to efficient querying, stored procedures, searching, and navigation techniques. Additionally, you'll master administrative tasks such as user and security management, workload management, high availability, performance tuning, and monitoring. The book takes you through the best practices for the daily tasks of a Teradata DBA and helps you tackle problems you might encounter along the way.

Introduction


Statistics, by definition, is the collection, organization, analysis, interpretation, and presentation of data.

Source: Wikipedia

The mathematical study of the theoretical nature of such distributions and tests.

Source: Dictionary

In Teradata, the COLLECT STATISTICS statement gathers and stores demographic data for one or more columns or indexes of a table or join index.
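As a minimal sketch of what this looks like in practice, the statements below collect statistics on a single column, on a multi-column combination, and on an index, and then list what has been collected. The database, table, and column names (sales_db.orders, order_id, customer_id, order_date) are hypothetical, and the options available depend on the Teradata release in use.

```sql
-- Hypothetical table: sales_db.orders (order_id, customer_id, order_date, ...)

-- Collect statistics on a single column.
COLLECT STATISTICS COLUMN (order_date) ON sales_db.orders;

-- Collect statistics on a combination of columns that are often used together.
COLLECT STATISTICS COLUMN (customer_id, order_date) ON sales_db.orders;

-- Collect statistics on an index (assumes an index is defined on order_id).
COLLECT STATISTICS INDEX (order_id) ON sales_db.orders;

-- List the statistics that currently exist on the table.
HELP STATISTICS sales_db.orders;
```

The HELP STATISTICS output is a quick way to check when each statistic was last collected, which helps when deciding what needs refreshing.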

Statistics support analysis by aggregating data. They turn raw data into useful information so that actions can be taken or predictions can be made.

The same goes for Teradata, where the optimizer uses statistics to develop query execution plans.

The optimizer generates several candidate plans and chooses the one with the lowest estimated cost. The estimates that we see in explain plans are derived from the data demographics of the table, which are gathered when statistics are collected.
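One way to look at these estimates is to prefix a query with EXPLAIN, which returns the chosen plan as text without running the query. The sketch below assumes a hypothetical sales_db.orders table with customer_id and order_date columns. The plan text and the row estimates will differ from system to system and will change depending on which statistics exist.

```sql
-- Optional: ask the optimizer to append statistics recommendations
-- to the EXPLAIN output for the rest of this session.
DIAGNOSTIC HELPSTATS ON FOR SESSION;

-- Show the plan with its row and time estimates; the query is not executed.
EXPLAIN
SELECT customer_id, COUNT(*) AS order_count
FROM sales_db.orders            -- hypothetical table
WHERE order_date >= DATE '2018-01-01'
GROUP BY customer_id;
```

Each step of the plan text carries a confidence phrase such as "with high confidence" or "with no confidence". Comparing the output before and after collecting statistics on the filtered column is a simple way to see how statistics affect the estimates.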

Stats collection can be a resource-intensive operation if large tables are involved; hence, it needs to be scheduled...