Teradata Cookbook

By: Abhinav Khandelwal, Viswanath Kasi, Rajsekhar Bhamidipati

Overview of this book

Teradata is an enterprise software company that develops and sells its eponymous relational database management system (RDBMS), which is considered a leading data warehousing solution and provides data management for analytics. This book will help you get all the practical information you need to create and implement your data warehousing solution using Teradata. The book begins with recipes on quickly setting up a development environment so you can work with different types of data structuring and manipulation functions. You will tackle problems related to efficient querying, stored procedure searching, and navigation techniques. Additionally, you'll master various administrative tasks such as user and security management, workload management, high availability, performance tuning, and monitoring. This book is designed to take you through the best practices for performing the real daily tasks of a Teradata DBA, and will help you tackle any problem you might encounter in the process.

Identifying skewness in joins


Skewness is the system killer. The magic of Teradata is in its parallelism, which distributes the work and data across many processing elements; this magic can turn into mush if the work or data is distributed unevenly or disproportionately. Skew occurs when one or more of the Access Module Processors (AMPs) receive a larger-than-average share of the work.

We need to understand that a perfectly even distribution is rarely achievable for a single query. It is recommended not to consider an operation skewed until the portion consumed by the hot AMP exceeds four to five times the average.
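The rule of thumb above can be sketched as a small check. This is a minimal illustration in Python, not Teradata tooling: the function names, the sample per-AMP figures, and the choice of 4.0 (the lower end of the "four to five times" range) as the default threshold are all assumptions made for the example.

```python
from statistics import mean


def hot_amp_ratio(per_amp_work):
    """Return the hot AMP's share relative to the average (max / mean).

    A value of 1.0 means the work is spread perfectly evenly.
    `per_amp_work` is a hypothetical list holding one number per AMP,
    for example CPU seconds or row counts.
    """
    avg = mean(per_amp_work)
    if avg == 0:
        return 1.0  # no work at all, so nothing is skewed
    return max(per_amp_work) / avg


def is_skewed(per_amp_work, threshold=4.0):
    """Flag the operation only when the hot AMP exceeds `threshold` x average.

    The default of 4.0 is an assumption taken from the lower end of the
    "four to five times the average" guideline.
    """
    return hot_amp_ratio(per_amp_work) > threshold


# Made-up sample figures for an 8-AMP system.
nearly_even = [100, 98, 103, 99, 101, 100, 97, 102]
mildly_uneven = [100, 100, 100, 100, 100, 100, 100, 200]
one_hot_amp = [20, 20, 20, 20, 20, 20, 20, 860]

for name, work in [("nearly_even", nearly_even),
                   ("mildly_uneven", mildly_uneven),
                   ("one_hot_amp", one_hot_amp)]:
    print(name, round(hot_amp_ratio(work), 2), is_skewed(work))
```

Note that the mildly uneven case is not flagged: one AMP doing twice the work of the others still sits well below the four-times guideline.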

Whatever its kind, skew on a system degrades parallelism. When skew occurs in a query, it slows down join processing, so the join does not run at full efficiency, and the query in turn consumes more CPU and runtime.
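To see why uneven work costs runtime, consider a simplified model in which a parallel step finishes only when its slowest AMP finishes. That model, the function names, and the timings below are assumptions for illustration, not measurements from a Teradata system.

```python
def step_elapsed(per_amp_seconds):
    """Elapsed time of a parallel step under the assumed model:
    the step is done only when the busiest AMP is done."""
    return max(per_amp_seconds)


def parallel_efficiency(per_amp_seconds):
    """Fraction of the available AMP time that did useful work.

    1.0 means every AMP was busy for the whole step; lower values
    mean AMPs sat idle waiting for the hot AMP.
    """
    slowest = max(per_amp_seconds)
    if slowest == 0:
        return 1.0
    return sum(per_amp_seconds) / (len(per_amp_seconds) * slowest)


# The same 40 seconds of total work on a hypothetical 4-AMP system.
balanced = [10.0, 10.0, 10.0, 10.0]
skewed = [2.0, 2.0, 2.0, 34.0]

print(step_elapsed(balanced), parallel_efficiency(balanced))
print(step_elapsed(skewed), parallel_efficiency(skewed))
```

Both workloads total 40 seconds, but in this model the skewed one takes 34 seconds of elapsed time instead of 10, because three AMPs spend most of the step idle.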

The distribution of rows directly affects the benefits of parallelism. The more uniform the distribution...