Teradata Cookbook

By: Abhinav Khandelwal, Viswanath Kasi, Rajsekhar Bhamidipati

Overview of this book

Teradata is an enterprise software company that develops and sells its eponymous relational database management system (RDBMS), which is considered a leading data warehousing solution and provides data management for analytics. This book will help you get all the practical information you need to create and implement your data warehousing solution using Teradata. The book begins with recipes on quickly setting up a development environment so you can work with different types of data structuring and manipulation functions. You will tackle problems related to efficient querying, stored procedure searching, and navigation techniques. Additionally, you’ll master various administrative tasks such as user and security management, workload management, high availability, performance tuning, and monitoring. This book is designed to take you through the best practices for performing the real daily tasks of a Teradata DBA, and will help you tackle any problem you might encounter in the process.

Resolving skewing data


In statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. In Teradata, skew means imbalanced processing caused by uneven distribution of data or work across the AMPs (Access Module Processors). A highly skewed table has many more rows on some AMPs than on others; that is, the data is not evenly distributed. Skew can show up as data skew, CPU skew, or I/O skew.
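
To make the idea of data skew concrete, here is a minimal Python sketch that turns per-AMP row counts into a single percentage. The formula (1 - average / maximum) * 100 is a commonly used convention for a skew factor and is an assumption here, not something this recipe defines; the function name and the sample counts are hypothetical.

```python
def skew_factor(rows_per_amp):
    """Return data skew as a percentage.

    0 means every AMP holds the same number of rows; values close to 100
    mean one AMP holds almost everything. The formula is a common
    convention (an assumption, not taken from the recipe).
    """
    if not rows_per_amp:
        raise ValueError("rows_per_amp must contain at least one AMP")
    peak = max(rows_per_amp)
    if peak == 0:
        # An empty table has nothing to be skewed.
        return 0.0
    average = sum(rows_per_amp) / len(rows_per_amp)
    return (1 - average / peak) * 100


if __name__ == "__main__":
    # Hypothetical row counts for a four-AMP system.
    print(skew_factor([100, 100, 100, 100]))  # evenly distributed
    print(skew_factor([400, 0, 0, 0]))        # everything on one AMP
```

With four AMPs, an even spread gives a skew factor of 0, while placing all 400 rows on a single AMP gives 75, because the busiest AMP holds four times the average.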

Shared Nothing architecture – dividing the work

The shared nothing architecture ensures that each virtual processor is responsible for the storage and retrieval of its own unique data. Data is physically stored together on the node, but each virtual processor works only on its own portion, which is what provides parallelism. This is also the basis of Teradata's scalability. Each AMP owns an equal slice of the disk.

We will first see how to detect skew in a table that has a poorly chosen primary index (PI) and how to resolve it. Then we will look at how skew can be resolved in joins.
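
Why a bad PI causes skew can be illustrated with a small simulation. The sketch below is only a toy model under stated assumptions: it uses SHA-256 as a stand-in for Teradata's row hash (the real hashing algorithm is different), and the AMP count, the table size, and the column names order_id and status are hypothetical. It shows that a unique PI spreads rows across all AMPs, while a PI with only two distinct values can use at most two AMPs.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size


def amp_for(pi_value, num_amps=NUM_AMPS):
    """Map a PI value to an AMP number.

    SHA-256 is only a stand-in for the database's own row hash; the point
    is that equal PI values always land on the same AMP.
    """
    digest = hashlib.sha256(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(pi_values, num_amps=NUM_AMPS):
    """Count how many rows each AMP would own for the given PI values."""
    counts = Counter(amp_for(value, num_amps) for value in pi_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# Hypothetical table with 10,000 rows.
order_ids = range(10_000)  # unique values: a good PI candidate
statuses = ["OPEN" if i % 100 else "CLOSED" for i in range(10_000)]  # 2 distinct values

good_counts = rows_per_amp(order_ids)
bad_counts = rows_per_amp(statuses)

if __name__ == "__main__":
    print("PI on order_id:", good_counts)
    print("PI on status:  ", bad_counts)
```

In this model, all 9,900 OPEN rows hash to the same AMP, so that AMP does most of the work while the others sit nearly idle. Choosing a PI with many distinct, evenly occurring values is what lets every AMP own a similar share of the rows.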

Getting ready

Let's connect to our local database instance and create a table...