Serverless Analytics with Amazon Athena
In Chapter 1, Your First Query, we used TABLESAMPLE to get familiar with our data by viewing an evenly distributed sample of rows from across the entire table. TABLESAMPLE lets you approximate the results of any query by sampling the underlying data. Athena also supports more targeted forms of approximation that offer bounded error. For example, the approx_distinct function should produce results with a standard error of 2.3%, yet it completes 97% faster and uses less peak memory than its exact counterpart, COUNT(DISTINCT x). We'll learn more about these and several other approximate query tools by exploring our NYC taxi ride tables.
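To make the idea of approximating a query by sampling its input more concrete, here is a minimal Python sketch. It is not Athena's implementation; it simulates the general technique of Bernoulli row sampling, in which each row is kept independently with a fixed probability. The `rides` list, the `bernoulli_sample` helper, and the 10% sampling rate are all hypothetical and chosen only for illustration.

```python
import random

def bernoulli_sample(rows, percent, seed=0):
    """Keep each row independently with probability percent/100.

    Hypothetical helper: mimics the idea behind row-level sampling,
    not Athena's actual execution.
    """
    rng = random.Random(seed)
    p = percent / 100.0
    return [row for row in rows if rng.random() < p]

# Hypothetical stand-in for a taxi ride table: (ride_id, fare) pairs.
data_rng = random.Random(42)
rides = [(i, round(data_rng.uniform(3.0, 60.0), 2)) for i in range(100_000)]

SAMPLE_PERCENT = 10  # assumed sampling rate for this illustration
sample = bernoulli_sample(rides, SAMPLE_PERCENT)

# Averages can be computed directly on the sample.
exact_avg = sum(fare for _, fare in rides) / len(rides)
approx_avg = sum(fare for _, fare in sample) / len(sample)

# Counts and sums must be scaled back up by 100 / SAMPLE_PERCENT.
exact_count = len(rides)
approx_count = len(sample) * 100 / SAMPLE_PERCENT

print(f"average fare: exact={exact_avg:.2f} approx={approx_avg:.2f}")
print(f"row count:    exact={exact_count} approx={approx_count:.0f}")
```

The sketch reads only about a tenth of the rows, yet the sampled average should land close to the exact one. Note that aggregates such as counts and sums need to be scaled by the inverse of the sampling rate, whereas averages do not.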
TABLESAMPLE is a somewhat generic technique for running approximate queries. Unlike the other methods we discuss in this section, TABLESAMPLE works by sampling the input data. This allows you to use it in conjunction with any other SQL features supported...