Summary
In this chapter, you concluded your introduction to Athena by getting hands-on with the key features that will allow you to use Athena for many everyday analytics tasks. We practiced queries and techniques that add new data to our data lake, either in bulk via CTAS or incrementally through INSERT INTO. Our exercises also included experiments with approximate query techniques that improve our ability to find insights in our data. Features such as TABLESAMPLE or approx_percentile allow us to trade query accuracy for reduced cost or shorter runtimes. Cheaper and faster exploration queries enable us to consult the data more often. This leads to better decision-making, and to less reluctance to run a long or expensive query, because a shorter, approximate query has already proved its worth. This may be hard to imagine given that all the queries in this chapter took less than a minute to run and, in aggregate, cost less than USD 1. In practice, many fascinating queries can take hours...
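The accuracy-for-cost trade-off described above can be illustrated outside Athena with a small Python sketch. It is only an illustration under stated assumptions: the data is synthetic, the helper names bernoulli_sample and percentile are hypothetical, and the estimate comes from row sampling alone. It does not reproduce the algorithm Athena uses internally for approx_percentile.

```python
import math
import random


def bernoulli_sample(rows, percent, rng):
    """Keep each row independently with probability percent/100,
    in the spirit of Bernoulli-style table sampling (hypothetical helper)."""
    p = percent / 100.0
    return [r for r in rows if rng.random() < p]


def percentile(values, q):
    """Nearest-rank percentile for 0 < q <= 1 (hypothetical helper)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a percentile of an empty list")
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic stand-in for one numeric column of a large table (assumption).
rows = [rng.expovariate(1 / 12.0) for _ in range(200_000)]

# "Expensive" answer: scan every row.
exact = percentile(rows, 0.95)

# "Cheap" answer: scan roughly 5% of the rows and estimate from those.
sample = bernoulli_sample(rows, 5, rng)
estimate = percentile(sample, 0.95)

rel_err = abs(estimate - exact) / exact
print(f"rows scanned: {len(sample)} of {len(rows)}")
print(f"exact p95: {exact:.2f}, sampled p95: {estimate:.2f}, "
      f"relative error: {rel_err:.2%}")
```

The sampled run touches only a small fraction of the rows yet lands close to the full-scan answer, which is the property that makes a quick approximate query a reasonable first check before committing to a long or expensive one.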