Querying S3 data using Athena
Athena is a serverless service for querying data stored in S3. It is serverless because you do not provision or manage the servers that run the queries. Its main characteristics are as follows:
- Athena uses a schema to present query results over the data stored in S3. You define a schema that describes how the data should appear, and Athena reads the raw data from S3 and returns results according to that schema.
- The output can be used by other services for visualization, storage, or further analytics. The source data in S3 can be structured, semi-structured, or unstructured, in formats including XML, JSON, CSV/TSV, AVRO, Parquet, and ORC. CloudTrail logs, ELB logs, and VPC flow logs can also be stored in S3 and analyzed by Athena.
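- This follows the schema-on-read technique. Unlike traditional schema-on-write techniques, where data is transformed to fit a schema as it is loaded, here tables are defined in advance in a data catalog and the schema is projected onto the data when it is read. SQL-like queries...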
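The schema-on-read idea described in the list above can be illustrated without AWS at all. The following is a minimal sketch in plain Python, not Athena's actual implementation: the sample rows, the column names in `SCHEMA`, and the helper `read_with_schema` are all hypothetical, chosen only to show that the raw text stays untouched while the schema is applied as each row is read.

```python
import csv
import io

# Raw data as it might sit in an S3 object (hypothetical sample rows).
# It is stored as plain text and is never modified.
RAW_CSV = """2023-01-05,web-1,200,0.12
2023-01-05,web-2,500,1.30
2023-01-06,web-1,200,0.08
"""

# Hypothetical schema: column names plus the type to apply at read time.
SCHEMA = [("day", str), ("host", str), ("status", int), ("latency_s", float)]


def read_with_schema(raw_text, schema):
    """Project the schema onto each raw row as it is read."""
    for fields in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


# Rough equivalent of: SELECT host, latency_s FROM logs WHERE status = 200
rows = [
    (record["host"], record["latency_s"])
    for record in read_with_schema(RAW_CSV, SCHEMA)
    if record["status"] == 200
]
print(rows)
```

Because the schema lives apart from the data, the same raw text could be read again under a different schema (for example, treating every column as a string) without rewriting the stored object.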