In this section, we will review how to use Athena, a new service from Amazon that lets you run SQL queries on files stored in S3 without the need for any additional infrastructure.
For this demonstration, we will use all of the CollegeScorecard files from Data.gov, as discussed in the Building your first analysis under 60 seconds section of Chapter 1, A Quick Start to QuickSight.
Note
The dataset is available from the following public URL: https://catalog.data.gov/dataset/college-scorecard.
Here are the detailed steps to upload the files to an S3 bucket:
1. Download CollegeScorecard_Raw_Data.zip to your local system (laptop) and unzip the file.
2. To upload the files to AWS S3, log in to your AWS account and select S3 from the Services menu.
3. Select an existing S3 bucket or create a new one. In the following screenshot, I have selected the collegescorecard bucket that I created earlier.
4. Create a folder named CollegeRaw and then subfolders, one for each year 2010...
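If you prefer to script the upload rather than use the console, the folder layout described above can be sketched in Python. This is a minimal sketch, not the method used in this walkthrough: it assumes the boto3 library is installed and AWS credentials are configured, and it assumes each year's subfolder is named with the year itself. The helper names s3_key_for and upload_year_file are hypothetical; the bucket and folder names are the ones used in the steps above.

```python
from pathlib import Path

BUCKET = "collegescorecard"   # bucket selected in the steps above
ROOT_FOLDER = "CollegeRaw"    # top-level folder created in that bucket


def s3_key_for(year, local_path, root=ROOT_FOLDER):
    """Build the object key <root>/<year>/<file name> for one local file.

    Assumption: each per-year subfolder is named with the year itself.
    """
    return f"{root}/{year}/{Path(local_path).name}"


def upload_year_file(year, local_path, bucket=BUCKET):
    """Upload one local file into its per-year folder and return the key.

    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported here so s3_key_for works without boto3 installed

    key = s3_key_for(year, local_path)
    # upload_file(Filename, Bucket, Key) copies the local file to S3
    boto3.client("s3").upload_file(str(local_path), bucket, key)
    return key
```

For example, s3_key_for(2010, "data/example.csv") returns "CollegeRaw/2010/example.csv", where data/example.csv is a placeholder path. S3 has no real directories, so writing an object under that key is what makes the CollegeRaw and year folders appear in the console.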