Book Image

SQL for Data Analytics. - Third Edition

By : Jun Shan, Matt Goldwasser, Upom Malik
Book Image

SQL for Data Analytics. - Third Edition

By: Jun Shan, Matt Goldwasser, Upom Malik

Overview of this book

Every day, businesses operate around the clock, and a huge amount of data is generated at a rapid pace. This book helps you analyze this data and identify key patterns and behaviors that can help you and your business understand your customers at a deep, fundamental level. SQL for Data Analytics, Third Edition is a great way to get started with data analysis, showing how to effectively sort and process information from raw data, even without any prior experience. You will begin by learning how to form hypotheses and generate descriptive statistics that can provide key insights into your existing data. As you progress, you will learn how to write SQL queries to aggregate, calculate, and combine SQL data from sources outside of your current dataset. You will also discover how to work with advanced data types, like JSON. By exploring advanced techniques, such as geospatial analysis and text analysis, you will be able to understand your business at a deeper level. Finally, the book lets you in on the secret to getting information faster and more effectively by using advanced techniques like profiling and automation. By the end of this book, you will be proficient in the efficient application of SQL techniques in everyday business scenarios and looking at data with the critical eye of analytics professional.
Table of Contents (11 chapters)
9
9. Using SQL to Uncover the Truth: A Case Study

Loading the Sample Datasets – Windows

Most exercises in this book use a sample database, sqlda, which contains fabricated data for a fictional electric vehicle company called ZoomZoom. Set it up by performing the following steps:

  1. First, create a database titled sqlda. Open the command line and type or paste the following command. Then, press the return key to execute it:
    createdb -U postgres sqlda

You will be prompted to enter the password that you set for the postgres superuser during installation:

Figure 0.22: PostgreSQL shell password request

Figure 0.22: PostgreSQL shell password request

  1. To check whether the database has been successfully created, log in to the shell by typing or pasting the following command and pressing the return key:
    psql -U postgres

Enter your password when prompted. Press the return key to proceed.

  1. Type \l (a backslash and a lowercase L) and then press the return key to check if the database has been created. The sqlda database should appear along with a list of the default databases:
Figure 0.23: PostgreSQL list of databases

Figure 0.23: PostgreSQL list of databases

  1. Download the data.dump file from the Datasets folder in the GitHub repository of this book by clicking this link: http://packt.link/GuU31. Modify the highlighted path in the following command based on where the file is located on your system. Type or paste the command into the command line and press the return key to execute it:
    psql -U postgres -d sqlda -f C:\<path>\data.dump

    Note

    Alternatively, you can use the command line and navigate to the local folder where you have downloaded the file using the cd command. For example, if you have downloaded it to the Downloads folders of your computer, you can navigate to it using cd C:\Users\<your username>\Downloads. In this case, remove the highlighted path prefix in the step. The command should look like this: psql -U postgres -d sqlda -f data.dump.

You should get an output similar to the one that follows:

Figure 0.24: PostgreSQL database import

Figure 0.24: PostgreSQL database import

  1. Check whether the database has been loaded correctly. Log in to the PostgreSQL console by typing or pasting the following command. Press the return key to execute it:
    psql –U postgres

In the shell, type the following command to connect to the sqlda database:

\c sqlda

Then type \dt. This command should list all the tables in the database, as follows:

Figure 0.25: Validating that the database has been imported

Figure 0.25: Validating that the database has been imported

Note

You are importing the database using the postgres superuser for demonstration purposes only. It is advised in production environments to use a separate account.