Overview of this book

Understanding and finding patterns in data has become one of the most important ways to improve business decisions. If you know the basics of SQL, but don't know how to use it to gain the most effective business insights from data, this book is for you. SQL for Data Analytics helps you build the skills to move beyond basic SQL and instead learn to spot patterns and explain the logic hidden in data. You'll discover how to explore and understand data by identifying trends and unlocking deeper insights. You'll also gain experience working with different types of data in SQL, including time-series, geospatial, and text data. Finally, you'll learn how to increase your productivity with the help of profiling and automation. By the end of this book, you'll be able to use SQL in everyday business scenarios efficiently and look at data with the critical eye of an analytics professional. Please note: if you are having difficulty loading the sample datasets, there are new instructions uploaded to the GitHub repository. The link to the GitHub repository can be found in the book's preface.
Preface
1. Understanding and Describing Data
2. The Basics of SQL for Analytics
3. SQL for Data Preparation
4. Aggregate Functions for Data Analysis
5. Window Functions for Data Analysis
6. Importing and Exporting Data
7. Analytics Using Complex Data Types
8. Performant SQL
9. Using SQL to Uncover the Truth – a Case Study

4. Aggregate Functions for Data Analysis

Activity 6: Analyzing Sales Data Using Aggregate Functions

Solution

1. Open your favorite SQL client and connect to the `sqlda` database.
2. Calculate the number of unit sales the company has achieved by using the `COUNT` function:
```
SELECT COUNT(*)
FROM sales;
```

You should get 37,711 sales.
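
As a side note on how `COUNT` behaves, `COUNT(*)` counts every row, while `COUNT(column)` counts only the rows where that column is not `NULL`. The sketch below compares the two. It assumes that the `dealership_id` column (used later in this solution) can be `NULL` for sales that did not go through a dealership; that is an assumption about the data, not something stated above.

```
-- Sketch: compare row count with the count of non-NULL dealership_id values.
-- Assumption: dealership_id may be NULL for non-dealership sales.
SELECT
    COUNT(*)             AS total_sales,           -- every row in sales
    COUNT(dealership_id) AS sales_with_dealership  -- rows where dealership_id IS NOT NULL
FROM sales;
```

If the two numbers differ, the difference is the number of sales with no dealership recorded.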

3. Determine the total sales amount in dollars for each state; we can use the `SUM` aggregate function here:
```
SELECT c.state, SUM(s.sales_amount) AS total_sales_amount
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
GROUP BY 1
ORDER BY 1;
```

You will get the following output:

Figure 4.23: Total sales in dollars by US state
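
The query in this step sorts the result alphabetically by state. If the goal is to see which states bring in the most revenue, one possible variant (a sketch, not part of the original solution) sorts by the aggregated total instead and spells out the column names rather than using positional references:

```
-- Sketch: same aggregation as above, ranked by total sales instead of by state name.
SELECT c.state, SUM(s.sales_amount) AS total_sales_amount
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
GROUP BY c.state
ORDER BY total_sales_amount DESC;
```

Naming the columns in `GROUP BY` and `ORDER BY` keeps the query correct if the `SELECT` list is later reordered, at the cost of a little repetition.
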
4. Determine the top five dealerships by units sold, using the `GROUP BY` clause and setting `LIMIT` to `5`:
```
SELECT s.dealership_id, COUNT(*)
FROM sales s
WHERE s.channel = 'dealership'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
```

You should get the following output:

Figure 4.24: Top five dealerships by units sold
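
One caveat with `LIMIT` on a sorted count: if two dealerships tie on units sold at the cutoff, which one appears in the result is not guaranteed. A minimal sketch of one way to make the result deterministic is to add a secondary sort key; the alias `units_sold` is introduced here only for illustration.

```
-- Sketch: top five dealerships with a tie-breaker on dealership_id.
SELECT s.dealership_id, COUNT(*) AS units_sold
FROM sales s
WHERE s.channel = 'dealership'
GROUP BY s.dealership_id
ORDER BY units_sold DESC, s.dealership_id
LIMIT 5;
```
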
5. Calculate...