Book Image

SQL for Data Analytics - Third Edition

By : Jun Shan, Matt Goldwasser, Upom Malik, Benjamin Johnston
Book Image

SQL for Data Analytics - Third Edition

By: Jun Shan, Matt Goldwasser, Upom Malik, Benjamin Johnston

Overview of this book

Every day, businesses operate around the clock, and a huge amount of data is generated at a rapid pace. This book helps you analyze this data and identify key patterns and behaviors that can help you and your business understand your customers at a deep, fundamental level. SQL for Data Analytics, Third Edition is a great way to get started with data analysis, showing how to effectively sort and process information from raw data, even without any prior experience. You will begin by learning how to form hypotheses and generate descriptive statistics that can provide key insights into your existing data. As you progress, you will learn how to write SQL queries to aggregate, calculate, and combine SQL data from sources outside of your current dataset. You will also discover how to work with advanced data types, like JSON. By exploring advanced techniques, such as geospatial analysis and text analysis, you will be able to understand your business at a deeper level. Finally, the book lets you in on the secret to getting information faster and more effectively by using advanced techniques like profiling and automation. By the end of this book, you will be proficient in the efficient application of SQL techniques in everyday business scenarios and looking at data with the critical eye of analytics professional.
Table of Contents (11 chapters)
9
9. Using SQL to Uncover the Truth: A Case Study

Basic Data Types of SQL

Each column in a table has a data type. You will explore the major data types of PostgreSQL here. These types include:

  • Numeric
  • Character
  • Boolean
  • Datetime
  • Data structures (array and JSON)

    Note

    Although the ANSI SQL standard defines a list of data types, different RDBMSs may have their own interpretations and extensions. The data types discussed in this book are based on the PostgreSQL definition. If you use a different RDBMS, you may see some differences in implementation. Furthermore, all RDBMSs, including PostgreSQL, are actively evolving. They constantly add support for new data types, and slightly adjust data type implementations if necessary. So, it is always prudent to use the data type definitions in this book as general guidance and double-check your RDBMS for the exact data type definitions it has.

Numeric

Numeric data types represent numbers. The following figure provides an overview of some of the main types:

Figure 2.28: Major numeric data types

Figure 2.28: Major numeric data types

Character

Character data types store text information. The following figure summarizes character data types:

Figure 2.29: Major character data types

Figure 2.29: Major character data types

Under the hood, all character data types use the same underlying data structure in PostgreSQL (and in many other RDBMSs). The most common character data type is varchar(n).

Boolean

Booleans are a data type used to represent True or False. The following table summarizes values that are represented as Boolean when used in a query with a data column type of Boolean:

Figure 2.30: Accepted Boolean values

Figure 2.30: Accepted Boolean values

While all these values are accepted, the values of True and False are compliant with best practices. Booleans can also take on NULL values.

Datetime

The datetime data type is used to store time-based information, such as dates and times. The following are some examples of datetime data types:

Figure 2.31: Popular datetime data types

Figure 2.31: Popular datetime data types

You will explore this data type further in Chapter 7, Analytics Using Complex Data Types.