Data Exploration and Preparation with BigQuery

By: Mike Kahn

Overview of this book

Data professionals face many challenges, such as handling large volumes of data, dealing with data silos, and working without appropriate tools. Datasets often arrive in different conditions and formats, demanding considerable time from analysts, engineers, and scientists to process them and uncover insights. The complexity of the data life cycle often hinders teams and organizations from extracting the desired value from their data assets. Data Exploration and Preparation with BigQuery addresses these challenges. The book begins with the basics of BigQuery while covering the fundamentals of data exploration and preparation. It then demonstrates how to use BigQuery for these tasks and explores the big data tools at your disposal within the Google Cloud ecosystem. The book doesn’t merely offer theoretical insights; it’s a hands-on companion that walks you through structuring your tables for query efficiency and following data preparation best practices. You’ll also learn when to use Dataflow, BigQuery, and Dataprep for ETL and ELT workflows. The book guides you through various case studies that demonstrate how BigQuery can be used to solve real-world data problems. By the end of this book, you’ll have mastered the use of SQL to explore and prepare datasets in BigQuery, unlocking deeper insights from data.

Table of Contents (21 chapters)

Part 1: Introduction to BigQuery
Part 2: Data Exploration with BigQuery
Part 3: Data Preparation with BigQuery
Part 4: Hands-On and Conclusion

What this book covers

Chapter 1, Introducing BigQuery and Its Components, explains how BigQuery operates so that you can use it more effectively. We will take an “under the hood” look at the technologies that deliver BigQuery and outline the goals of data exploration and preparation.

Chapter 2, BigQuery Organization and Design, teaches how to build a secure and collaborative BigQuery environment. You will gain a strong understanding of the services that make up BigQuery beyond the SQL query. You will also understand design patterns for deploying BigQuery resources.

Chapter 3, Exploring Data in BigQuery, reviews the various ways to explore data in BigQuery and walks through the process and steps of data exploration. You will learn about the different methods to access data in BigQuery and best practices to get started.

Chapter 4, Loading and Transforming Data, explores the techniques and best practices for loading data into BigQuery, and reviews the tools and methodologies for transforming and processing data with BigQuery. This chapter includes a hands-on exercise on data loading and transformation in BigQuery.

Chapter 5, Querying BigQuery Data, familiarizes you with the structure of a query and gives you a strong foundation in crafting queries. More complex querying practices will be reviewed as well. This chapter will give you the skills to begin writing queries.

Chapter 6, Exploring Data with Notebooks, helps you understand the value of using notebooks for data exploration and the notebook options available in Google Cloud. This chapter includes a hands-on exercise on analyzing Google Trends data in Workbench.

Chapter 7, Further Exploring and Visualizing Data, helps you better understand data attributes, discover patterns, and communicate findings effectively. You will learn about common practices for exploring data and review techniques and tools to analyze and visualize your data. This chapter includes a hands-on exercise on creating visualizations with Looker Studio.

Chapter 8, An Overview of Data Preparation Tools, explores approaches and tools that can be used with BigQuery for data preparation tasks to improve data quality.

Chapter 9, Cleansing and Transforming Data, covers cleaning and transforming data in greater detail, with a focus on optimizing table data after loading and initial exploration. You will learn the skills needed to handle situations that you will encounter as you refine query results and improve reporting accuracy.

Chapter 10, Best Practices for Data Preparation, Optimization, and Cost Control, introduces the cost control and optimization features of BigQuery. You will learn how to use BigQuery in a cost-effective way.

Chapter 11, Hands-On Exercise – Analyzing Advertising Data, presents a use case including sales, marketing, and advertising data. Follow along with the exercise to learn how to analyze and prepare advertising data and utilize the steps as a repeatable process with your real data.

Chapter 12, Hands-On Exercise – Analyzing Transportation Data, presents a use case with vehicle data. Follow along with the exercise to learn how to analyze and prepare transportation data; the steps presented can be replicated with real data.

Chapter 13, Hands-On Exercise – Analyzing Customer Support Data, presents a use case with customer support data. Two different customer support data sources will be used, as well as BigQuery ML sentiment analysis, to better understand customer service data.

Chapter 14, Summary and Future Directions, recaps the key points discussed throughout the book. We will look into the future and learn about emerging trends and transformative directions that will shape the landscape of data exploration, preparation, and analytics with BigQuery.