Google Cloud Platform for Architects

By: Vitthal Srinivasan, Loonycorn, Judy Raj

Overview of this book

Using a public cloud platform was considered risky a decade ago, and unconventional even just a few years ago. Today, however, use of the public cloud is completely mainstream: the norm, rather than the exception. Several leading technology firms, including Google, have built sophisticated cloud platforms and are locked in fierce competition for market share. The main goal of this book is to enable you to get the best out of GCP, and to use it with confidence and competence. You will learn why cloud architectures take the forms that they do, which will help you become a skilled high-level cloud architect. You will also learn how individual cloud services are configured and used, so that you are never intimidated by having to build a solution yourself, and you will learn the right way, and the right situations, in which to use the important GCP services. By the end of this book, you will be able to make the most of Google Cloud Platform in your designs.

Underlying data representation of BigQuery

When we load data into BigQuery, each column of that data is stored separately. The values in each column are compressed, run-length encoded, and encrypted, and the corresponding data file is replicated. Each of these replicas is then stored in the underlying distributed filesystem, known as Colossus.
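The idea of storing each column separately and run-length encoding its values can be illustrated with a minimal Python sketch. This is a toy model only, not BigQuery's actual storage format: the function names (`to_columns`, `run_length_encode`, `run_length_decode`) and the sample rows are hypothetical, and compression, encryption, and replication are left out.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical sample data; every row is assumed to have the same keys.
rows = [
    {"country": "IN", "status": "ok", "latency_ms": 12},
    {"country": "IN", "status": "ok", "latency_ms": 15},
    {"country": "IN", "status": "error", "latency_ms": 90},
    {"country": "US", "status": "ok", "latency_ms": 11},
]

columns = to_columns(rows)
encoded = {name: run_length_encode(values) for name, values in columns.items()}

# A column with many consecutive repeats shrinks to a few runs.
print(encoded["country"])  # [('IN', 3), ('US', 1)]
```

Columns with long runs of repeated values, such as `country` here, encode compactly, whereas a column whose values all differ, such as `latency_ms`, gains nothing from run-length encoding alone.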

This peculiar representation (columnar, compressed, and replicated) explains a couple of features of BigQuery that might otherwise strike us as odd:

  • BigQuery does not support indices: This sets it apart from a traditional RDBMS, but it makes sense, given that each column's data is effectively stored separately anyway, in a representation not that different from many indices
  • Queries cost more for each column they pull in: This also makes sense if you consider that each additional column requires access to a different file in the underlying file...
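The link between column count and cost can be sketched with a simple model in which a query reads only the per-column files it references. This is an illustrative assumption rather than BigQuery's billing formula: the column names and byte sizes in `COLUMN_BYTES` and the helper `bytes_scanned` are hypothetical.

```python
# Hypothetical on-disk size of each column file, in bytes.
COLUMN_BYTES = {
    "user_id": 8_000_000,
    "country": 2_000_000,
    "payload": 500_000_000,
}


def bytes_scanned(selected_columns, column_bytes=COLUMN_BYTES):
    """Sum the sizes of the distinct column files a query would touch."""
    return sum(column_bytes[name] for name in set(selected_columns))


# Selecting two narrow columns touches only their two files.
narrow = bytes_scanned(["user_id", "country"])

# Selecting every column (the SELECT * case) touches all of them.
wide = bytes_scanned(list(COLUMN_BYTES))

print(narrow)  # 10000000
print(wide)    # 510000000
```

Under this model, naming only the columns a query needs is the most direct way to reduce the amount of data it reads.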