
Google Cloud Platform for Architects

By: Vitthal Srinivasan, Loonycorn, Judy Raj

Overview of this book

Using a public cloud platform was considered risky a decade ago, and unconventional even a few years ago. Today, however, use of the public cloud is completely mainstream: the norm rather than the exception. Several leading technology firms, including Google, have built sophisticated cloud platforms and are locked in fierce competition for market share. The main goal of this book is to enable you to get the best out of GCP, and to use it with confidence and competence. You will learn why cloud architectures take the forms that they do, which will help you become a skilled high-level cloud architect. You will also learn how individual cloud services are configured and used, so that you are never intimidated by having to build with them yourself. Finally, you will learn the right way to use the important GCP services, and the right situations in which to use them. By the end of this book, you will be able to get the most out of designing on Google Cloud Platform.

Understand the unified architecture for batch and stream

More and more big data applications rely on streaming data. There are many reasons for this, most notably the increasing need for real-time insights, where a system must output analytics on the fly as new data comes in. We will not spend much time on the difference between batch and streaming data, but intuitively, batch data is at rest in a database or a file, whereas streaming data is in motion, flowing continuously from a source to a sink.
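To make the distinction concrete, here is a minimal sketch in plain Python. It is not GCP or Apache Beam code; the names `batch_source`, `stream_source`, and `running_counts` are hypothetical and chosen only for illustration. The batch source is a list that already exists in full, the stream source is a generator that hands over one event at a time (kept finite here so that the example terminates), and the same processing logic is applied to both.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def batch_source() -> List[str]:
    """Bounded data at rest: a stand-in for a file or a database table."""
    return ["click", "view", "click"]


def stream_source() -> Iterator[str]:
    """Data in motion: events are handed over one at a time.

    A real stream would be unbounded; this one is finite so the
    example terminates.
    """
    for event in ["view", "click", "view", "view"]:
        yield event


def running_counts(events: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Emit an updated count per event type after every incoming event."""
    counts: Counter = Counter()
    for event in events:
        counts[event] += 1
        yield dict(counts)


# Batch: all the data is available, so only the final answer matters.
batch_result = list(running_counts(batch_source()))[-1]

# Stream: every intermediate result is an up-to-date answer in its own right.
stream_results = list(running_counts(stream_source()))

print(batch_result)
for snapshot in stream_results:
    print(snapshot)
```

The point of the sketch is that `running_counts` does not care whether its input is bounded or not. Only the way the results are consumed differs: one final answer for batch, a sequence of continuously updated answers for a stream.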

Google frequently refers to a specific architecture that combines batch and stream processing into a single pipeline, and it is worth understanding how it fits together:

In the GCP world, the most common batch data source is GCS (that is, buckets) and the reliable messaging layer is Pub/Sub. Pub/Sub almost always feeds into Dataflow, which is based on the Apache Beam...