Connecting the Data: Data Integration Techniques for Building an Operational Data Store (ODS)

By: Angelo Bobak

Overview of this book

When organizations change or enhance their internal structures, business data integration is a complex problem that they must resolve. This book describes the common hurdles you might face while working with data integration and shows you various ways to overcome these challenges. The book begins by explaining the foundational concepts of ODS. Once familiar with schema integration, you'll learn how to reverse engineer each data source for creating a set of data dictionary reports. These reports will provide you with the metadata necessary to apply the schema integration process. As you progress through the chapters, you will learn how to write scripts for populating the source databases and spreadsheets, as well as how to use reports to create Extract, Transform, and Load (ETL) specifications. By the end of the book, you will have the knowledge necessary to design and build a small ODS.
Table of Contents (17 chapters)

  • Section 1: Site Reliability Engineering – A Prescriptive Way to Implement DevOps
  • Section 2: Google Cloud Services to Implement DevOps via CI/CD
  • Appendix: Getting Ready for Professional Cloud DevOps Engineer Certification

Points to remember

The following are some important points to remember:

  • A node in a Kubernetes cluster is categorized as a master or worker node. The master node runs the control plane components.
  • The key components of the Kubernetes control plane are kube-apiserver, etcd, kube-scheduler, kube-controller-manager, and cloud-controller-manager.
  • It is recommended to run the control plane components on the same node and to avoid running any user-specific containers on that node.
  • A highly available cluster can have multiple control planes.
  • kube-apiserver handles any queries or changes to the cluster and can be horizontally scaled.
  • etcd is a distributed key-value store used by Kubernetes to store cluster configuration data.
  • kube-scheduler chooses a suitable node where an application can be deployed.
  • kube-controller-manager runs several controller functions to ensure that the current state of the cluster matches the desired state.
  • cloud-controller-manager includes...
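
The roles of kube-scheduler and kube-controller-manager described above can be illustrated with a toy sketch. This is not Kubernetes code: the `Node` class, the `pick_node` and `reconcile` functions, and the "most free CPU" scoring rule are all hypothetical simplifications chosen for illustration. The sketch only shows the general idea of filtering nodes and picking one, and of a single control-loop pass that moves the current state toward the desired state.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical worker node; free_cpu is an illustrative capacity in cores."""
    name: str
    free_cpu: float
    pods: list = field(default_factory=list)


def pick_node(nodes, cpu_request):
    """Toy stand-in for scheduling: keep nodes that fit the request,
    then choose the one with the most free CPU (an assumed scoring rule)."""
    feasible = [n for n in nodes if n.free_cpu >= cpu_request]
    if not feasible:
        return None  # nothing fits; the pod would stay unscheduled
    return max(feasible, key=lambda n: n.free_cpu)


def reconcile(nodes, app, desired_replicas, cpu_request):
    """Toy stand-in for one controller pass: compare the current replica
    count with the desired count and add or remove pods to close the gap."""
    running = [(n, p) for n in nodes for p in n.pods if p == app]

    # Scale up: place each missing replica on a suitable node.
    for _ in range(desired_replicas - len(running)):
        node = pick_node(nodes, cpu_request)
        if node is None:
            break  # no capacity left; a later pass could retry
        node.pods.append(app)
        node.free_cpu -= cpu_request

    # Scale down: remove any replicas beyond the desired count.
    for node, pod in running[desired_replicas:]:
        node.pods.remove(pod)
        node.free_cpu += cpu_request

    # Report the replica count after this pass.
    return sum(n.pods.count(app) for n in nodes)


nodes = [Node("worker-1", 2.0), Node("worker-2", 4.0)]
print(reconcile(nodes, "web", desired_replicas=3, cpu_request=1.0))
print(reconcile(nodes, "web", desired_replicas=1, cpu_request=1.0))
```

In a real cluster these responsibilities are split across separate components that communicate through kube-apiserver, and the desired state is persisted in etcd rather than passed as a function argument.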