DevOps for Databases

By: David Jambor

Overview of this book

In today's rapidly evolving world of DevOps, traditional silos are a thing of the past. Database administrators are no longer the only experts; site reliability engineers (SREs) and DevOps engineers are database experts as well. This blurring of the lines has led to increased responsibilities, making members of high-performing DevOps teams responsible for end-to-end ownership. This book helps you master DevOps for databases, making it a must-have resource for achieving success in the ever-changing world of DevOps. You’ll begin by exploring real-world examples of DevOps implementation and its significance in modern data-persistent technologies, before progressing into the various types of database technologies and recognizing their strengths, weaknesses, and commonalities. As you advance, the chapters will teach you about design, implementation, testing, and operations using practical examples, as well as common design patterns, combining them with tooling, technology, and strategies for different types of data-persistent technologies. You’ll also learn how to create complex end-to-end implementation, deployment, and cloud infrastructure strategies defined as code. By the end of this book, you’ll be equipped with the knowledge and tools to design, build, and operate complex systems efficiently.
Table of Contents (24 chapters)

Part 1: Database DevOps
Part 2: Persisting Data in the Cloud
Chapter 5: RDBMS with DevOps
Part 3: The Right Tool for the Job
Part 4: Build and Operate
Part 5: The Future of Data

Data management strategies

There are many strategies out there, and we will need to use most of them to meet and hopefully exceed our customers’ expectations. Reading this book, you will learn about some of the key data management strategies at length. For now, however, I would like to bring six of these techniques to your attention. We will take a much closer look at each of these in the upcoming chapters:

  • Bring your data closer: The closer the data is to users, the faster they can access it. Yes, it may sound obvious, but users can be anywhere in the world, and they might even be traveling while trying to access their data. For them, these details do not matter, but the expectation will remain the same.

    There are many different ways to keep data physically close. One of the most successful strategies is edge computing, a distributed computing paradigm that brings computation and data storage closer to the sources of data, which improves response times and saves bandwidth. Edge computing is an architecture and a topology rather than a specific technology; it is a location-sensitive form of distributed computing.

The other very obvious strategy is to use the closest data center available when working with a cloud provider. AWS, for example, spanned 96 Availability Zones across 30 geographic Regions around the world as of 2022, while Google Cloud offered a similar 106 zones across 35 regions as of 2023.

Leveraging the nearest physical location can greatly decrease your latency and therefore improve your customer experience.
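As a sketch of the "closest data center" idea, the following Python snippet probes connection latency and picks the lowest-latency region from a set of measurements. The hostnames, region names, and numbers here are illustrative placeholders, not live cloud endpoints:

```python
import socket
import time

def probe_latency_ms(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Measure TCP connect time to a host in milliseconds (inf on failure)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000
    except OSError:
        return float("inf")

def nearest_region(latencies_ms: dict) -> str:
    """Return the region with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)

# Illustrative measurements as seen from a client in Europe:
measured = {"eu-west-1": 18.5, "us-east-1": 92.0, "ap-southeast-1": 240.3}
print(nearest_region(measured))  # eu-west-1
```

In practice, you would run the probe against each candidate region's endpoint and route the user (or place their data) accordingly.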

  • Reduce the length of your data journey: Again, this one is fairly obvious. Avoid unnecessary steps so that the journey between the end user and their data is as short as possible. Usually, the shortest journey is also the fastest (it is not quite that simple, but it works as a rule of thumb). Every extra action performed to retrieve the required information consumes additional computational power, which adds complexity and, most of the time, increases both latency and cost.
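One common way to shorten the journey is to serve repeated reads locally instead of going back to the backend for every request. A minimal read-through cache sketch, where the fetch function is a stand-in for a real database call:

```python
class ReadThroughCache:
    """Serve repeated reads locally instead of re-fetching from the backend."""

    def __init__(self, fetch):
        self._fetch = fetch          # function that performs the expensive lookup
        self._store = {}             # local copies of previously fetched values
        self.backend_calls = 0       # how many times we had to go to the backend

    def get(self, key):
        if key not in self._store:   # miss: one extra hop to the backend
            self.backend_calls += 1
            self._store[key] = self._fetch(key)
        return self._store[key]      # hit: the shortest possible journey

cache = ReadThroughCache(fetch=lambda key: f"value-for-{key}")
cache.get("user:42")        # first read travels to the backend
cache.get("user:42")        # second read is served locally
print(cache.backend_calls)  # 1
```

Every cache hit removes a round trip from the data journey, which is exactly the latency, cost, and complexity saving described above.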
  • Choose the right database solutions: There are many database solutions out there, which you can categorize by type, such as relational or non-relational (NoSQL), by distribution, such as centralized or distributed, and so on. Each category has a high number of subcategories, and each can offer a unique set of solutions for your particular use case. Finding the right tool for the job is genuinely hard, especially since requirements keep changing. We will dive deeper into each type of system and its pros and cons later in this book.
  • Apply clever analytics: Analytical systems, if applied correctly, can be a real game changer in terms of optimization, speed, and security. Analytics tools are there to help develop insights and understand trends and can be the basis of many business and operational decisions. Analytical services are well placed to provide the best performance and cost for each analytics job. They also automate many of the manual and time-consuming tasks involved in running analytics, all with high performance, so that customers can quickly gain insights.
  • Leverage machine learning (ML) and artificial intelligence (AI) to try to predict the future: ML and AI are critical to a modern data strategy, helping businesses and customers predict what will happen and build intelligence into their systems and applications. With the right security and governance controls combined with AI and ML capabilities, you can automate decisions about where data is physically located, who has access to it, and what can be done with it at every step of the data journey. This enables you to maintain the highest standards and the best performance when it comes to data management.
  • Scale on demand: The aforementioned strategies are underpinned by the way you choose to operate your systems. This is where DevOps (and SRE) plays a crucial part and can be the deciding factor between success and failure. All major cloud providers offer hundreds of platform choices for virtually every workload (AWS offered 475 instance types at the end of 2022). Most major businesses have a very “curvy” utilization trend, which is why they find the on-demand pricing of the cloud very attractive from a financial point of view.

You should only pay for resources when you need them and pay nothing when you don’t. This is one of the big benefits of using cloud services. However, this model only works in practice when the correct design and operational practices, the right automation, and compatible tooling are in place.
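On-demand scaling is typically driven by a proportional rule: adjust the replica count by how far current utilization is from a target. A minimal Python sketch of such a rule (the target and bounds are illustrative; this mirrors the style of formula used by autoscalers such as the Kubernetes Horizontal Pod Autoscaler):

```python
import math

def desired_replicas(current: int, current_util_pct: float, target_util_pct: float,
                     min_r: int = 1, max_r: int = 64) -> int:
    """Proportional scaling rule: grow or shrink by utilization vs. target."""
    desired = math.ceil(current * current_util_pct / target_util_pct)
    return max(min_r, min(max_r, desired))   # clamp to configured bounds

# 4 replicas running at 90% CPU against a 60% target -> scale out to 6:
print(desired_replicas(4, 90, 60))   # 6
# a sudden 12x load (720% of one replica's capacity) from a single replica:
print(desired_replicas(1, 720, 60))  # 12
```

The clamp matters: without an upper bound, a traffic spike could scale you (and your bill) without limit, and without a lower bound, you could scale to zero capacity.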

A real-life example

A leading telecommunications company was set to unveil their most anticipated device of the year at precisely 2 P.M., a detail well publicized to all customers. As noon approached, their online store saw typical levels of traffic. By 1 P.M., it was slightly above average. However, a surge of customers flooded the site just 10 minutes before the launch, aiming to be among the first to secure the new phone. By the time the clock struck 2 P.M., the website had shattered previous records for unique visitors. In the 20 minutes from 1:50 P.M. to 2:10 P.M., the visitor count skyrocketed, increasing twelvefold.

This influx triggered an automated scaling event that expanded the company’s infrastructure from its baseline (designated as 1x) to an unprecedented 32x. Remarkably, this massive scaling was needed only for the initial half-hour. After that, it scaled down to 12x by 2:30 P.M., further reduced to 4x by 3 P.M., and returned to its baseline of 1x by 10 P.M.

This seamless adaptability was made possible through a strategic blend of declarative orchestration frameworks, infrastructure as code (IaC) methodologies, and fully automated CI/CD pipelines. To summarize, the challenge is big: to operate reliably yet cost-effectively, with consistent speed and security, while automatically scaling services up and down on demand, without human interaction, in a matter of minutes, you need a set of best practices for designing, building, testing, and operating these systems. This sounds like DevOps.