Book Image

Accelerate DevOps with GitHub

By : Michael Kaufmann
Book Image

Accelerate DevOps with GitHub

By: Michael Kaufmann

Overview of this book

This practical guide to DevOps uses GitHub as the DevOps platform and shows how you can leverage the power of GitHub for collaboration, lean management, and secure and fast software delivery. The chapters provide simple solutions to common problems, thereby helping teams that are already on their DevOps journey to further advance into DevOps and speed up their software delivery performance. From finding the right metrics to measure your success to learning from other teams’ success stories without merely copying what they’ve done, this book has it all in one place. As you advance, you’ll find out how you can leverage the power of GitHub to accelerate your value delivery – by making work visible with GitHub Projects, measuring the right metrics with GitHub Insights, using solid and proven engineering practices with GitHub Actions and Advanced Security, and moving to event-based and loosely coupled software architecture. By the end of this GitHub book, you'll have understood what factors influence software delivery performance and how you can measure your capabilities, thus realizing where you stand in your journey and how you can move forward.
Table of Contents (31 chapters)
1
Part 1: Lean Management and Collaboration
7
Part 2: Engineering DevOps Practices
14
Part 3: Release with Confidence
19
Part 4: Software Architecture
22
Part 5: Lean Product Management
25
Part 6: GitHub for your Enterprise

Measuring metrics that matter

"The key to successful change is measuring and understanding the right things with a focus on capabilities."

– Forsgren. N., Humble, J. & Kim, G. (2018) p. 38

To measure where you are on your transformation journey, it's best to focus on the four metrics that are used in DORA—two for performance and two for stability, as follows:

  • Delivery performance metrics:
    • Delivery lead time
    • Deployment frequency
  • Stability metrics:
    • Mean time to restore
    • Change fail rate

Delivery lead time

The delivery lead time (DLT) is the time from when your engineers start working on a feature until the feature is available to the end users. You could say from code commit to production—but you normally start the clock when the team starts to work on a requirement and changes the state of it to doing or something similar.

It is not easy to get this metric automated from the system. I will show you in Chapter 7, Running Your Workflows, how you can use GitHub Actions and Projects together to automate the metric. If you don't get the metric out of the system, you can set up a survey with the following options:

  • Less than 1 hour
  • Less than 1 day
  • Less than 1 week
  • Less than 1 month
  • Less than 6 months
  • More than 6 months

Depending on where you are on the scale, you conduct the survey more or less often. Of course, system-generated values would be preferable, but if you are on the upper steps of that scale (months), it doesn't matter. It gets more interesting if you measure hours or days.

Why not lead time?

From a Lean management perspective, the LT would be the better metric: how long does a learning from customer feedback flow through the entire system? But requirements in software engineering are difficult. Normally, a lot of steps are involved before the actual engineering work begins. The outcome could vary a lot and the metric is hard to guess if you must rely on survey data. Some requirements could stay for months in the queue—some, only a few hours. From an engineering perspective, it's much better to focus on DLT. You will learn more about LT in Chapter 18, Lean Product Development and Lean Startup.

Deployment frequency

The deployment frequency focuses on speed. How long does it take to deliver your changes? A metric that focuses more on throughput is the DF. How often do you deploy your changes to production? The DF indicates your batch size. In Lean manufacturing, it is desirable to reduce the batch size. A higher DF would indicate a smaller batch size.

At first glance, it looks easy to measure DF in your system. But at a closer look, how many of your deployments really make it to production? In Chapter 7, Running Your Workflows, I will explain how you can capture the metric using GitHub Actions.

If you can't measure the metric yet, you can also use a survey. Use the following options:

  • On-demand (multiple times per day)
  • Between once per hour and once per day
  • Between once per day and once per week
  • Between once per week and once per month
  • Between once per month and once every 6 months
  • Less than every 6 months

Mean time to restore

A good measure for stability is the mean time to restore (MTTR). This measures how long it takes to restore your product or service if you have an outage. If you measure your uptime, it is basically the time span in which your service is not available. To measure your uptime, you can use a smoke test—for example, in Application Insights (see https://docs.microsoft.com/en-us/azure/azure-monitor/app/monitor-web-app-availability). If your application is installed on client machines and not accessible, it's more complicated. Often, you can fall back on the time for a specific ticket type in your helpdesk system.

If you can't measure it at all, you can still fall back to a survey with the following options:

  • Less than 1 hour
  • Less than 1 day
  • Less than 1 week
  • Less than 1 month
  • Less than 6 months
  • More than 6 months

But this should only be the last resort. The MTTR should be a metric you should easily get out of your systems.

Change fail rate

As with DLT for performance, MTTR is the metric for time when it comes to stability. The pendant of DF that focuses on throughput is the change fail rate (CFR). For the question How many of your deployments cause a failure in production?, the CFR is specified as a percentage. To decide which of your deployments count toward this metric, you should use the same definition as for the DF.

The Four Keys dashboard

These four metrics based upon the DORA research are a great way to measure where you are on your DevOps journey. They are a good starting point to change your conversations with management. Put them on a dashboard and be proud of them. And don't worry if you're not yet an elite performer—the important thing is to be on the journey and to improve continuously.

It's very simple to start with survey-based values. But if you want to use automatically generated system data you can use the Four Keys Project to display the data in a nice dashboard, (see Figure 1.4).

Figure 1.4 – The Four Keys dashboard

Figure 1.4 – The Four Keys dashboard

The project is open source and based upon Google Cloud (see https://github.com/GoogleCloudPlatform/fourkeys), but it depends on webhooks to get the data from your tools. You will learn in Chapter 7, Running Your Workflows, how to use webhooks to send your data to the dashboard.

What you shouldn't do

It is important that these metrics are not used to compare teams with each other. You can aggregate them to get an organizational overview, but don't compare individual teams! Every team has different circumstances. It's only important that the metrics evolve in the right direction.

Also, the metrics should not become the goal. It is not desirable to just get better metrics. The focus should always be on the capabilities that lead to these metrics and that we discuss in this book. Focus on these capabilities and the metrics will follow.