Optimizing Microsoft Azure Workloads
Cost optimization, operational excellence, performance efficiency, reliability, and security are the five pillars of the WAF. The elements of the WAF are distinct from the pillars. If we place the WAF in the center, it is surrounded by six supporting elements. These elements support the pillars with the principles and datasets required for the assessment.
As you know, the WAF is a set of best practices developed by Microsoft; these best practices are further categorized into five interconnected pillars. Now, the question is: Where exactly do these best practices come from? In other words, the practices must be developed before we can categorize them into different pillars. This is where the elements come into the picture. The elements act as the supports on which the pillars stand.
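As per Microsoft’s documentation, the supporting elements for the WAF are the following: the Azure Well-Architected Review; Azure Advisor; documentation; partners, support, and service offers; reference architectures; and design principles.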
Now, we will see the explanation of each of these elements. Let’s start with the Azure Well-Architected Review.
Assessment of the workload is required for the creation of the remediation plan; the assessment is inevitable. In the Well-Architected Review, there will be a set of questions prepared by Microsoft to understand the processes and practices in your environment. There will be a separate questionnaire for each pillar of the WAF. For example, the questionnaire for cost optimization will contain questions related to Azure Reserved Instances, tagging, Azure Hybrid Benefit, and so on. Meanwhile, the operational excellence questionnaire will have questions related to DevOps practices and approaches. There will be different possible answers to these questions, varying from recommended methods to non-recommended methods. Customers can answer based on their environment, and the system will generate a plan with recommendations that can be implemented to make their environment aligned with the WAF.
The review can be taken by anyone from the Microsoft Assessments portal (https://docs.microsoft.com/en-us/assessments/?mode=home). In the portal, you must select Azure Well-Architected Review, as shown in the following screenshot:
Figure 1.2 – Accessing Microsoft Assessments
Once you select Azure Well-Architected Review, you will be presented with a popup asking whether you want to create a new assessment or create a milestone. To start from scratch, choose New Assessment; to record progress on an existing assessment, choose Create a milestone. We will not conduct an assessment at this point; each pillar of the WAF has its own dedicated chapter, and we will perform the assessment there.
With that, we will move on to the next element of the framework, which is Azure Advisor.
If you have worked on Microsoft Azure, you will know that Azure Advisor is the personalized cloud consultant developed by Microsoft for you. Azure Advisor can generate recommendations for you, and you can leverage this tool to improve the quality of workloads. Looking at Figure 1.3, we can see that the recommendations are categorized into different groups, and the group names are the same as the pillars of the WAF:
Figure 1.3 – Azure Advisor
With the help of Azure Advisor, you can do the following:
Advisor has a score based on the number of actionable recommendations; this score is called Advisor Score. If the score is lower than 100%, that means there are recommendations, and we need to remediate them to improve the score. As you can see in Figure 1.3, the Advisor Score total for the environment is 81%, and the Score by category values are on the right side.
The good thing about Azure Advisor is that recommendations will be generated as soon as you start using the subscription. You don’t have to deploy any agents, make any additional configurations, or pay to use the Advisor service. The recommendations are generated with the help of machine learning (ML) algorithms based on usage, and they will also be refreshed periodically. Advisor can be accessed from the Azure portal, and it has a rich REST API if you prefer to retrieve the recommendations programmatically and build your own dashboard.
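As a rough illustration of retrieving recommendations programmatically, the following sketch builds the Azure Resource Manager URL for listing Advisor recommendations and groups an already-downloaded response by category. The `api-version` value, the helper names, and the sample payload are assumptions made for illustration; no network call is made and authentication is left out, so check the Advisor REST reference before relying on any of these details.

```python
from collections import defaultdict
from urllib.parse import urlencode

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2020-01-01"  # assumption: verify against the Advisor REST reference


def recommendations_url(subscription_id: str) -> str:
    """Build the ARM URL that lists Advisor recommendations for a subscription."""
    path = (
        f"/subscriptions/{subscription_id}"
        "/providers/Microsoft.Advisor/recommendations"
    )
    return f"{ARM_ENDPOINT}{path}?{urlencode({'api-version': API_VERSION})}"


def group_by_category(payload: dict) -> dict:
    """Group recommendation problem statements by their category."""
    grouped = defaultdict(list)
    for item in payload.get("value", []):
        props = item.get("properties", {})
        problem = props.get("shortDescription", {}).get("problem", "(no description)")
        grouped[props.get("category", "Unknown")].append(problem)
    return dict(grouped)


# Hypothetical, hand-written payload shaped like a list response.
sample_payload = {
    "value": [
        {"properties": {"category": "Cost",
                        "shortDescription": {"problem": "Right-size underutilized VMs"}}},
        {"properties": {"category": "Security",
                        "shortDescription": {"problem": "Enable MFA for owner accounts"}}},
        {"properties": {"category": "Cost",
                        "shortDescription": {"problem": "Buy reserved instances"}}},
    ]
}

grouped_sample = group_by_category(sample_payload)
for category, problems in grouped_sample.items():
    print(category, len(problems))
```

A dashboard built on this idea would fetch the URL with a bearer token and feed the JSON body to `group_by_category`; the grouping step is kept separate so that it can be exercised without credentials.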
In the coming chapters, we will be relying a lot on Azure Advisor for collecting recommendations for each of the pillars.
Now that we have covered the second element of the WAF, let’s move on to the next one.
Microsoft’s documentation has done an excellent job of helping people who are new to Azure. All documentation related to the WAF is available at https://docs.microsoft.com/en-us/azure/architecture/framework/. As a matter of fact, this book is a demystified version of this documentation with additional examples and real-world scenarios.
As with all documentation, the WAF documentation is lengthy and refined, but for a beginner, the amount of information in the documentation can be overwhelming. This book distills the key insights and essentials from the documentation, providing you with everything you need to get started. The following screenshot shows the documentation for the framework:
Figure 1.4 – WAF documentation
As you can see in the preceding screenshot, the contents are organized according to the pillars, and finally, the documentation is concluded with steps to implement the recommendations. You could call this the Holy Bible of WAF. Everything related to the WAF is found in this documentation and we would strongly recommend bookmarking the link to stay updated.
All documentation for Azure is available at https://docs.microsoft.com/en-us/azure/?product=popular. The documentation covers how to get started, the CAF, and the WAF, and includes learning modules and product manuals for every Azure service. Apart from the documentation, this site offers sample code, tutorials, and more. Whichever language you write your code in, Azure documentation provides SDK guides for Python, .NET, JavaScript, Java, and Go. On top of that, documentation is also available for command-line tools such as PowerShell and the Azure CLI, and for infrastructure as code (IaC) solutions such as Bicep, ARM templates, and Terraform.
Deploying complex solutions by adhering to the best practices can be challenging for new customers. This is where we can rely on Microsoft partners. The Microsoft Partner Network (MPN) is massive, and you can leverage Azure partners for technical assistance and support to empower your organization. You can find Azure partners and Azure Expert Managed Service Providers (MSPs) at https://azure.microsoft.com/en-us/partners/. MSPs can aid with automation, cloud operations, and service optimization. You can also seek assistance for migration, deployment, and consultation. Based on the service you are working with and the region you belong to, you can find a partner with the required skills closer to you.
Once the partner deploys the solution, there will be break-fix issues that you need assistance with. Microsoft Support can help you with any break-fix scenarios. For example, if one of your VMs is unavailable or a storage account is inaccessible, you can open a support request. Billing and subscription support is free of cost and does not require you to purchase any support plans. However, for technical assistance, you need to purchase a support plan. A quick comparison of these plans is shown in the following table:
| | Basic | Developer | Standard | ProDirect |
|---|---|---|---|---|
| Price | Free | $29/month | $100/month | $1,000/month |
| Scope | All Azure customers | Trial and non-production environments | Production workloads | Mission-critical workloads |
| Billing support | Yes | Yes | Yes | Yes |
| Number of support requests | Unlimited | Unlimited | Unlimited | Unlimited |
| Technical support | No | Yes | Yes | Yes |
| 24/7 support | N/A | During business hours via email only | Yes (email/phone) | Yes (email/phone) |
Table 1.1 – Comparison of Azure support plans
A full comparison is available at https://azure.microsoft.com/en-us/support/plans/. Customers on the Developer plan can only open Severity C cases with Microsoft Support; the Basic plan does not include technical support at all. In order to open Severity B or Severity A cases, you must have a Standard or ProDirect plan. Severity C has an SLA of 8 business hours and is recommended for issues with minimal business impact, while Severity B is for moderate impact with an SLA of 4 hours. Severity A has an SLA of 1 hour and is reserved for critical business impact issues where production is down. A ProDirect plan offers extra perks to customers, such as training, a dedicated ProDirect manager, and operations support. The ProDirect plan also has a Support API that customers can use to create support cases programmatically. For example, if a VM is down, by combining the power of Azure alerts and action groups, we can make a call to the Support API to create a request automatically.
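The severity rules can be captured in a small lookup, which is handy when an alerting pipeline has to decide which severity it is allowed to request. This is only a sketch: the response targets are the ones quoted in this section, the set of plans with technical support is read from the "Technical support" row of Table 1.1, and the names `ResponseTarget` and `can_open_case` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseTarget:
    hours: int
    business_hours_only: bool


# Initial-response targets as quoted in this section.
RESPONSE_TARGETS = {
    "A": ResponseTarget(hours=1, business_hours_only=False),  # critical impact
    "B": ResponseTarget(hours=4, business_hours_only=False),  # moderate impact
    "C": ResponseTarget(hours=8, business_hours_only=True),   # minimal impact
}

# Plans with technical support, per the "Technical support" row of Table 1.1.
TECHNICAL_SUPPORT_PLANS = {"Developer", "Standard", "ProDirect"}
# Plans that may open Severity A or B cases, per the text.
HIGH_SEVERITY_PLANS = {"Standard", "ProDirect"}


def can_open_case(plan: str, severity: str) -> bool:
    """Return True if the given plan may open a technical case at this severity."""
    if severity not in RESPONSE_TARGETS:
        raise ValueError(f"unknown severity: {severity!r}")
    if severity in ("A", "B"):
        return plan in HIGH_SEVERITY_PLANS
    return plan in TECHNICAL_SUPPORT_PLANS


for plan in ("Basic", "Developer", "Standard", "ProDirect"):
    allowed = [s for s in ("A", "B", "C") if can_open_case(plan, s)]
    print(plan, allowed)
```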
In addition to these plans, there is a Unified/Premier contract that is above the ProDirect plan and is ideal for customers who want to cover Azure, Microsoft 365, and Dynamics 365. Microsoft support is available in English, Spanish, French, German, Italian, Portuguese, traditional Chinese, Korean, and Japanese to support global customers. Keep in mind that the plans cannot be transferred from one customer to another. Based on your requirement, you can purchase a plan and you will be charged every month.
Service offers deal with the different subscription types available to customers. There are different types of Azure subscriptions with different billing models. A complete list of available offers is published at https://azure.microsoft.com/en-in/support/legal/offer-details/. When it comes to organizations, the most common options are Enterprise Agreement (EA), Cloud Solution Provider (CSP), and Pay-As-You-Go; these are commercial subscriptions. Organizations deploy their workloads in these subscriptions, and they are charged based on consumption. How they get charged depends solely on the offer type. For example, EA customers make an upfront payment and draw down that credit as they use Azure; any charges above the credit limit are invoiced as an overage. Both Pay-As-You-Go and CSP customers receive monthly invoices. In CSP, the invoice is generated by the partner; in Pay-As-You-Go, the invoice comes directly from Microsoft.
There are other types of subscriptions used for development, testing, and learning purposes, such as Visual Studio subscriptions, Azure Pass, Azure for Students, the Free Trial, and so on. However, these are credit-based subscriptions, and they are not backed by SLAs. Hence, they cannot be used for hosting production workloads.
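The EA billing model boils down to simple arithmetic, sketched here with made-up figures. The function name and the amounts are hypothetical, and real EA invoicing involves details (billing periods, currency, taxes) that this ignores.

```python
def settle_ea_period(prepaid_credit: int, consumption: int) -> tuple:
    """Return (credit_used, credit_remaining, overage) for one billing period.

    Consumption is paid from the upfront credit first; anything above the
    credit is invoiced separately as an overage.
    """
    if prepaid_credit < 0 or consumption < 0:
        raise ValueError("amounts must be non-negative")
    credit_used = min(prepaid_credit, consumption)
    credit_remaining = prepaid_credit - credit_used
    overage = consumption - credit_used
    return credit_used, credit_remaining, overage


# Hypothetical figures: 100,000 paid upfront.
under_budget = settle_ea_period(100_000, 80_000)   # consumption below the credit
over_budget = settle_ea_period(100_000, 112_500)   # consumption above the credit
print(under_budget)
print(over_budget)
```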
The next element we are going to cover is reference architecture.
If you know coding, you might have come across a scenario where you are not able to resolve a code error and you find the solution from Stack Overflow or some other forum. Reference architecture serves the same purpose, whereby Microsoft provides guidance on how the architecture should be implemented. With the help of reference architecture, we can design scalable, secure, reliable, and optimized applications by taking a defined methodology.
Reference architecture is part of the application architecture fundamentals. The application architecture fundamentals comprise a series of steps where we will decide on the architecture style, technology, architecture, and—finally—alignment with the WAF. This will be used for developing the architecture, design, and implementation. The following diagram shows the series of steps:
Figure 1.5 – Application architecture fundamentals
In the preceding diagram, you can see that the first choice is the architectural style, and this is the most fundamental thing we must decide on. For example, we could take a three-tier application approach or go for microservices architecture.
Once that’s decided, then the next decision is about the services involved. Let’s say your application is a three-tier application and has a web frontend. This frontend can be deployed in Azure Virtual Machines, Azure App Service, Azure Container Instances, or even Azure Kubernetes Service (AKS). Similarly, for the data store, we can decide whether we need to go for a relational or non-relational database. Based on your requirements, you can select from a variety of database services offered by Microsoft Azure. Likewise, we can also choose the service that will host the mid-tier.
After selecting the technology, we need to choose the application architecture. This is the stage at which we decide what the architecture is going to look like, building on the style and services selected in the preceding stages. Microsoft has several design principles and reference architectures that can be leveraged in this stage. We will cover the design principles in the next section.
The reference architectures can be accessed from https://docs.microsoft.com/en-us/azure/architecture/browse/?filter=reference-architecture, and this is a good starting point for the architecture of your solution. You might not get an exact match for your requirements; nevertheless, you can tweak these architectures as required. Since these architectures are developed by Microsoft with the WAF pillars in mind, you can deploy with confidence that these solutions are scalable, secure, and reliable. The following screenshot shows the portal for viewing reference architectures:
Figure 1.6 – Browsing reference architectures
The portal offers filtering on the type of product and categories. From hundreds of reference diagrams, you can filter and find the one that matches your requirements. For example, a simple search for 3d video rendering returns two reference architectures, as shown in the following screenshot:
Figure 1.7 – Filtering reference architectures
Clicking on the reference architecture takes you to a complete explanation of the architecture components, data flow, potential use cases, considerations, and best practices aligned with the WAF. The best part is you will have the Deploy to Azure button, which lets you directly deploy the solution to Azure. The advantage is the architecture is already aligned with the WAF and you don’t have to spend time assessing the solution again.
With that, let’s move on to the last element of the WAF—design principles.
In Figure 1.5, we saw that reference diagrams and design principles are part of the third stage of application architecture fundamentals. In the previous section, we saw how we can use the reference architecture, and now we will see how to leverage the design principles. There are 11 design principles you should incorporate into your design discussions. Let’s understand each of the design principles.
As with on-premises, failures can happen in the cloud as well. We need to acknowledge this fact; the cloud is not a silver bullet for all the issues that you faced on-premises but does offer massive advantages compared to on-premises infrastructure. The bottom line is failures can happen, hardware can fail, and network outages can happen. While designing our mission-critical workloads, we need to anticipate this failure and design for healing. We can take a three-branched approach to tackle the failure:
The way you want to respond to failures will depend entirely on your services and their availability requirements. For example, say you have a database and would like to fail over to a secondary region when the primary region fails. Setting up replication will sync your data to a secondary region, and the database can fail over whenever the primary region is unable to serve the application. Keep in mind that replicating data to another region can be more expensive than running a database in a single region.
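The failover idea can be sketched in a few lines of application-side code. This is a minimal illustration under stated assumptions, not how any particular Azure database service implements failover: the region names, the `fake_query` stand-in, and the choice of `ConnectionError` as the failure signal are all hypothetical, and in practice replication and failover are usually configured on the service itself.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class AllRegionsFailed(Exception):
    """Raised when no region could serve the request."""


def query_with_failover(
    regions: Sequence[str],
    run_query: Callable[[str], List[dict]],
) -> Tuple[str, List[dict]]:
    """Try each region in priority order and return the first successful result."""
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return region, run_query(region)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next region.
            errors[region] = exc
    raise AllRegionsFailed(f"no region available: {sorted(errors)}")


def fake_query(region: str) -> List[dict]:
    """Stand-in for a real database call; simulates a primary-region outage."""
    if region == "primary-region":
        raise ConnectionError("primary region is unreachable")
    return [{"order_id": 1, "served_by": region}]


served_by, rows = query_with_failover(["primary-region", "secondary-region"], fake_query)
print(served_by, rows)
```

Only connection-level failures trigger the fallback here; errors such as a malformed query would fail identically in every region, so retrying them elsewhere would only add latency.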
Regional outages are generally uncommon, but while designing for healing, you should also consider this scenario. Your focus should be on handling hardware failures, network outages, and so on because they are very common and can affect the uptime of your application. There are recommendations provided by Microsoft on how to design for healing—these are called design patterns. The recommended patterns are presented here:
As mentioned at the beginning of this chapter, design patterns are not within the scope of this book. Again, thanks to Microsoft, all patterns are listed at https://docs.microsoft.com/en-us/azure/architecture/patterns/. Let’s move on to the next design principle.
SPOFs in architecture can be eliminated by having redundancy. Earlier, we discussed RAID storage in the Reliability subsection of the What are the pillars of the WAF? section, where multiple disks are used to improve data redundancy. Azure has different redundancy options based on the service that you are using. Here are some of the recommendations:
With that, let’s learn about the next design principle—minimize coordination.
This principle applies to Storage, SQL Database, and Cosmos DB where we diminish the coordination between application services to accomplish scalability. The key concepts of this design principle are mostly aligned with some data concepts that are not in the scope of this book. The following are recommendations provided by Microsoft for this design principle:
An in-depth explanation of these recommendations is available at https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/minimize-coordination.
In on-premises, one of the main issues is the capacity constraint. Traditional data centers had capacity issues, and when it comes to the cloud, the advantage is that it offers elastic scaling. In simpler terms, we can provision workloads as required without the need to pre-provision or buy capacity. Talking of scaling, we have two types of scaling, as follows:
Now that we know the types of scaling, as the name suggests, we need to design for scaling out so that the instances are automatically increased based on the demand. The following recommendations are provided for this design principle:
Now that you are familiar with the scale-out design, let’s shift the focus to the next item on the list.
In Azure, we have limits for each resource. Some of the limits are hard limits, while others are soft limits. If the limit is a soft limit, we can reach out to Microsoft Support and increase the limit as required. When it comes to scaling, there is also a limit imposed by Microsoft for every resource. If your system is growing tremendously, you will eventually reach the upper limit of the resource. These limits include the number of compute cores, database size, storage throughput, query throughput, network throughput, and so on. In order to efficiently overcome the limits, we need to use partitioning. Earlier, we discussed how we can use data partitioning to improve the scalability and availability of data. Similarly, we can use partitioning to work around resource limits.
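One common way to spread load across several resources, so that no single one hits its limit, is to route each item by a hash of its key. The sketch below assumes a fixed number of partitions and uses hypothetical partition names; it is an illustration of the idea rather than a recommendation from Microsoft's guidance.

```python
import hashlib
from collections import Counter


def partition_for(key: str, partition_count: int) -> int:
    """Map a key to a partition index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would route keys inconsistently.
    """
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Hypothetical partitions, for example one storage account or database each.
PARTITIONS = ["partition-0", "partition-1", "partition-2", "partition-3"]

keys = [f"customer-{n}" for n in range(1000)]
distribution = Counter(PARTITIONS[partition_for(k, len(PARTITIONS))] for k in keys)
print(distribution)
```

Note that simple modulo routing moves most keys when the number of partitions changes, so a design that expects to add partitions later would need a different placement scheme.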
There are numerous reasons a system can be partitioned to avoid limits, such as the following:
In the case of databases, we can partition vertically, horizontally, or functionally. Just to give you an idea, let’s have a closer look at this:
A full list of recommendations is available here: https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/partition. The next design principle we are going to cover is design for operations.
With the cloud transformation, the regular IT chores of managing hardware and data centers are long gone. IT is no longer responsible for data center management, as it is handled by the cloud provider. Having said that, the IT team or the operations team is still responsible for deploying, managing, and administering the resources deployed in the cloud. Some key areas that the operations team should handle include the following:
A list of recommendations shared by Microsoft can be reviewed at https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/design-for-operations. With that, we will move on to the next design principle.
Unlike on-premises, the cloud offers different service models such as IaaS, PaaS, and Software-as-a-Service (SaaS). Here, we will discuss IaaS and PaaS, as SaaS is more of a finished solution in which the code is managed by the cloud provider rather than by the end customer.
In IaaS, the cloud provider takes care of the infrastructure (physical servers, network, storage, hypervisor, and so on) and the customer can create a VM on top of this hardware. Microsoft is not responsible for maintaining the VM OS; it will be the duty of the customer to update, patch, and maintain the OS and code of the application. In contrast, in PaaS, the cloud provider provides a hosting environment where the infrastructure, OS, and framework are managed by Microsoft. The only thing that the customer needs to do is push their code to the PaaS service, and it’s up and running. Developers can be more productive and write their code without the need to worry about the underlying hardware or its maintenance.
The design principle recommends using PaaS services instead of IaaS whenever possible. IaaS is only recommended if you require more control over the infrastructure, but if you simply require a reliable environment and ease of management, then PaaS is right for you. Table 1.2 shows some of the PaaS replacements in Azure for popular caches, queues, databases, and web solutions that would otherwise run on IaaS:
| Instead of running (IaaS) | Consider deploying (PaaS) |
|---|---|
| Active Directory | Azure AD |
| RabbitMQ | Azure Service Bus |
| SQL Server | SQL Database |
| Hadoop | Azure HDInsight |
| PostgreSQL/MySQL | Azure Database for PostgreSQL/Azure Database for MySQL |
| IIS/Apache/NGINX | Azure App Service |
| MongoDB/Cassandra/Gremlin | Cosmos DB |
| Redis | Azure Cache for Redis |
| File Share | Azure File Share/Azure NetApp Files |
| Elasticsearch | Azure Cognitive Search |
Table 1.2 – IaaS-to-PaaS considerations
This is not a complete list; there are different ways by which you can replace VMs (IaaS) with platform-managed services. Speaking of services, let’s discuss identity services, which are the subject of the next design principle.
This is often considered a subsection of the previous design principle; however, there are some additional key points that we need to cover as part of the identity solution. Every cloud application needs to have user identities. Due to this reason, Microsoft recommends using an Identity-as-a-Service (IDaaS) solution rather than developing your own identity solution. In Azure, we can use Azure AD or Azure AD B2C as an identity solution for managing users, groups, and authentication.
The following recommendations are shared by Microsoft for this design principle:
With that, we will discuss the next design principle.
Most organizations use relational SQL databases for persisting application data. These databases are good for transactions that involve relational data. Keep the following considerations in mind if your preferred option is a relational database:
The recommendation is not to use a relational database for every scenario. There are other alternatives, such as the following:
Choose one based on the type of data that your application handles. For example, if your application handles rain-sensor data, which is basically a time series, then you should go for a time-series database rather than using a relational database. Similarly, if you want to have a product catalog for your e-commerce application, each product will have its own specification. The specifications of a smartphone include brand, processor, memory, and storage, while the specifications of a hair dryer are completely different. Here, we need to store the details of each product as a document, and these will be retrieved when the user clicks on the item. For these kinds of scenarios, you should use a document database. In Azure, this type of product catalog can be stored in Azure Cosmos DB.
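To make the product catalog example concrete, here is a sketch of two catalog items modeled as documents, using plain Python dictionaries rather than any database SDK. The field names, IDs, and values are invented for illustration; the point is only that each document carries its own set of specifications and is fetched whole by its ID.

```python
import json
from typing import List, Optional

# Two hypothetical catalog documents with different specification fields.
catalog: List[dict] = [
    {
        "id": "p-1001",
        "category": "smartphone",
        "name": "Example Phone",
        "specs": {"brand": "Contoso", "processor": "octa-core",
                  "memory_gb": 8, "storage_gb": 128},
    },
    {
        "id": "p-2001",
        "category": "hair-dryer",
        "name": "Example Dryer",
        "specs": {"brand": "Fabrikam", "power_watts": 1800, "heat_settings": 3},
    },
]


def find_product(items: List[dict], product_id: str) -> Optional[dict]:
    """Return the whole document for a product, or None if it does not exist."""
    return next((item for item in items if item["id"] == product_id), None)


dryer = find_product(catalog, "p-2001")
print(json.dumps(dryer, indent=2))
```

In a relational schema, the two products would need either a wide table with many empty columns or a separate attributes table joined at read time; as documents, each item is stored and retrieved in the shape the product page needs.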
To conclude, a relational database is not meant for every scenario; consider using alternatives depending on the data that your application wants to store.
We have two more design principles to be covered before we wrap up, so let’s move on to the next one.
According to Charles Darwin’s theory of evolution, species change over time, give rise to new species, and share a common ancestor. The theory also looks at natural selection, which causes the population to adapt or get accustomed to the environment. Keeping this theory in mind, when you design applications, design for evolution. This design principle talks about the transformation from a monolithic to a microservices architecture. This transformation is more of an evolution to eliminate tight coupling between application components, which makes the system more inflexible and weaker.
Microservices architecture decouples the application components so that they are loosely coupled. If components are tightly coupled, changes in one component create repercussions in another, which makes it very difficult to roll out changes to the system. To avoid this, we can consider a microservices architecture, where we can ship changes to one service without affecting the others.
A list of recommendations for this design principle is available at https://docs.microsoft.com/en-us/azure/architecture/guide/design-principles/design-for-evolution.
Now, we are going to discuss the last design principle. Let’s dive right in!
All the principles we discussed so far are driven by a common factor: business requirements. For example, when we discussed the Make all things redundant design principle, we explored different recommendations for setting up redundant infrastructure. But what if the workload that I have is a proof of concept (POC) or development workload? Do I need to have redundant VMs for a development workload? As you can imagine, development workloads don’t require redundant VMs unless this is demanded by the key factor—business requirements. It might seem apparent, but everything boils down to business requirements.
Leverage the following recommendations to build solutions to meet business needs:
That was the last design principle, and it’s a wrap-up. We have finally completed the elements of the WAF.