When planning for High Availability, you need to look at every aspect of your infrastructure, from the underlying hardware to the software running on the servers that serve the clients.
Some general questions that need to be taken into account when setting up a design are as follows:
Is my network adequately built for redundancy and will it be able to service all the different clients with the large amount of incoming data?
Do I have enough storage to store my data and what will happen in case of a disk failure?
Do my servers have enough compute performance to serve the number of clients available, or do I need to roll out more servers or invest in more hardware?
Is my database solution scaled to handle the data flow? What happens if one of the database servers fails?
What will happen if one of the servers in the site suffers a hardware failure?
What happens if any other critical component in our infrastructure fails?
All these questions need to be addressed in the planning phase. We always need to review a design and ask: is there any single point of failure in it? It does not matter if we set up a massive SQL cluster that is redundant in every other way if we then connect every node of the cluster to the same network switch, because if that particular switch goes down, the cluster goes down with it.
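The single point of failure check can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each cluster node is described by the set of components it needs in order to stay reachable, and any component that every node depends on is reported. The function name and the node and component names (sql-node-1, switch-a, and so on) are hypothetical and do not come from any Configuration Manager tool.

```python
def single_points_of_failure(nodes):
    """Return the components that every node depends on.

    nodes maps a node name to the set of components it needs to stay
    reachable. A component shared by all nodes takes the whole cluster
    down when it fails.
    """
    if not nodes:
        return set()
    dependency_sets = [set(deps) for deps in nodes.values()]
    return set.intersection(*dependency_sets)


# Hypothetical two-node SQL cluster where both nodes share one switch.
shared_switch = {
    "sql-node-1": {"switch-a", "psu-1", "disk-array-1"},
    "sql-node-2": {"switch-a", "psu-2", "disk-array-2"},
}

# The same cluster with each node connected to its own switch.
separate_switches = {
    "sql-node-1": {"switch-a", "psu-1", "disk-array-1"},
    "sql-node-2": {"switch-b", "psu-2", "disk-array-2"},
}

print(single_points_of_failure(shared_switch))      # {'switch-a'}
print(single_points_of_failure(separate_switches))  # set()
```

A real review has to cover far more than this (power, storage paths, site links, and so on), but the principle is the same: list what each redundant member depends on and look for the overlap.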
Coming back to Configuration Manager, let us take a look at how a simple site design might look:
With the simple design shown in the previous diagram, we have the general feature set for Configuration Manager available to our clients. All our Configuration Manager clients contact the Management point for policies, advertisements, reporting of data, and so on. The Management point in turn populates the site database with the information received from the clients.
When the clients need to download a source file for an advertised deployment or for an operating system deployment, they contact the Distribution point within the site. For this site, the data is stored in a single database server, which is collocated with the Primary Site Server. This design also includes a Software Update Point role for patch management and an Endpoint Protection role for the management of endpoint protection.
Let us look into problems with this type of design. For instance, let us see what would happen if the Management point server in the site stops functioning:
The clients will try to contact the Management point to get info about policy updates or report in data.
Since the Management point is unavailable, the clients will look at the list of available Management points in the site to see if there are any others available.
Since this site contains only one Management point, the clients will stop sending data back to the site, start to cache the data locally, and run using the last known configuration.
The clients will do so until the Management point is back online.
Let us see what would happen if we had two Management points in the site we just saw.
Each client would try to contact its first Management point; if that one is offline, the client would look at its list of available Management points and try to contact the other one. This way we would have maintained site functionality for the clients. This gives us a Highly Available Management point solution for the clients, but this is only one of the components that need to be taken into account.
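The behavior described for one and for two Management points can be sketched as a small simulation. This is a simplified model, not the actual client logic: the class names ManagementPoint and Client are hypothetical, the client simply walks its list in order, and it is an assumption of this sketch that cached data is sent as soon as a Management point becomes reachable again.

```python
from dataclasses import dataclass, field


@dataclass
class ManagementPoint:
    name: str
    online: bool = True
    received: list = field(default_factory=list)


@dataclass
class Client:
    management_points: list                      # ordered list of known Management points
    cache: list = field(default_factory=list)    # data waiting to be sent

    def report(self, data):
        """Send data to the first reachable Management point, else cache it."""
        pending = self.cache + [data]
        for mp in self.management_points:
            if mp.online:
                mp.received.extend(pending)      # cached data is sent first
                self.cache = []
                return mp.name
        self.cache = pending                     # no Management point reachable
        return None


mp1 = ManagementPoint("MP1", online=False)
mp2 = ManagementPoint("MP2")

# Site with a single Management point: the data stays in the local cache.
single = Client([mp1])
print(single.report("inventory"))     # None
print(single.cache)                   # ['inventory']

# Site with two Management points: the client falls back to the second one.
redundant = Client([mp1, mp2])
print(redundant.report("inventory"))  # MP2
```

With only MP1 in the list, the report is held locally until MP1 is back online; with MP2 added, the same outage is invisible to the client.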
If the site database stops working, for example because of a faulty hard drive in the site server, the outcomes are the ones described in the upcoming sections. As I mentioned earlier, clients will cache data locally until the site server is restored, but historical data will be lost. One example is software metering information, which can be used for reporting license usage.
These were just a few examples of what might go wrong with this design. It is important to stay ahead when planning. There are also other components besides the ones we just covered in my example that need to be taken into account and they will be covered later in the book.
A solution such as RAID allows redundancy in case of a disk failure on physical servers, and depending on the RAID level, it might boost the server's performance. If you are unaware of what RAID does, we will go through this in greater detail later in the chapter.
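To make the trade-off between RAID levels a little more concrete, here is a minimal sketch that computes the usable capacity and the number of disk failures an array is guaranteed to survive for the common levels 0, 1, 5, 6, and 10. It assumes identical disks and uses the standard textbook formulas; the function name raid_summary is hypothetical, and the sketch says nothing about performance, which depends on the controller and the workload.

```python
# level: (minimum disks, usable disks for n disks, failures always survived)
RAID_LEVELS = {
    0: (2, lambda n: n, lambda n: 0),         # striping, no redundancy
    1: (2, lambda n: 1, lambda n: n - 1),     # mirroring across all disks
    5: (3, lambda n: n - 1, lambda n: 1),     # single parity
    6: (4, lambda n: n - 2, lambda n: 2),     # double parity
    10: (4, lambda n: n // 2, lambda n: 1),   # striped mirrors (pairs)
}


def raid_summary(level, disks, disk_size_tb):
    """Return (usable capacity in TB, disk failures always survived)."""
    if level not in RAID_LEVELS:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, usable, tolerated = RAID_LEVELS[level]
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    if level == 10 and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return usable(disks) * disk_size_tb, tolerated(disks)


# Four 2 TB disks under different RAID levels.
print(raid_summary(0, 4, 2))   # (8, 0)
print(raid_summary(5, 4, 2))   # (6, 1)
print(raid_summary(10, 4, 2))  # (4, 1)
```

RAID 10 can survive more than one failure if the failed disks sit in different mirror pairs; the value returned here is only the number that is always survivable.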
Configuration Manager is highly dependent on Microsoft SQL Server to store site data and client data. Microsoft SQL Server has many built-in solutions for High Availability, and they will be covered in a later chapter.
In case there are roles that cannot be set as highly available, what options do we have to back up the data and the role information, and how can we restore the service it offers to the users as quickly as possible?
Configuration Manager is also highly dependent on other components such as DNS, Active Directory, Active Directory Certificate Services, and DHCP. Are there any High Availability options for them? I will cover this topic in more detail in a later chapter.
Many of these roles, however, are not part of the design phase for a Configuration Manager solution, and in most cases they are already set up redundantly. In the upcoming chapters, we will discuss how we can deploy each role and back-end services such as Microsoft SQL Server using High Availability and load-balancing features. It is important to note that no services in Configuration Manager operate in real time and that no clients require continuous communication with any of the site roles.
Configuration Manager always works on a predefined schedule for each operation; therefore you must expect some latency even if you set up High Availability for your sites.
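The latency that comes from schedule-based operation can be illustrated with simple arithmetic. The sketch below assumes a client that polls for policy on a fixed interval; the 60-minute interval, the timestamps, and the function name next_poll are illustrative assumptions, not values taken from this design.

```python
from datetime import datetime, timedelta


def next_poll(last_poll, interval, now):
    """Return the first scheduled poll that falls strictly after 'now'."""
    elapsed = now - last_poll
    cycles = elapsed // interval + 1     # timedelta // timedelta gives an int
    return last_poll + cycles * interval


last_poll = datetime(2024, 1, 1, 9, 0)       # client last polled at 09:00
interval = timedelta(minutes=60)             # assumed polling interval
change_made = datetime(2024, 1, 1, 9, 10)    # policy changed at 09:10

pickup = next_poll(last_poll, interval, change_made)
print(pickup)                # 2024-01-01 10:00:00
print(pickup - change_made)  # 0:50:00
```

Even when every role is available, a change made just after a poll waits almost a full interval before the client sees it, so High Availability removes outages but not this built-in delay.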
Before we continue, we will take a look at how Microsoft IT deployed Configuration Manager for its own environment, to give a clearer picture of how a large enterprise Configuration Manager deployment might look. The following diagram gives an overview of the design:
Microsoft IT deployed Configuration Manager 2012 for more than 250,000 systems and more than 150,000 users worldwide. This design, and much of the logic that they used when deploying it, can be applied when planning other scenarios as well.
The entire project can be found at the following site:
A point to note here is that there are only six physical servers in the entire design. They are used for the site server roles, which have SQL Server installed (in this case, one in each of the different Primary Sites and one for the CAS server).