This recipe discusses creating an Incident and Problem Management process.
In Incident Management we focus on restoring a service to the known mode of operation it had before an unplanned interruption. In Problem Management we focus on understanding the actual cause of the interruption, with the goal of providing a permanent resolution.
The ITIL® framework books and online resources describe best practices for Incident and Problem Management processes. Plan to review and understand Incident and Problem Management principles as a prerequisite to creating the processes.
An example of the steps for creating an Incident and Problem Management process is as follows.
Here are the example steps specific to an Incident Management process:
Agree on and document the organization's Incident Management policy.
Document the operational process that supports the Incident Management policy. This should include, but not be limited to:
Support hours
Classification categories
Escalation procedures
Create roles and assign people to manage the process. For example:
Service Desk analysts
Desktop support
Infrastructure analysts
Service Desk managers
We typically have two channels for incident management:
Service Desk team-created incidents using the SCSM console.
Sample process steps from incident creation to priority allocation are shown in the following figure:
Automated or end user self-service created incidents (end user web portal, e-mail, or automatic system event driven).
Sample process steps from incident creation to priority allocation are shown in the following screenshot:
The difference between the two typical channels is how the incident is initially categorized (triage). The next step, "Process Incident", involves creating a process flow that matches how the incident management team manage the incident, based on your policies and procedures. An example is shown in the following figure:
Monitor and report on the performance of the Incident Management process. The aim is to improve the process and to identify incidents that require Problem Management.
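As an illustration of the monitoring and reporting step, the following minimal Python sketch calculates the mean time to resolve incidents for each classification category. It assumes the incident records have already been exported from your service management tool; the record layout, the sample data, and the `mean_hours_to_resolve` function are hypothetical and are not part of SCSM.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical export of resolved incidents: (category, created, resolved).
incidents = [
    ("Hardware", datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 13, 0)),
    ("Hardware", datetime(2024, 1, 9, 10, 0), datetime(2024, 1, 9, 12, 0)),
    ("E-mail", datetime(2024, 1, 9, 11, 0), datetime(2024, 1, 9, 11, 30)),
]


def mean_hours_to_resolve(records):
    """Return the mean resolution time in hours for each category."""
    durations = defaultdict(list)
    for category, created, resolved in records:
        hours = (resolved - created).total_seconds() / 3600
        durations[category].append(hours)
    return {category: mean(hours) for category, hours in durations.items()}


report = mean_hours_to_resolve(incidents)
for category, hours in sorted(report.items()):
    print(f"{category}: {hours:.1f} h")
```

A report like this could be reviewed against the support hours and escalation procedures documented in your policy; which measures matter most depends on that policy.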
Here are the example steps specific to a Problem Management process:
Agree on and document the organization's Problem Management policy.
Document the operational process to support the Problem Management policy.
Create roles and assign people to manage the process. For example:
Problem analysts
Problem managers
Review the Incident Management process to identify incidents of the following types:
Repeated issues over a defined period (for example, monthly, quarterly, or annually)
Incidents with known workarounds (a workaround typically implies that there is an opportunity for root cause investigation)
Perform a detailed investigation of incidents escalated to Problem Management, using internal experts or third-party external support.
Create a change request for problems with known permanent fixes.
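The review step above, which looks for repeated issues over a defined period, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the incident log layout, the sample data, the threshold of three, and the `problem_candidates` function are all hypothetical, and the right threshold and review period are for your Problem Management policy to define.

```python
from collections import Counter
from datetime import date

# Hypothetical incident log: (classification category, date opened).
incident_log = [
    ("Laptop network failure", date(2024, 1, 15)),
    ("Laptop network failure", date(2024, 2, 3)),
    ("Laptop network failure", date(2024, 3, 20)),
    ("Password reset", date(2024, 2, 11)),
    ("Laptop network failure", date(2023, 11, 2)),  # outside the review period
]


def problem_candidates(log, start, end, threshold):
    """Return categories seen at least `threshold` times between start and end."""
    counts = Counter(
        category for category, opened in log if start <= opened <= end
    )
    return sorted(
        category for category, count in counts.items() if count >= threshold
    )


# Quarterly review: flag categories with three or more incidents.
candidates = problem_candidates(
    incident_log, date(2024, 1, 1), date(2024, 3, 31), threshold=3
)
print(candidates)
```

Categories returned by a check like this are candidates for escalation to Problem Management, not confirmed problems; an analyst still needs to review them.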
Incident Management is about getting services that people rely on back to an agreed operational state as soon as possible. An example of Incident Management is a customer who is unable to access their documents:
On investigation we find that the issue is with the laptop assigned to the customer.
We issue the customer a loan laptop and confirm that they can access their documents.
The previous steps resolve the incident, but we still have a problem. What is wrong with the customer's laptop?
The answer to the question is Problem Management. We use Problem Management to identify the true (root) cause of the issue. Continuing with our scenario from Incident Management:
The desktop engineering team identify the issue as a network hardware device failure in the laptop.
The team also identify that this issue has been happening to a number of laptops over the last quarter.
The team also identify through asset management that we purchased a set of laptops from a single vendor and that all the issues relate to this set.
We escalate to the vendor and get a driver fix.
A change request is raised to proactively apply the fix to all laptops from the set.
The fix applied to all laptops in scope also resolves the issue on the original laptop. We can close the problem and change the status of the original incident to closed. A final best practice is to create a knowledge article about this known issue and its corresponding fix.
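The relationship between the records in this scenario can be sketched as a small Python data model. This is a simplified illustration, not the SCSM object model: the `Incident` and `Problem` classes, their fields, and the status values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    title: str
    status: str = "Resolved"  # service restored through the loan laptop


@dataclass
class Problem:
    title: str
    status: str = "Active"
    incidents: list = field(default_factory=list)
    change_request: str = ""  # hypothetical free-text reference
    knowledge_article: str = ""

    def close(self, fix_summary):
        """Close the problem, close linked incidents, and record the fix."""
        self.status = "Closed"
        for incident in self.incidents:
            incident.status = "Closed"
        self.knowledge_article = (
            f"Known issue: {self.title}. Fix: {fix_summary}"
        )


incident = Incident("Customer unable to access documents")
problem = Problem("Network device failure on laptop set", incidents=[incident])
problem.change_request = "Apply vendor driver fix to all laptops in the set"
problem.close("vendor driver fix")
print(incident.status, problem.status)
```

Linking incidents to the problem record is what allows one permanent fix to close every related incident and feed a single knowledge article.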
The previous examples illustrate how Incident Management and Problem Management work in practice.