Network Enhancers - "Delivering Beyond Boundaries"

Sunday, February 20, 2011

Dealing with Major Incidents

We know that the primary goal of the incident management process is to restore normal service operation as quickly as possible and to minimize any adverse impact on business operations. This helps ensure that the highest levels of service quality and availability are delivered to the user community, so that the business receives value and can achieve the outcomes it wants.

The value this process produces for the business is in the ability to:
  • detect and resolve incidents quickly, resulting in higher availability of IT services.
  • align IT activities to real-time business priorities and dynamically allocate resources as necessary.
  • identify potential improvements to services through the analysis of incident trends.
So it sounds like we have everything covered as long as we handle all incidents in the same consistent and proceduralized manner. Well, not so fast. What happens when an incident affects a major business process and, in turn, has a major impact on the business?

For these types of situations we need a separate procedure, with shorter escalation timescales and greater urgency, for responding to “Major Incidents”. First, we must agree on a definition of just what constitutes a major incident and how it will be integrated into the overall incident prioritization system.
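As a rough illustration of how a major incident definition might sit on top of an ordinary prioritization scheme, here is a minimal Python sketch. It assumes priority is derived from impact and urgency, and that a "major incident" is a top-priority incident that also affects a major business process. The names `Incident`, `priority`, `is_major`, and `escalation_window`, the 1 to 3 scales, and every time value are hypothetical; the real definition and timescales have to be agreed with the business.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical escalation windows, for illustration only.
# Real values must be agreed between IT and the business.
ESCALATION_WINDOWS = {
    "major": timedelta(minutes=15),  # shorter timescale for major incidents
    1: timedelta(hours=1),
    2: timedelta(hours=4),
    3: timedelta(hours=8),
    4: timedelta(hours=24),
    5: timedelta(hours=48),
}


@dataclass
class Incident:
    summary: str
    impact: int   # assumed scale: 1 = high, 2 = medium, 3 = low
    urgency: int  # assumed scale: 1 = high, 2 = medium, 3 = low
    affects_major_business_process: bool = False


def priority(incident: Incident) -> int:
    """Map impact and urgency onto a priority from 1 (highest) to 5 (lowest)."""
    return incident.impact + incident.urgency - 1


def is_major(incident: Incident) -> bool:
    """Assumed definition: top priority AND a major business process is affected."""
    return incident.affects_major_business_process and priority(incident) == 1


def escalation_window(incident: Incident) -> timedelta:
    """Return how long an incident may stay unresolved before it is escalated."""
    key = "major" if is_major(incident) else priority(incident)
    return ESCALATION_WINDOWS[key]


outage = Incident("Order processing unavailable", impact=1, urgency=1,
                  affects_major_business_process=True)
print(is_major(outage), escalation_window(outage))
```

Note that the major incident here is still an `Incident` and goes through the same priority calculation; only the escalation window changes.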

Note: Many organizations that I have corresponded with confuse this separate procedure with problem management. A major incident may grow in business impact, and therefore in the priority with which the ITSM processes must address it, but it remains an incident and never becomes a problem.

Where necessary, the major incident procedure should include the formation of a separate and dynamic major incident team (under the leadership of the incident manager) to concentrate on that incident alone and ensure that adequate resources are engaged and solely focused on providing a swift resolution to the impact at hand. Problem management can be involved if the underlying cause needs to be discovered at the same time, but the incident manager must ensure that restoration of services and root cause analysis are kept separate and that impact reduction is the priority.
