Service Management
TimelessMIND Enterprise Problem Manager
Problem Management examines the underlying or root cause of an incident, and aims to prevent incidents with similar symptoms from occurring in the future. By removing errors, which often requires a structural change to the IT Infrastructure in an organization, the number of incidents can be reduced over time.
The goal of Problem Management is to resolve the root cause of incidents. Doing so minimizes the adverse impact on the business of incidents and problems caused by errors within the IT Infrastructure, and prevents the recurrence of incidents related to these errors. A ‘problem’ is an unknown underlying cause of one or more incidents, and a ‘known error’ is a problem that is successfully diagnosed and for which either a work-around or a permanent resolution has been identified.
Problem as defined by ITIL
A problem is a condition often identified as a result of multiple Incidents that exhibit common symptoms. Problems can also be identified from a single significant Incident, indicative of a single error, for which the cause is unknown, but for which the impact is significant.
Known Error as defined by ITIL
A known error is a condition identified by successful diagnosis of the root cause of a problem, and the subsequent development of a Work-around.
Implementing an Incident and Problem Management Platform
A successfully implemented Incident and Problem Management Platform tracks incidents and provides a complete audit trail that links each stage of a fault. The fault or disruption of normal operation is handled by Incident Management. Once it indicates an error in the infrastructure, it is handled by Problem Management, and a problem for which the cause has been found is referred to as a Known Error. This linkage is crucial in determining the underlying problem and resolving the one or many Incidents caused by it.
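The relationship between incidents, problems, and known errors described above can be sketched as a small data model. This is a minimal illustration only, not the data model of any particular product; the class names, fields, and the `record_diagnosis` method are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    """A fault or disruption of normal operation (hypothetical fields)."""
    incident_id: str
    summary: str


@dataclass
class Problem:
    """An underlying cause of one or more incidents, unknown until diagnosed."""
    problem_id: str
    incidents: List[Incident] = field(default_factory=list)
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    audit_trail: List[str] = field(default_factory=list)

    def link_incident(self, incident: Incident) -> None:
        # Every linked incident is noted so the trail runs from incident to problem.
        self.incidents.append(incident)
        self.audit_trail.append(f"linked incident {incident.incident_id}")

    def record_diagnosis(self, root_cause: str, workaround: str) -> None:
        # Recording the cause and a work-around is what turns a problem into a known error.
        self.root_cause = root_cause
        self.workaround = workaround
        self.audit_trail.append(f"diagnosed: {root_cause}")

    @property
    def is_known_error(self) -> bool:
        # Assumption: a known error needs both a diagnosed cause and a work-around.
        return self.root_cause is not None and self.workaround is not None
```

A problem created this way starts out with an unknown cause, and `is_known_error` only becomes true once a diagnosis has been recorded, which mirrors the progression from Problem to Known Error.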
Problem Management Lifecycle
When implementing a Problem Management Platform to support the Problem Management cycle, the ability to manage a problem through its entire lifecycle is critical. As mentioned under Incident Management, managing a problem through the entire lifecycle matters because it allows important data to be collected. One of the most important features of Problem Management software is an integrated, searchable knowledge base that can be populated with common solutions and work-arounds to known problems. The data collected there can be used to solve new problems that arise with common symptoms. It also allows a company to manage and report on Key Performance Indicators (KPIs) such as the number of incidents by category, resolution, priority and service level agreement.
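As a rough illustration of how a searchable knowledge base might match a new incident against known problems, the sketch below ranks entries by how many symptom words they share with the query. The entry layout (`symptoms` and `workaround` keys) and the function names are assumptions made for this example; a real platform would likely use a proper search index.

```python
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of words without punctuation."""
    words = (word.strip(".,;:!?").lower() for word in text.split())
    return {word for word in words if word}


def search_knowledge_base(symptoms: str, entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return entries sharing at least one symptom word, best match first."""
    query = tokenize(symptoms)
    scored = []
    for entry in entries:
        overlap = len(query & tokenize(entry["symptoms"]))
        if overlap > 0:
            scored.append((overlap, entry))
    # Sort on the overlap count only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]
```

Word overlap is a deliberately simple scoring choice; it is enough to show how stored work-arounds can be reused when a new incident shows common symptoms.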
ITIL and its Impact on Problem Management
Enforcing standardized methods and procedures ensures that incidents are handled efficiently and promptly, which reduces their impact on the company. An automated escalation system prioritizes and routes incidents according to specific requirements, for instance the stage of the lifecycle in which the incident occurred, its impact on the service, and the amount of time taken to correct it. This data enables a company to track incidents and provides the basis for finding a work-around or a solution. Collecting it also allows problem trends to be identified and resolved early.
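One of the escalation criteria mentioned above is the amount of time taken to correct an incident. A minimal sketch of such a time-based check follows. The impact levels and time limits are invented for illustration and are not taken from ITIL or from any specific service level agreement.

```python
from datetime import datetime, timedelta

# Hypothetical limits: how long an incident may stay open before it is escalated.
ESCALATION_LIMITS = {
    "high": timedelta(hours=4),
    "medium": timedelta(hours=24),
    "low": timedelta(hours=72),
}


def needs_escalation(opened_at: datetime, now: datetime, impact: str) -> bool:
    """Return True when the incident has been open longer than its limit allows."""
    return now - opened_at > ESCALATION_LIMITS[impact]
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the rule easy to test and to replay against historical incidents.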
Problem Management versus Incident Management
Problems can be managed, and the data collected about them can be stored for future reference. Problem Management differs greatly from Incident Management in ITIL. Incident Management is designed to restore a company’s service in accordance with its service level agreement, and to find a way for the company to continue operating with minimal adverse effects on its business.
Problem Management finds solutions to problems. Problems are created from one or more incidents with common symptoms. To find a viable solution, a change to the company’s IT infrastructure is typically made. Errors are removed, and data is collected at every stage, from the initial incident to the final change that resolves the error. Problem Management takes more time to carry out and is designed for long-term gain. Incident Management is designed for short-term success until a long-term solution can be put in place by implementing a change.
Documentation of Problem Management
Audit trails are tracked in detail for all Problems. An audit trail is a source of precise data collected from the initial incident, through the problem, to the solution. The data is collected and stored in the CMDB (Configuration Management Database), where it makes identifying common symptoms of incidents easier, which can result in a faster solution to the problem.
Proper Routing of Problem Management Information
Problem Management Platforms allow for automatic routing and escalation of problems based on each problem’s urgency and severity. The automated systems can classify all problems and route them accordingly. As a result, the most severe problems get attention first, followed in order by the rest, down to incidents that affect applications only minimally. This can help avoid unplanned outages and downtime by giving dangerous problems attention as they arise.
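Routing by severity and urgency, as described above, can be approximated with a priority queue. The sketch below uses Python's standard `heapq` module. The severity labels, the numeric urgency scale (1 is most urgent), and the `EscalationQueue` class are assumptions made for this example.

```python
import heapq
from typing import List, Tuple

# Hypothetical severity labels; a lower rank is handled first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


class EscalationQueue:
    """Hands out problems with the most severe and most urgent first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, int, str]] = []
        self._counter = 0  # tie-breaker so equal priorities keep submission order

    def submit(self, problem_id: str, severity: str, urgency: int) -> None:
        # Assumption: urgency is an integer where 1 means most urgent.
        entry = (SEVERITY_RANK[severity], urgency, self._counter, problem_id)
        heapq.heappush(self._heap, entry)
        self._counter += 1

    def next_problem(self) -> str:
        """Remove and return the ID of the highest-priority problem."""
        return heapq.heappop(self._heap)[3]
```

Severity is compared before urgency here, so a critical problem always precedes a low-severity one regardless of urgency. A different platform might weigh the two differently.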
The advantage of the CMDB
Automated systems make real-time reporting easy, whether at a high level or down to the detail of every problem. The system documents every detail in the problem lifecycle and records it precisely and accurately in the Configuration Management Database. This documentation makes it possible to manage and report on Key Performance Indicators (KPIs) such as root cause and trends in repeat problems.
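To make the reporting idea concrete, the sketch below counts problems by category and by root cause and flags root causes that occur more than once. The record layout (`category` and `root_cause` keys) and the `kpi_report` function are hypothetical. A real CMDB would be queried through its own interface.

```python
from collections import Counter
from typing import Dict, List


def kpi_report(problems: List[Dict[str, str]]) -> Dict[str, object]:
    """Summarize problem records into simple KPI counts."""
    by_category = Counter(p["category"] for p in problems)
    by_root_cause = Counter(p["root_cause"] for p in problems)
    # A root cause seen more than once points to a repeat problem.
    repeat_root_causes = sorted(
        cause for cause, count in by_root_cause.items() if count > 1
    )
    return {
        "by_category": dict(by_category),
        "by_root_cause": dict(by_root_cause),
        "repeat_root_causes": repeat_root_causes,
    }
```

Running a report like this at regular intervals and comparing the results is one simple way to see whether repeat problems are trending up or down.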