Proactive operations – how to shift from reactive to proactive operations
One of the biggest challenges for most IT organizations is transforming from a reactive to a proactive operations model. The ITIL framework defines maturity levels for all processes and functions, which largely help to determine whether you run a reactive or a proactive IT organization. It seems easy: common sense dictates that you perform a maturity level assessment, define the transformation and make the change. In practice it is hard work, and it requires senior stakeholders to commit honestly to a transformation plan from reactive to proactive operations and to see it implemented.
The ITIL framework often characterizes the classic reactive model of operations within an IT organization as a process with the following traits:
- Minimal management commitment.
- No or little process or function governance.
- Most if not all activities are uncoordinated with little or no consistency.
- Only a few procedures, if any, exist.
- No definition of roles or processes. Individuals performing the roles get no formal training.
- No operations automation. Operations are handled on a manual, ad-hoc basis.
- There is little service or customer focus, and feedback is not captured or followed up.
The points above summarize the reactive process at its worst. That is not to say that reactive operations don’t get the job done. The company might be doing well thanks to its staff and the experience they have gained on the job. As long as operations team members react promptly to the alerts they get and monitor events regularly, it may seem that there is no need for proactive operations. Incidents happen, but they are addressed and resolved, so why spend effort on optimization and a proactive strategy? Right? Wrong.
A reactive IT operation is very vulnerable to staff issues such as staff moves, soft skills problems, a lack of training programs, low motivation or on-the-job burnout. Even with the best tools and additional resources, reactive operations can only become faster at reacting; they will not proactively prevent problems or improve internal and external customer satisfaction.
Proactive is defined as intervening in, controlling or driving an expected occurrence or situation. The most mature level of the ITIL framework, optimized, is all about management governance, control and leadership focus in all activities within the IT organization, without exception. The optimized maturity level can be distinguished by visible and ongoing management commitment to the following:
- Complete process or function governance.
- All activities are coordinated with consistency throughout all processes or functions.
- All procedures are fully documented.
- All roles for processes or functions are defined. Individuals performing the roles undergo formal training and verification.
- Operations are mostly automated. There are no ad-hoc operational processes.
- Customer focus and feedback are captured and verified against best practices.
- Continual improvement activities are in place to ensure continued customer satisfaction.
To truly become proactive with your IT operations you must be in control of them. Team members must work together to make the processes and functions behave as expected. A change in the IT operations strategy is imperative to make the transition from a reactive to a proactive operations ecosystem. The different operation phases cannot be treated as independent, isolated steps in moving a customer need from concept to a working product that generates revenue for the business. They must be looked at as parts of a whole, with each team depending on and working with the others to deliver business value, from design to production readiness, in a timely manner.
Shift from reactive to proactive IT operations
In a complex world of interdependent processes and globally distributed teams, changing ways of working is no easy task. The analytical and technical capabilities needed to assess the requirements and propose an ideal, value-effective solution may not always be available within the IT organization. Every solution should be planned with caution and without rush. We suggest the following structure to achieve the best and longest-lasting results.
Any large-scale shift creates tremendous potential and, at the same time, a high risk of getting it wrong. There are a few crucial points we would like to highlight that align all activities toward the goal.
Every process change yields the biggest rewards when it has strong leadership and commitment from senior management. General chaos is a very strong incentive for management to commit, but that is seldom the case. Typically, leadership must see the benefits in customer satisfaction, stronger governance, regulatory compliance or increased revenue before it approves any large-scale change to the way IT operations work.
A change in processes or functions will often have an impact on the IT operations platform and may prompt a redesign of the enterprise architecture. A customized enterprise architecture that meets business needs is essential to business success in the digital economy.
Baseline ITIL maturity level measurement
In order to make any kind of shift or change, it is important to measure the current state.
The ITIL framework describes maturity on a scale that runs from level 0 to level 5:
- Level 0, the starting point, is described as a general lack or partial absence of a process or function. There is no defined structure and there are no defined responsibilities, which leads to a lack of consistency.
- Level 1 is characterized by no governance over the process or function. Only some records of activity exist, depending on the skills of the current participant, and no automation is in place.
- Level 2 is characterized by some degree of management commitment. Participants’ roles are recognized but not clearly defined or documented, and operations are still handled on an ad-hoc basis.
- Level 3 is characterized by visible management commitment. Service and process roles are assigned to knowledgeable and trained participants, and there is some level of automation within the organization.
- Level 4 processes or functions are stable and rarely fail. Documentation is complete and covers all aspects of the organization, and a great deal of pre-emptive measures and continual improvement plans are in place.
- Level 5, the optimized maturity level, is characterized by effective control, governance and focus visible throughout all levels of the IT organization.
We describe the process of performing the assessment in another article. Please refer to it for insight and strategic planning.
Proactive Operations Strategy Pillars
Many important factors should be part of a proactive operations strategy. The specific recommendations and focus points depend on, among other things, the organizational structure in terms of both services and human resources, the current level of operations maturity and, most importantly, the business strategy.
Over the years, we have found in our practice that the following Service Operations functions matter most: Monitoring, Event Management and the CMDB. To shift from reactive to proactive operations, the company must identify its current level of ITIL maturity and then start drafting the transition plan. Below we describe the most important characteristics of Monitoring, Event Management and the CMDB.
Monitoring is a function of ITIL Service Operation responsible for repeated observation of a Configuration Item, IT Service or Process to detect Events and to ensure that the current status is known.
Events are predetermined for Configuration Items, which typically include IT Services, hardware, software, buildings, people and formal documentation such as Process documentation and SLAs. An Event is triggered so that a specific action takes place.
The most advanced form of monitoring is Active Monitoring, where, for example, fully automated processes are set up to notify the help desk of any events that need special treatment before a disaster occurs. The next level of automation is to trigger automatic events based on the monitoring events.
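As a minimal sketch of this idea, the Python below turns metric samples into warning events before a hard limit is reached and dispatches each event either to an automatic action or to a help desk notification. The CI names, the 80% threshold, the `disk_warning` event type and the cleanup action are all hypothetical and not taken from any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    ci: str        # Configuration Item the event refers to
    kind: str      # hypothetical event type, e.g. "disk_warning"
    value: float   # observed metric value


def check_disk(ci: str, used_pct: float, warn_at: float = 80.0) -> Optional[Event]:
    """Return a warning event once usage reaches the assumed warning threshold."""
    if used_pct >= warn_at:
        return Event(ci=ci, kind="disk_warning", value=used_pct)
    return None


def run_checks(samples: Dict[str, float],
               actions: Dict[str, Callable[[Event], str]]) -> List[str]:
    """Turn metric samples into events and dispatch each one to its action."""
    results = []
    for ci, used_pct in samples.items():
        event = check_disk(ci, used_pct)
        if event is None:
            continue
        # Fall back to notifying the help desk when no automation is registered.
        handler = actions.get(event.kind, lambda e: f"notify help desk: {e.ci}")
        results.append(handler(event))
    return results


# Hypothetical automatic action: start a cleanup before the disk fills up.
actions = {"disk_warning": lambda e: f"cleanup started on {e.ci} at {e.value:.0f}%"}
outcome = run_checks({"db-01": 91.0, "web-01": 42.0}, actions)
print(outcome)
```

With no action registered for an event type, the same call falls back to the help desk notification, which mirrors the two levels of automation described above.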
The Configuration Management Database (CMDB) is the central database of IT service management. As long as it is a repository of accurate and up-to-date information, it can provide the foundation for successfully aligning IT with any number of business activities, from supporting regulatory requirements to a business analytics project. The CMDB stores Configuration Items (CIs) with their attributes and their relationships to other Configuration Items. CMDB solutions are currently available from vendors that enable IT operations staff and administrators to manage the hardware infrastructure.
An important part of the process is the careful identification of the CIs that make up the infrastructure, including their types, relationships, ownership and lifecycle status. Part of the CI identification process is agreeing on how CIs are filed and classed. For example, are virtual machines assets, and what operations need to be tracked against them that make them assets? If the data you collect is inaccurate, incomplete or not validated, you will fail to meet your implementation objectives. Using standard names, unique identifiers and rules-based reconciliation to ensure accuracy helps keep the CMDB clear and easy to maintain in the long term.
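The sketch below illustrates standard names, unique identifiers and rules-based reconciliation in a few lines of Python. The `ConfigurationItem` fields, the naming rule in `normalize_name` and the merge rule in `reconcile` are assumptions made for this example; a real CMDB product will have its own schema and reconciliation engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ConfigurationItem:
    ci_id: str                 # unique identifier used for reconciliation
    ci_type: str               # e.g. "server", "virtual_machine"
    owner: str
    lifecycle: str             # e.g. "in_service", "retired"
    related_to: Set[str] = field(default_factory=set)


def normalize_name(raw: str) -> str:
    """Assumed naming standard: trimmed, lowercase, domain suffix removed."""
    return raw.strip().lower().split(".")[0]


def reconcile(cmdb: Dict[str, ConfigurationItem],
              discovered: List[ConfigurationItem]) -> List[str]:
    """Merge discovered records into the CMDB; return the ids that were rejected."""
    rejected = []
    for item in discovered:
        key = normalize_name(item.ci_id)
        if not key or not item.owner:
            rejected.append(item.ci_id)   # incomplete records are not stored
            continue
        if key in cmdb:
            # Assumed rule: keep the existing record, merge only new relationships.
            cmdb[key].related_to |= item.related_to
        else:
            item.ci_id = key              # store the record under its standard name
            cmdb[key] = item
    return rejected


cmdb: Dict[str, ConfigurationItem] = {}
rejected = reconcile(cmdb, [
    ConfigurationItem("APP-01.example.com", "server", "ops", "in_service", {"db-01"}),
    ConfigurationItem("app-01", "server", "ops", "in_service", {"lb-01"}),
    ConfigurationItem("vm-07", "virtual_machine", "", "in_service"),
])
print(sorted(cmdb["app-01"].related_to), rejected)
```

The two spellings of the same server collapse into one record with both relationships, and the record without an owner is rejected instead of polluting the database.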
An Event is a change of state that has significance for the management of a Configuration Item or IT Service. The term Event is also used to mean an Alert or notification created by an IT Service, Configuration Item or Monitoring tool. Events typically require IT Operations personnel to take action and often lead to Incidents being logged. The Event Review and Closure process also makes sure that Event logs are analyzed in order to identify trends or patterns that suggest corrective action must be taken.
Event Management relies heavily on the CMDB and Monitoring ITIL processes. The main objective is to set up and maintain the mechanisms for generating meaningful Events, along with effective rules for filtering and correlating them. When events are structured and correlated, the organization can move to the next level of event automation, where certain events serve as automatic triggers for specific subsequent events.
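To make filtering and correlation concrete, here is a small Python sketch. It assumes events arrive as simple tuples, that a CMDB extract maps each CI to the service it supports, and that events affecting the same service in the same 60-second bucket belong together. The tuple shape, the `CI_TO_SERVICE` mapping and the fixed window are illustrative assumptions; production tools use richer rules, and fixed buckets can split two events that fall on either side of a bucket boundary.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (timestamp in seconds, Configuration Item, severity, message): hypothetical shape
RawEvent = Tuple[int, str, str, str]

# Hypothetical CMDB extract: which service each CI supports.
CI_TO_SERVICE = {"db-01": "billing", "app-01": "billing", "web-09": "intranet"}


def filter_events(events: List[RawEvent]) -> List[RawEvent]:
    """Drop informational noise; keep only warnings and errors."""
    return [e for e in events if e[2] in {"warning", "error"}]


def correlate(events: List[RawEvent],
              window: int = 60) -> Dict[Tuple[str, int], List[RawEvent]]:
    """Group events by affected service and by fixed time bucket of `window` seconds."""
    groups: Dict[Tuple[str, int], List[RawEvent]] = defaultdict(list)
    for event in sorted(events):
        timestamp, ci, _severity, _message = event
        service = CI_TO_SERVICE.get(ci, "unknown")
        groups[(service, timestamp // window)].append(event)
    return dict(groups)


events = [
    (5, "db-01", "error", "connection refused"),
    (12, "app-01", "error", "query timeout"),
    (20, "web-09", "info", "login ok"),
    (130, "db-01", "warning", "slow query"),
]
correlated = correlate(filter_events(events))
print({key: len(group) for key, group in correlated.items()})
```

The database and application errors end up in one group for the billing service, which is the kind of structured input that later automation can act on as a single trigger.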
Proactive Operations Risks
Although Proactive Operations enable IT organizations to use their infrastructure in the most efficient manner, mistakes are sometimes made.
Lack of IT Service Modeling
IT Service modeling typically strives to create models that provide a comprehensive view of the analysis, design, and architecture of all ‘Software Entities’ in an organization. Service modeling typically encourages viewing software entities as ‘assets’ (service-oriented assets), and refers to these assets collectively as ‘services’.
IT service modeling projects fail over and over again due to inadequate tools, manual processes and ever-changing IT environments. To succeed, an organization needs to design and create each service with the proper relationships to all the services it interacts with, and to attach accurate and complete event monitoring to that service. Although it sounds simple, it is surprising how many organizations don’t get it right.
Finally, it is crucial to include IT Service Modeling in continual process improvement. Periodically reviewing all service model maps and comparing them with real-life data makes it possible to find and correct flaws in the model.
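One simple way to picture such a review is as a comparison of two sets of dependencies: those documented in the service model and those actually observed. The Python sketch below assumes both can be exported as (service, component) pairs; the names and the sample data are hypothetical.

```python
from typing import Dict, Set, Tuple

Dependency = Tuple[str, str]   # (service, component it depends on)


def model_drift(modeled: Set[Dependency],
                observed: Set[Dependency]) -> Dict[str, Set[Dependency]]:
    """Compare the documented service model with dependencies seen in real data."""
    return {
        "missing_from_model": observed - modeled,   # real but undocumented
        "stale_in_model": modeled - observed,       # documented but no longer seen
    }


modeled = {("billing", "db-01"), ("billing", "app-01"), ("billing", "mq-old")}
observed = {("billing", "db-01"), ("billing", "app-01"), ("billing", "cache-02")}
drift = model_drift(modeled, observed)
print(drift)
```

An empty result in both categories would mean the model still matches reality; anything else is a candidate for the next review cycle.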
In our practice we often notice a tendency to automate event management too much and too quickly. Senior managers often press to automate as many processes as possible. Another issue is the speed of automation: it is not wise to roll out event automation fast. If a large-scale, fast roll-out is nevertheless required by the business strategy, the most important part of event automation is complete and thorough testing with all the parties involved.
Depending on the size of operations, setting up event management can be a lengthy process that takes significant effort. When you do it well, however, your organization gains time to concentrate on its business objectives.
CMDB Automatic Discovery Tools
Modern IT organizations are highly dynamic and subject to constant change, which makes auto-discovery tools essential. But this in itself carries risks. In a bid to create a more complete inventory, some organizations populate the CMDB with every conceivable CI, including desktops, laptops and even smartphones.
Auto-discovery allows you to capture more CIs more quickly, but that doesn’t mean you should. IT should focus on the most critical items within the server estate that reflect the business strategy. Define strategic objectives, roles and clear business sponsorship so that everyone agrees on what you are trying to achieve and, therefore, on what you need to discover in the process.
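In practice, this often comes down to an explicit scope applied to the discovery output before anything reaches the CMDB. The following Python sketch assumes discovery results are available as simple records with a `type` field and that the agreed scope is a set of CI types; both are assumptions for illustration only.

```python
from typing import Dict, List

# Assumed scope agreed with the business sponsor; adjust to your own strategy.
IN_SCOPE_TYPES = {"server", "database", "load_balancer"}


def select_for_cmdb(discovered: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only discovered items whose type is within the agreed scope."""
    return [item for item in discovered if item.get("type") in IN_SCOPE_TYPES]


discovered = [
    {"name": "db-01", "type": "database"},
    {"name": "laptop-4411", "type": "laptop"},
    {"name": "phone-0093", "type": "smartphone"},
    {"name": "app-01", "type": "server"},
]
selected = select_for_cmdb(discovered)
print([item["name"] for item in selected])
```

Keeping the scope in one explicit place makes it easy to review with the business sponsor and to widen later if the strategy changes.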
With a proper understanding of the current IT operations situation, the shift from reactive to proactive operations can focus on the most important business strategy goals. Depending on the situation, one solution may be to focus on the Event Management and Incident Management processes, with emphasis on the connection between them. Identifying and measuring key metrics can help the business reduce Mean Time to Resolution (MTTR) and increase Mean Time Between Failures (MTBF), two key factors in increasing levels of service availability.
Moving from reactive to proactive operations requires good analysis. With the key objectives and risks identified here, it will be easier to achieve a smooth migration to fully proactive operations.
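The relationship between these two metrics and availability can be shown with a short calculation. The Python sketch below uses the common definitions: MTTR as the average incident duration, MTBF as total uptime divided by the number of failures, and availability as MTBF / (MTBF + MTTR). The incident data and the 720-hour observation period are made up for illustration, and the functions assume at least one incident in the period.

```python
from typing import List, Tuple

# Each incident is (start_hour, end_hour) within the observation period; sample data.
Incident = Tuple[float, float]


def mttr(incidents: List[Incident]) -> float:
    """Mean Time to Resolution: average duration of an incident, in hours."""
    return sum(end - start for start, end in incidents) / len(incidents)


def mtbf(incidents: List[Incident], period_hours: float) -> float:
    """Mean Time Between Failures: total uptime divided by the number of failures."""
    downtime = sum(end - start for start, end in incidents)
    return (period_hours - downtime) / len(incidents)


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the service is up, derived from MTBF and MTTR."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


incidents = [(100.0, 102.0), (400.0, 401.0), (650.0, 653.0)]   # 6 hours of downtime
period = 720.0                                                 # a 30-day month
mttr_hours = mttr(incidents)
mtbf_hours = mtbf(incidents, period)
print(mttr_hours, mtbf_hours, round(availability(mtbf_hours, mttr_hours), 4))
# prints: 2.0 238.0 0.9917
```

Halving MTTR to one hour at the same MTBF would raise availability from about 99.17% to about 99.58%, which is why faster resolution and fewer failures both matter.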