Major Incident Management
"Developing a Single Enterprise Major Incident Management Process for a Transport Agency"
When a significant IT-related service disruption occurred, a major Transport Agency realised the need to formalise its major incident management (MIM) process. Kirk Penn, from Service Management Specialist, was engaged to develop a single, standardised process for responding to, restoring service after, and recovering from major IT incidents.
The Transport Agency IT group supported over 30,000 internal staff and hundreds of complex systems. While the existing day-to-day incident management process worked well for lower-priority incidents, major disruptions left the agency struggling to cope. The challenge was to establish a cross-agency major incident working group and get everyone to follow a single way of working under pressure, amid competing operational priorities.
Kirk developed a strawman major incident communications model and approach that clarified inputs and triggers, roles and responsibilities, communication guidelines, and governance for technical and management conference bridges during a major incident. He also drafted a policy and process, and created simplified one-page overviews so that each stakeholder could understand their role during each phase of the MIM process. Senior management were briefed, and a significant rollout campaign, including simulations, was undertaken to ensure all stakeholders were clear on their contributions in the event of a major incident.
The major incident management process was successfully implemented and remains a stable and valuable capability within the transport agency IT group. As a result, the Transport for NSW IT leadership gained confidence and endorsed further investment in resources and the centralisation of the MIM function. Additionally, a version of the MIM communication model was adopted for managing P2 incidents, providing increased controls for lower-priority incidents and reducing the likelihood of these becoming major incidents.