Introduction

When a major IT incident strikes, it’s not just the technical fixes that count – how you communicate also makes a big difference. This balancing act involves keeping everyone, from your front-line IT staff to the C-suite and your clients, in the loop with the right information.

In this post, I’ll dive into the nuances of effective communication during such events. You’ll learn how to craft messages that resonate with each audience, ensuring they’re informed, reassured, and ready to respond.

In a previous post, I looked at how to define incidents. To summarise, major incidents in ITSM are scenarios marked by severe service disruption or a substantial risk of such disruption – critical events that can adversely affect business continuity, customer trust, and the company’s bottom line.

Identifying Stakeholders

In the B2B (Business to Business) realm, your stakeholder map can often include direct business clients, their end-users, and sometimes, third-party service providers. These clients might depend on your services for their daily operations, so their primary concern is how the incident might disrupt their workflow and service delivery to their customers.

In contrast, in B2C (Business to Consumer) settings such as retail, you may interact directly with individual customers. In such scenarios, stakeholders are primarily concerned with how the incident affects their personal experience – be it shopping, service availability, or data security. Communicating with these stakeholders requires an understanding of their specific expectations and needs.

Tailoring the Message

Effective messaging in a major IT incident goes beyond knowing your audience; it’s about delivering the right content in the right way. Each group – the IT team, executives, and clients – has different concerns and needs different information. Here’s how to tailor your message for maximum clarity and impact.

  • For your first responders, it’s all about actionable details. They need a deep dive into the technical aspects – error logs, system failures, triage steps already taken, and so forth. Imagine a complex database issue; the team would need specifics like query failures, server load issues, and potential patches or workarounds. The focus here is on providing them with a clear action plan and the technical data required for a swift resolution.
  • The executive teams, however, need to grasp the business implications. It’s about translating technical disruptions into business language – like customer impact, service downtime, and potential financial implications. For example, in a data breach scenario, they would need to understand the scale of the breach, the type of data compromised, potential legal repercussions, and steps being taken to mitigate the situation. Here, the emphasis is on strategic response and maintaining business integrity.
  • When communicating with clients, it’s essential to balance honesty with reassurance. Clients need to know what’s happened, but in a way that doesn’t cause panic. For instance, in a service outage, explain the issue without technical jargon, assure them about data safety, and provide a realistic timeline for resolution. This approach is about maintaining trust and demonstrating commitment to resolution and service quality.
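The three audiences above can be sketched in code as one incident record rendered three different ways. This is only an illustration: the `Incident` fields, the audience labels and the wording of each message are assumptions made for the example, not part of any ITSM tool.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical record; the field names are illustrative assumptions.
    service: str
    technical_detail: str   # e.g. error logs, failing queries
    business_impact: str    # e.g. customer impact, downtime
    eta: str                # realistic timeline for resolution


def tailor_message(incident: Incident, audience: str) -> str:
    """Return a message whose emphasis suits the given audience."""
    if audience == "responders":
        # First responders get the technical specifics.
        return f"{incident.service}: {incident.technical_detail}"
    if audience == "executives":
        # Executives get the business implications.
        return (f"{incident.service} is disrupted. "
                f"Business impact: {incident.business_impact}. "
                f"Estimated resolution: {incident.eta}.")
    if audience == "clients":
        # Clients get a jargon-free summary and a timeline.
        return (f"We are aware of an issue affecting {incident.service} "
                f"and are working to resolve it. "
                f"Estimated resolution: {incident.eta}.")
    raise ValueError(f"Unknown audience: {audience}")


incident = Incident(
    service="Order database",
    technical_detail="query failures and high server load",
    business_impact="checkout unavailable for some customers",
    eta="within 2 hours",
)
for audience in ("responders", "executives", "clients"):
    print(tailor_message(incident, audience))
```

Keeping a single source record and varying only the rendering is one way to make sure every audience hears a consistent story, even though the level of detail differs.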

Communication Timings and Channels

The timing and choice of communication channels are as crucial as the content. Immediate initial communication is essential to establish awareness and control the narrative. For example, an immediate alert via SMS, Teams or Slack can inform all stakeholders that an incident has occurred, with more detailed email updates to follow.

Regular updates, perhaps hourly or as significant developments occur, are crucial to maintain stakeholder trust and manage expectations. Choosing the right communication channels is equally vital. While email is suitable for detailed messaging or formal communication, instant messaging platforms may be better for rapid, real-time updates to other internal teams. In contrast, for external stakeholders like clients, a mix of emails, company website updates or social media posts might be appropriate, depending on the incident’s nature and the company’s communication policy. The key is to maintain a balance, ensuring stakeholders are informed but not overwhelmed with information.
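As a minimal sketch, the channel choices and update cadence described above could be written down ahead of time as a simple lookup and a schedule. The audience names, the channel mapping and the `update_schedule` helper are assumptions for illustration; your own communication policy should decide the real values.

```python
from datetime import datetime, timedelta

# Hypothetical mapping of audience to channels, loosely following the text:
# instant messaging for internal teams, a broader mix for external clients.
CHANNELS = {
    "internal_teams": ["Slack", "Teams"],
    "executives": ["SMS", "email"],
    "clients": ["email", "website", "social media"],
}


def update_schedule(start: datetime, interval_minutes: int, count: int) -> list[datetime]:
    """Return the times of the next `count` updates after the initial alert."""
    return [start + timedelta(minutes=interval_minutes * i)
            for i in range(1, count + 1)]


detected = datetime(2024, 1, 1, 9, 0)
for when in update_schedule(detected, interval_minutes=60, count=3):
    print(when.strftime("%H:%M"), "update due via", ", ".join(CHANNELS["clients"]))
```

Significant developments would still trigger an update outside this fixed cadence; the schedule only sets the longest gap stakeholders should expect between messages.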

Consequences of Poor Communication

Beyond the immediate disruption caused by the incident itself, ineffective communication can have serious long-term consequences. Delayed or inaccurate information can erode trust with stakeholders or customers, leading to frustration, confusion, and even panic. This can damage the company’s reputation, impact loyalty, and potentially result in legal repercussions, especially in cases involving data breaches or privacy violations.

Case Study

Consider a global online retailer experiencing a major website outage during a peak shopping period. The IT team receives immediate, detailed technical alerts, while management gets briefed on potential revenue impact and recovery timelines. Customers are informed via the website and social media about the issue and expected resolution time.

When AWS servers went down in 2017, causing widespread outages, Netflix quickly communicated to customers via social media that the issues stemmed from the hosting provider and that no user data or accounts were compromised. These clear, honest updates helped retain subscribers and reinforced the company’s commitment to transparency.

Best Practices

Effective major incident communications rest on three pillars: clarity, consistency, and accuracy. Your communications should be clear and jargon-free so that all parties can understand them. Consistency in your updates helps in managing expectations and maintaining a sense of control. Accuracy is critical; misinformation can have far-reaching consequences, damaging trust and credibility.

Always cross-check your information before it goes out, and try to prepare communication templates for likely scenarios ahead of time so that you are not writing them in the heat of the moment. Remember, in these situations, how you say it is as important as what you say. For example, the template below could be used to open internal email communications to the executive team, and variations could be prewritten to tailor the message for different systems or other stakeholder groups.

P1 Incident Email Template

Subject: URGENT: [Incident Name] – Major Disruption ([Service/System Affected])

Priority: P1

To: [List of Recipients]

Ticket ID: [Ticket reference to track the issue]

Incident Summary

At [Date] [Time] a P1 Incident was detected impacting [service/system name(s)]. This incident is classified as P1 due to its critical nature and potential for significant business disruption.

Current Status:

  • [Briefly describe the current state of the incident, including symptoms and impact]
  • [Estimate the number of users/systems affected]
  • [State if any workarounds are available]

Resolution:

  • [Outline the initial steps taken to resolve the issue]
  • [Provide an estimated timeframe for resolution (if possible)]

Impact:

  • [Outline the potential business impact of the incident, including lost revenue, productivity, or reputational damage]
  • [Advise of any actions users can take to mitigate the impact]
  • [Inform which teams are impacted]

Communication:

  • Additional notifications will follow every [30 minutes] regardless of status change until the issue is resolved

The Incident Response Team are working hard to restore normal service as quickly as possible and apologise for any inconvenience caused as a result of this Incident.

Sincerely,

[Incident Manager Name]
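If you keep templates like the one above in a script or runbook, the placeholders can be filled programmatically. The sketch below uses Python’s standard `string.Template`; the placeholder names (`incident_name`, `service`, `date`, `time`, `interval`) and the `render` helper are assumptions chosen for this example, and only two lines of the template are shown.

```python
from string import Template

# Two lines of the P1 template, with the square-bracket placeholders
# rewritten as $names (the names themselves are illustrative).
SUBJECT = Template("URGENT: $incident_name – Major Disruption ($service)")
SUMMARY = Template(
    "At $date $time a P1 Incident was detected impacting $service. "
    "Additional notifications will follow every $interval minutes."
)


def render(template: Template, **fields: str) -> str:
    """Fill a template; substitute() raises KeyError if a field is missing."""
    return template.substitute(**fields)


fields = {
    "incident_name": "Checkout outage",
    "service": "Online store",
    "date": "01/01/2024",
    "time": "09:00",
    "interval": "30",
}
print(render(SUBJECT, **fields))
print(render(SUMMARY, **fields))
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing field stops the message from being produced, so an unfilled placeholder is less likely to reach the executive team.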

Advanced Communication Methods

As technology evolves, so too do the communication methods available for managing IT incidents. While traditional channels like email and phone calls remain important, organisations can leverage advanced tools to further enhance their communication efforts. Incident response platforms provide a centralised hub for tracking progress, sending automated alerts, and collaborating with stakeholders in real-time. Chatbots can be deployed to answer common customer questions and provide basic support, freeing up IT staff to focus on critical tasks. Exploring these innovative methods can significantly improve the efficiency and effectiveness of communication during major incidents.

Conclusion

Effective communication during major IT incidents goes beyond merely sharing information; it’s about crafting the right message, for the right audience, at the right time. Excelling in this art is essential for any IT professional looking to make a mark in ITSM.

Stay connected with Lean Tree as we continue to provide you with practical guidance, industry knowledge, and expertise to make the most of your ITSM endeavours. If you have specific themes or topics you’d like to explore further in subsequent blog posts or would like to discuss how we can support your technology transformation, please feel free to get in touch!
