Why proper incident management is key to proper IT management

Companies across the IT industry incur major costs from downtime. This image is courtesy of evolven.com

Proper IT management requires proper incident management. Otherwise, you court Murphy’s law at your peril. In the IT world, if a server can fail, a cache can overflow, or traffic can overload the network, it will. And the consequences are significant.

Many IT organizations face database, hardware, and software downtime, ranging from brief outages to failures that shut down the business for days. According to a January 2016 article in Network Computing on the high price of IT downtime, organizations face:

“an average of five downtime events each month, with each downtime event being expensive indeed: from $1 million a year for a typical midsize company to more than $60 million for a large enterprise.”

The leading cause of this downtime is equipment failure, which accounts for almost 40% of downtime. The second most frequent cause is human error, which accounts for 25%. Cybersecurity incidents account for only about 10%. Yet in each of these cases, traditional workflows use emails to alert those in charge of downed networks. The use of email alerts assumes, falsely, that an email will get the attention of a data center manager. But data center managers face hundreds of other emails per day. An email alert does not break through that noise and get noticed.

Best practices for effective incident management during downtime

While effective use of network monitoring tools is required to minimize the impact of downtime, relying on email for incident response assumes that the person responding is sitting at their computer or hovering over their iPhone. And what happens when the servers go down at 3 a.m.? One hopes even the most devoted of employees is asleep at that hour.

Furthermore, traditional pagers are inadequate because they go off once and then go silent. Pagers, whether used as an alternative to email or in addition to it, don’t always escalate, and they don’t persistently demand the attention of the necessary individual. Instead, you need data security control tools coupled with proper incident management applications. This means that when incidents do occur, the appropriate individuals are alerted and the alerts don’t stop until the requisite action happens.
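To make the idea of persistent, escalating alerts concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the `EscalationPolicy` class, the `run_alert` function, and the contact names are all hypothetical, and a real system would wait between attempts and deliver alerts over a real channel.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class EscalationPolicy:
    """Hypothetical policy: who to alert, in order, and how persistently."""
    contacts: List[str]            # ordered on-call contacts (illustrative names)
    repeats_per_contact: int = 3   # re-alerts per contact before escalating


def run_alert(
    policy: EscalationPolicy,
    is_acknowledged: Callable[[str, int], bool],
    notify: Callable[[str], None],
) -> Tuple[Optional[str], int]:
    """Alert contacts in order, repeating until someone acknowledges.

    Returns the contact who acknowledged (or None if nobody did) and the
    total number of notifications sent. A real system would pause between
    attempts; the pause is omitted so the sketch runs instantly.
    """
    sent = 0
    for contact in policy.contacts:
        for _ in range(policy.repeats_per_contact):
            notify(contact)
            sent += 1
            if is_acknowledged(contact, sent):
                return contact, sent
        # No acknowledgement from this contact: escalate to the next one.
    return None, sent


if __name__ == "__main__":
    log: List[str] = []

    # Simulated 3 a.m. outage: the primary contact sleeps through every
    # alert, and the secondary responds on the fifth notification overall.
    def simulated_ack(contact: str, sent: int) -> bool:
        return contact == "secondary-on-call" and sent >= 5

    policy = EscalationPolicy(
        contacts=["primary-on-call", "secondary-on-call", "it-manager"]
    )
    who, total = run_alert(policy, simulated_ack, log.append)
    print(who, total, log)
```

Unlike a pager that goes off once, the loop keeps notifying the same person and then moves up the chain, so an alert ends only with an acknowledgement or once the contact list is exhausted.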

Impact of having solutions in place

Mitigating downtime requires good workflows, human response and, most importantly, proper alarms to alert relevant individuals when things go wrong. Proper incident notification is crucial to effective management of IT downtime. The benefit goes beyond cost savings, because reputation is also at stake. If a company frequently experiences downtime in its IT infrastructure, it risks a reputation for unreliability. When a company has a bad reputation, business is more difficult and costly to conduct. Much of the writing on customer service notes that it is more difficult to retain customers and important stakeholders when a company’s reputation is damaged. This, in turn, makes the costs of doing business significantly higher.

Conclusion

Most important of all, while you cannot avoid every incident, you can ensure proper incident management. When trouble rears its ugly head and things go south, heads of IT need alerts that rise above the clutter.

Want to learn more about how alerts helped one IT team drive down response time? Download our whitepaper.


Published by

OnPage Corporation

Tags: IT management, IT Support

9 years ago
