How to modernize SLAs and ITOps

 “Tell me how you will measure me and I will tell you how I will behave.”

Dr. Eliyahu Goldratt – The Goal

Digital transformation will come from increased investment in technology. At the same time, it will also come from increased scrutiny of, and commitment to, the quality of existing services. To ensure this outcome, companies must commit to the quality standards spelled out through common acronyms such as SLA, MTTR and MTTD (mean time to detect). While the underlying ideas have been around since the time of quality gurus such as W. Edwards Deming, they are as critical as ever to ensuring technology meets its goals and is set up correctly for the future. SLAs and ITOps can inform each other and bring about mutual modernization.

This post looks at SLAs and ITOps and their role in maintaining and modernizing IT. It also looks at how to use SLAs to produce extraordinary teams.

Why do we need SLAs?

Service level agreements (SLAs) define a contract between a service provider (either internal or external) and the end user indicating the level of service expected from the service provider. SLAs are output-based in that their purpose is specifically to define what the customer will receive. SLAs do not define how the service itself is provided or delivered.

For SLAs to be effective, the level of service should be specific and measurable, as this allows the quality of service to be benchmarked and rewarded or penalized accordingly. Service is often measured through metrics such as mean time between failures (MTBF) or mean time to recovery, response, or resolution (MTTR). Through these measurements, SLAs become a lodestar: they give managers metrics for important quality achievements, such as how long it takes until outages are addressed and how long until they are resolved.
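
As a rough illustration of how these two metrics can be computed, here is a minimal Python sketch. The incident log, the 30-day reporting period and the function names are all hypothetical, and MTTR is read here as mean time to recovery (outage start to service restored). Teams that use the response or resolution reading would substitute different timestamps.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (outage start, service restored) pairs.
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 30)),
    (datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 15, 0)),
    (datetime(2024, 1, 20, 2, 0), datetime(2024, 1, 20, 2, 30)),
]


def total_downtime(incidents):
    """Sum of all outage durations."""
    return sum((end - start for start, end in incidents), timedelta())


def mean_time_to_recovery(incidents):
    """Average outage duration: total downtime / number of incidents."""
    return total_downtime(incidents) / len(incidents)


def mean_time_between_failures(incidents, period_start, period_end):
    """Average uptime per failure: (period length - downtime) / number of incidents."""
    uptime = (period_end - period_start) - total_downtime(incidents)
    return uptime / len(incidents)


# Assumed reporting period: 30 days starting 1 January 2024.
mttr = mean_time_to_recovery(incidents)
mtbf = mean_time_between_failures(
    incidents, datetime(2024, 1, 1), datetime(2024, 1, 31)
)

print(mttr)  # 0:40:00
print(mtbf)  # 9 days, 23:20:00
```

Three outages totalling two hours over 30 days give an MTTR of 40 minutes and an MTBF of just under ten days. Both figures can be written into an SLA as concrete targets.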

But how do these metrics translate into actual work? One need look no further than Netflix’s Chaos Monkey. Chaos Monkey offers a fascinating exercise in seeing how well teams can maintain service uptime during deliberately induced incidents of downed virtual machines. By maintaining exacting standards for how long it can take until service is repaired and uptime is restored, Netflix has in part become the poster child for effective DevOps. It has also become an example of how SLAs can be used effectively to improve overall business quality.

Netflix demonstrates why SLA management is critical: it ensures that SLAs are kept up to date and that the agreed performance standards for service levels are not breached. In essence, SLAs are how companies turn exacting standards of performance and excellence into specific requirements that enable engineers and technicians to meet those standards.

SLAs blaze the path forward

If the end-user experience is what matters most, then the SLAs that companies create must be centered on achieving a high-quality experience. The best way to achieve this is for managers to encourage ITOps teams to base SLA standards on the end-user experience. Once this is achieved, the SLAs need to be monitored closely through SLA management to check whether the service meets the agreed service level targets. To move from concept to reality, there must be a practical understanding of how to achieve the standards introduced by SLAs.
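
One common way to check a service against its target is to convert an availability percentage into a downtime allowance for the reporting period. The sketch below assumes a 30-day month and a 99.9% target. Both numbers and all function names are illustrative and are not taken from any particular SLA.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed reporting period)


def availability_percent(total_minutes, downtime_minutes):
    """Share of the period during which the service was up, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def allowed_downtime_minutes(total_minutes, target_percent):
    """Downtime the service can accumulate before breaching the target."""
    return total_minutes * (100.0 - target_percent) / 100.0


def meets_target(total_minutes, downtime_minutes, target_percent):
    """True if measured availability is at or above the agreed target."""
    return availability_percent(total_minutes, downtime_minutes) >= target_percent


# Hypothetical 99.9% target: roughly 43 minutes of downtime allowed per month.
budget = allowed_downtime_minutes(MINUTES_PER_30_DAY_MONTH, 99.9)
print(round(budget, 1))  # 43.2

print(meets_target(MINUTES_PER_30_DAY_MONTH, 30, 99.9))   # True
print(meets_target(MINUTES_PER_30_DAY_MONTH, 120, 99.9))  # False
```

Framing the target as a budget of minutes makes it easier for a team to see how much room is left in the current period.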

SLAs set service commitments for measures such as:

  • How long until an engineer recognizes and responds to a downtime incident
  • How long until an engineer begins repair of an issue
  • How long until an issue is resolved
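
The three measures in this list can be derived from incident timestamps and compared with agreed limits. The following Python sketch is one possible shape for that check. The `Incident` fields, the target values and the function names are hypothetical, and a real team would take the timestamps from its alerting or ticketing system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    detected: datetime        # when the downtime alert was raised
    acknowledged: datetime    # when an engineer responded
    repair_started: datetime  # when repair work began
    resolved: datetime        # when the issue was resolved


# Hypothetical SLA limits, each measured from the moment of detection.
TARGETS = {
    "time_to_acknowledge": timedelta(minutes=5),
    "time_to_repair_start": timedelta(minutes=15),
    "time_to_resolve": timedelta(hours=1),
}


def measure(incident):
    """Return the three durations for one incident."""
    return {
        "time_to_acknowledge": incident.acknowledged - incident.detected,
        "time_to_repair_start": incident.repair_started - incident.detected,
        "time_to_resolve": incident.resolved - incident.detected,
    }


def breaches(incident, targets=TARGETS):
    """Names of the measures that exceeded their agreed limit."""
    measured = measure(incident)
    return [name for name, limit in targets.items() if measured[name] > limit]


incident = Incident(
    detected=datetime(2024, 1, 3, 10, 0),
    acknowledged=datetime(2024, 1, 3, 10, 3),
    repair_started=datetime(2024, 1, 3, 10, 20),
    resolved=datetime(2024, 1, 3, 10, 50),
)
print(breaches(incident))  # ['time_to_repair_start']
```

In this example the engineer acknowledged the alert and resolved the issue in time, but repair began 20 minutes after detection, which is over the assumed 15-minute limit.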

These analyses provide valuable input for the creation of a Service Improvement Plan (SIP), a critical part of SLA management. A SIP is a formal plan to implement improvements. It consists of several elements such as:

  • What is the process or service that needs to be monitored
  • Who is the person in charge of the service (service owner)
  • What are we measuring in this instance, e.g. server uptime
  • What tools and techniques should we use to achieve our SLA
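
For teams that keep their plans in code or configuration, the four elements above map naturally onto a small record. The sketch below is only an illustration: the class, its field names and the sample values are hypothetical and are not part of any formal SIP standard.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ServiceImprovementPlan:
    service: str = ""  # process or service that needs to be monitored
    owner: str = ""    # person in charge of the service (service owner)
    metric: str = ""   # what is being measured, e.g. server uptime
    tools: list = field(default_factory=list)  # tools and techniques used to meet the SLA


def missing_elements(plan):
    """Names of the plan elements that have not been filled in yet."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]


# Hypothetical plan that still lacks its tooling decision.
plan = ServiceImprovementPlan(
    service="checkout API",
    owner="service owner's name",
    metric="server uptime",
)
print(missing_elements(plan))  # ['tools']
```

A check like `missing_elements` gives a quick way to spot a plan that names a service and a metric but has not yet said how the target will be met.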

IT service providers create their products to be used by customers. A SIP is key to making the SLA an effective management tool that addresses specific services and needs.

Conclusion

SLA management has become an integral part of effective IT management and growth. Effective SLAs require that services are monitored, reviewed and analyzed regularly to find improvement points and steps. Your team has the ability to achieve these results. First, though, your team must acquire the right plan and the right mindset.

To learn more about how to make SLAs and ITOps an integral part of your team’s success and what tools can help you achieve this outcome, download our whitepaper: Are you SLAcking?

OnPage Corporation
