What to avoid when you start DevOps

While there are many ways to do DevOps correctly, there are specific cardinal sins that will put you afoul of the Church of DevOps. To achieve excellence, executives need to avoid committing the sins discussed below.

DevOps sin 1: You treat DevOps as a title, not a philosophy

In speaking to directors of engineering at numerous companies, I have heard the phrase: ‘if you have DevOps in your title, you’re doing it wrong’. The point of this statement is that DevOps is a philosophy, not a title. You shouldn’t assume that you can simply put the word ‘DevOps’ in someone’s title and come anywhere near building a DevOps-focused enterprise.

As Matt Juszczak of Bitilancer writes:

Calling yourself or somebody else a “DevOps Engineer,” a “DevOps System Administrator” or a “DevOps Tester” reflects a fundamental misunderstanding about the topic. This confusion is contributing to a lot of project/program frustrations and failures that are hurting teams and companies, sidetracking careers and creating backlash for recruiters.

Instead, DevOps needs to be understood as a methodology that takes software development and shifts it “left”. More concisely,

It is the practice of operations and development engineers participating together in the entire service lifecycle, from design through the development process to production support.

DevOps sin 2: You don’t have buy-in from everyone from the CIO on down

Success is ultimately about transforming the way the business thinks about software and its importance to commercial success. DevOps needs to be seen as fundamentally a business change, not a technology change. While DevOps is often associated with new tools and practices, the real change is the new way of working that aligns technology with business strategy.

Teams can have bits and pieces of DevOps philosophy sprinkled through a project’s implementation, but for the organization as a whole to benefit from DevOps, it requires support from the top. This means that, for success, your CIO needs to be a DevOps champion: someone who buys in, supports and in some cases is able to lead this effort.

DevOps sin 3: You don’t focus on metrics

In a previous blog article, I quoted Peter Drucker who famously said “If you can’t measure it you can’t improve it.” With that in mind, it is important to measure every step of the DevOps lifecycle in order to ensure that you are, for example, actually improving your release numbers, decreasing MTTR or minimizing the change failure rate. If you are not measuring these numbers then you have no idea if you are doing awesome or if you are a work in progress.

AppDynamics wrote:

The right metrics are essential to making sure that your DevOps transformation is successful. It’s important, though, to go beyond technology metrics. Metrics such as Mean Time To Resolution (MTTR) or Mean Time Between Failures (MTBF) are important but you should also focus on process and people metrics. Things like monthly or daily active users, measuring the development-to-deployment lead time are also important metrics to consider when measuring your current effectiveness.
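To make these measurements concrete, here is a minimal sketch of how MTTR, change failure rate and development-to-deployment lead time might be computed from deployment and incident records. The `Deployment` and `Incident` record shapes, their field names and the sample data are assumptions made for illustration; in practice these numbers would come from your own release and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    # Hypothetical record shape; adapt to whatever your release tooling exports.
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when the change reached production
    failed: bool            # True if the release caused an incident or rollback


@dataclass
class Incident:
    # Hypothetical record shape for an incident.
    detected_at: datetime
    resolved_at: datetime


def mean_time_to_resolution(incidents: List[Incident]) -> Optional[timedelta]:
    """Average time from detection to resolution, or None with no incidents."""
    if not incidents:
        return None
    total = sum((i.resolved_at - i.detected_at for i in incidents), timedelta())
    return total / len(incidents)


def change_failure_rate(deployments: List[Deployment]) -> Optional[float]:
    """Fraction of deployments that failed, or None with no deployments."""
    if not deployments:
        return None
    return sum(d.failed for d in deployments) / len(deployments)


def mean_lead_time(deployments: List[Deployment]) -> Optional[timedelta]:
    """Average time from commit to production, or None with no deployments."""
    if not deployments:
        return None
    total = sum((d.deployed_at - d.committed_at for d in deployments), timedelta())
    return total / len(deployments)


# Made-up sample data, for illustration only.
deployments = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15), failed=False),
    Deployment(datetime(2024, 1, 2, 10), datetime(2024, 1, 3, 10), failed=True),
    Deployment(datetime(2024, 1, 4, 8), datetime(2024, 1, 4, 14), failed=False),
    Deployment(datetime(2024, 1, 5, 9), datetime(2024, 1, 5, 21), failed=False),
]
incidents = [
    Incident(datetime(2024, 1, 3, 10, 30), datetime(2024, 1, 3, 11, 30)),
    Incident(datetime(2024, 1, 6, 2, 0), datetime(2024, 1, 6, 2, 30)),
]

print("MTTR:", mean_time_to_resolution(incidents))                     # MTTR: 0:45:00
print("Change failure rate:", f"{change_failure_rate(deployments):.0%}")  # Change failure rate: 25%
print("Mean lead time:", mean_lead_time(deployments))                  # Mean lead time: 12:00:00
```

Tracking these values release over release, rather than as a one-off snapshot, is what tells you whether the numbers are actually improving.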

DevOps sin 4: Focusing on DevOps as a race to acquire more tools

Just as DevOps cannot be seen as a title in an org chart, it also cannot be thought of simply as a matter of tools. In the DevOps world there’s been an explosion of tools in release (Jenkins, Travis, TeamCity), configuration management (Puppet, Chef, Ansible, CFEngine), orchestration (ZooKeeper, Noah, Mesos), monitoring, virtualization and containerization (AWS, OpenStack, Vagrant, Docker) and many more. DevOps engineers are famous for their love of new tools but at some point engineers need to focus specifically on achieving goals.

At the very least, a new DevOps tool should follow the Hippocratic oath and should do no harm to any team. And a true solution will appeal to Dev, Ops and Security. As one engineer writes,

If daily routines will be impacted by the new tool, getting early buy-in from affected teams is key. Otherwise it will be extremely hard to get the other team to adopt the solution and it will never realize its full potential.

DevOps is about breaking down silos and barriers so employees can get work done more quickly. That means having management buy-in, not just buying more tools.

DevOps sin 5: You think failure is unacceptable

Companies might be automating correctly and have management buy-in, but the development-operations team gets it wrong when it doesn’t embrace failure. Netflix, for example, actually tries to anticipate failure so it is ready when scenarios such as downed servers or non-functioning code do show up.

But on the philosophical end, management needs to realize that failure is part of the practice of creating and releasing code. Rather than having painful post mortems that focus on finger pointing, teams need to focus on constructive, blameless post mortems that look at understanding the issues and how they can be avoided in the future. Ideally, a failed release is met with:

“new tests .. built around your mistake so that it’s caught next time and everyone acts like it’s simply another day. This is when you know your company has adopted an important devops philosophy.”
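As a small illustration of what a test “built around your mistake” can look like, here is a sketch of a regression test. The scenario is hypothetical: suppose a release failed because a blank configuration value crashed a parser. The function `parse_retry_count`, the constant `DEFAULT_RETRIES` and the failure itself are all invented for this example.

```python
DEFAULT_RETRIES = 3  # hypothetical default used when no value is configured


def parse_retry_count(raw):
    """Parse a retry count from a raw config string.

    Hypothetical fix from the post mortem: blank or missing values fall
    back to the default instead of raising during startup.
    """
    if raw is None or not raw.strip():
        return DEFAULT_RETRIES
    value = int(raw)
    if value < 0:
        raise ValueError("retry count must be non-negative")
    return value


# Regression tests written after the (hypothetical) failed release.
# They pin down the exact inputs that caused the outage.
def test_blank_value_falls_back_to_default():
    assert parse_retry_count("") == DEFAULT_RETRIES
    assert parse_retry_count("   ") == DEFAULT_RETRIES


def test_missing_value_falls_back_to_default():
    assert parse_retry_count(None) == DEFAULT_RETRIES


def test_valid_value_is_still_parsed():
    assert parse_retry_count("5") == 5


def test_negative_value_is_rejected():
    try:
        parse_retry_count("-1")
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative retry count")
```

Functions named `test_...` like these can be collected by a test runner such as pytest, or simply called directly. The point is that the mistake becomes a permanent check in the pipeline rather than a reason for blame.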

DevOps sin 6: You maintain the divide between Devs and Ops

Gene Kim notes that effective DevOps “emphasizes the performance of the entire system, as opposed to the performance of a specific silo of work or department.”

As has been written in many articles describing the problems of DevOps, Dev and Ops cannot sit in silos that don’t speak to one another. Devs cannot create code and throw it over the wall when they are finished and expect Ops to deploy it. That just doesn’t work. Instead, Devs and Ops need to work as one team.

Often, this comes down to having both Devs and Ops on-call. If Devs see that their code is causing numerous problems and is waking them up in the middle of the night, they will be more conscientious about writing and testing their code. Similarly, if Ops sees the pressures Devs are under, they will be more sympathetic to their release cycle.

DevOps sin 7: Not using critical alerting tools

Relying on ineffective critical alerting tools to notify engineers of critical incidents will serve to magnify many other sins committed. If critical alerting tools are not part of the fundamental philosophy of a development-operations team, then:

  • the team’s ability to focus on metrics is diminished. If you don’t know when an incident occurs, for example, then how can you decrease MTTR?
  • effective post mortems will be much more difficult. Critical alerting tools like OnPage allow teams to see where the alerting might have gone wrong or if incorrect information was delivered in an alert.
  • the divide between Devs and Ops will continue. By putting both teams on ‘on-call’ duty, each can see what alerts were created. By having a clear vision of what is creating alerts, there will inevitably be empathy built on both sides.

Critical alerting tools are key to decreasing down time, maintaining customer satisfaction and resolving issues quickly. It is indeed quite sinful to ignore these points and continue with tools that don’t effectively alert your on-call team.

Conclusion

Committing any of these sins is not a mark that a company is irredeemable. Instead, recognizing the fault is probably the first step toward correcting it. The best advice is to work at unraveling one sin at a time.

Shawn Lazarus
