Bringing Dev and Ops together with on-call groups
Make Dev and Ops better together by building empathy with on-call groups
Much has been written on the tension that often exists between Dev and Ops teams in an organization. All too frequently, Devs are focused on rapid prototyping and creating code while Ops are focused on keeping the ship stable and making as few changes as possible. When I was at the DevOps Boston Conference last week, much of the “hallway conference” was devoted to conversations on how to build empathy between these frenemies and put them in less opposition to one another. How can Dev and Ops become less siloed? How can management encourage cross-pollination? One important realization was that to create empathy between these two groups and ensure an effective group dynamic, the teams need to spend more time walking in one another’s shoes.
One significant step towards growing empathy between Devs and Ops is creating strong cross-functional on-call teams. In this scenario, Dev engineers would sit on Ops teams’ on-call groups and vice versa. Security groups are considered a great starting point for this type of team because of the role security plays in the overall development and deployment of a SaaS operation. Devs often don’t have much experience with security, so it is usually handled by Ops, who have a good eye for and a strong understanding of the issues. Yet Devs are often interested in security, so it becomes fertile ground for creating appreciation of each other’s expertise.
To be clear, there are also many areas where Ops could learn from Dev teams, such as the need to iterate on software quickly in order to advance the product. No matter the scenario, the creation of strong on-call groups becomes a great starting point for introducing empathy.
The problem, though, is that most of the major critical alerting platforms do not allow an engineer to belong to multiple schedules. The logic built into these SaaS implementations dictates that if you are part of one schedule, you cannot be alerted in a second schedule. Schedules are treated as mutually exclusive. In theory, having schedules act like this prevents individuals from being called for two separate events at the same time. In reality, this setup means Devs and Ops cannot be put into multiple on-call groups.
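The exclusivity rule described above can be illustrated with a minimal sketch. This is a hypothetical model, not any vendor's actual API: the `ExclusiveScheduler` class, its `assign` method, and the engineer and schedule names are all invented for illustration.

```python
class ExclusiveScheduler:
    """Hypothetical model of schedules that are treated as mutually exclusive."""

    def __init__(self):
        # Maps each engineer to the single schedule they belong to.
        self.schedule_of = {}

    def assign(self, engineer, schedule):
        """Put an engineer on a schedule; refuse if they are already on another one."""
        current = self.schedule_of.get(engineer)
        if current is not None and current != schedule:
            raise ValueError(
                f"{engineer} is already on {current!r}; cannot join {schedule!r}"
            )
        self.schedule_of[engineer] = schedule


scheduler = ExclusiveScheduler()
scheduler.assign("dana", "dev-rotation")

# Trying to seed the same developer into the Ops rotation is rejected.
try:
    scheduler.assign("dana", "ops-rotation")
    joined_both = True
except ValueError:
    joined_both = False
```

Under this assumed rule, a developer can never be seeded into an Ops rotation without first leaving their own.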
On-call schedules vs. On-call groups
OnPage, however, uses groups rather than schedules. Groups DO NOT have time limitations and constraints. Schedules, by contrast, are bound not to conflict with other schedules, and that logic prevents Devs from sitting in Ops rotations while also sitting in their own rotation. The cross-pollination discussed above is not allowed. Groups instead DO allow an engineer to sit on multiple on-call teams. The engineer can be on an Ops team for one issue and a Dev team for another. If an individual happens to be called by both groups at the same time, the engineer can always escalate the request to another member of the group or use multi-tiered escalation policies that allow for a second tier of on-call engineers. By having a designated secondary rotation, cross-functional DevOps engineers can continue to work outside of their silos while maintaining SLAs and response rates. These first-tier teams can continue to support multiple groups in an escalation policy.
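The combination of overlapping group membership and tiered escalation can be sketched roughly as follows. This is an illustrative model only and not OnPage's API: `Group`, `page`, the `busy` set, and the engineer names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical on-call group with ordered escalation tiers."""
    name: str
    tiers: list = field(default_factory=list)  # each tier is a list of engineers


def page(group, busy):
    """Return the first engineer in the group who is not already handling an alert.

    Tiers are tried in order, so the second tier is reached only when
    everyone in the first tier is busy. Returns None if nobody is free.
    """
    for tier in group.tiers:
        for engineer in tier:
            if engineer not in busy:
                busy.add(engineer)
                return engineer
    return None


# "dana" is a developer who sits on both the Dev and the Ops group.
dev = Group("dev", tiers=[["dana"], ["lee"]])
ops = Group("ops", tiers=[["dana"], ["sam"]])

busy = set()
first = page(dev, busy)   # dana takes the Dev alert
second = page(ops, busy)  # dana is busy, so the Ops alert escalates to sam
```

The point of the sketch is that shared membership is allowed up front and the conflict is resolved at alert time by escalation, rather than being forbidden when the teams are set up.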
Group schedules – an effective management tool
At the DevOps conference last week, it seemed like management often got blamed for not doing enough to creatively bring Devs and Ops together. However, by seeding the on-call teams, management would inevitably:
- Align incentives. Devs will not want to ship software that has performance or stability issues that will alert them to an “on-call” situation or (worse) wake them up in the middle of the night when things break.
- Per Gene Kim, do nothing without measuring. If you seed teams, each side will have a better view of how their actions impact the metrics of success. If Devs see that their code causes numerous on-call incidents and hurts Ops’ uptime metrics, they will change their code in response. Uptime metrics would improve.
- Build a better understanding of the development-to-release process. By creating on-call groups, both Devs and Ops will have a much stronger understanding of the questions faced by each side. Through this understanding, empathy would be created and a better team dynamic would result.
Leave it to OnPage
If you are a manager, creating on-call groups that have both Dev and Ops engineers on them is a very strong first step to getting your team to work better together. Furthermore, bringing them together on the same team is really at the core of what DevOps is all about. DevOps is not meant as just a fancy portmanteau to describe an operations team. DevOps is a mantra about how programming teams and IT teams can work better together. By using on-call groups from OnPage, you will create empathy, understanding and ultimately better code and a better product. Get started today.