Why you’ve got IT Monitoring and IT Alerting All Wrong

it alerting

IT alerting and IT monitoring are not what they used to be. In years past, software releases were scheduled a few times per year. Often, one monitoring tool would review the infrastructure and would catch and spit out alerts. Sorry, but those days are gone. Nowadays, start-ups use containers and microservices, continuous integration and delivery. As such, monitoring can and needs to be at multiple points along the pipeline. If you are not taking the time to calibrate your systems to reduce the amount of noise and ensure effective alerting, then you’ve got monitoring and alerting all wrong. Don’t worry though. It’s not a death sentence – thankfully. There are clear methods for turning IT monitoring noise into actionable IT alerting. Come on feel the noise It’s not just a catchy line from Quiet Riot. ‘Come on feel the noise’ also encapsulates how many engineers in IT Ops experience monitoring. Because of […] Read more »

Fight Alert Fatigue


How to Win the Alert Fatigue Battle IT engineers and DevOps teams cannot help but experience alert fatigue when they receive after-hour alerts lacking context or relevance. Messages come in, for example, telling the engineer on-call that disk space is used up. Does this mean 60% used up or 100% used up? Or an after-hours message might come in alerting to a downed server. Which server? Did the back-up server come on-line as a result? The remedy then is to implement an IT alerting system that differentiates high priority alerts and allows for messaging with attachments. Lack of context can cause significant frustration among engineers as well as alert fatigue. Impact of Alert fatigue Companies shouldn’t downplay the impact of alert fatigue. There are also significant financial implications for companies if they have stressed out, unhappy, sleep deprived engineers. For example, engineers who are feeling the stress of alert fatigue are […] Read more »
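To make the "60% used up or 100% used up?" point concrete, here is a minimal sketch of an alert that carries its own context and gets a priority before anyone is woken up. The `DiskAlert` class, the host and mount names, and the 80%/95% thresholds are all illustrative assumptions, not part of any OnPage or monitoring-tool API.

```python
from dataclasses import dataclass


@dataclass
class DiskAlert:
    host: str            # which server raised the alert
    mount: str           # which filesystem is filling up
    percent_used: float  # the context missing from "disk space is used up"


def priority(alert: DiskAlert, warn_at: float = 80.0, page_at: float = 95.0) -> str:
    """Classify an alert; both thresholds are illustrative assumptions."""
    if alert.percent_used >= page_at:
        return "high"    # worth waking the on-call engineer
    if alert.percent_used >= warn_at:
        return "low"     # can wait for business hours
    return "ignore"      # not worth an alert at all


def render(alert: DiskAlert) -> str:
    """Build a message that answers 'which server?' and 'how full?'."""
    return (f"[{priority(alert).upper()}] {alert.host}:{alert.mount} "
            f"is {alert.percent_used:.0f}% full")
```

With this sketch, a disk at 60% never pages anyone, while a disk at 97% produces a message that names the server, the filesystem and the exact usage.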

7 Ways DevOps Can Avoid Alert Fatigue

7 ways to avoid alert fatigue

Being on-call doesn’t have to mean you’re always tired The introduction of monitoring into the DevOps world means alerts will occur 24/7. As such, there will be alert fatigue in DevOps. Monitoring needs alerts in order to be effective but the issue is that while our technology is 24/7, humans cannot work in a similar fashion. Clearly, 24/7 alerts need to be better calibrated with human physiological realities in order to avoid alert fatigue. The remedy then is to implement an IT alerting system that differentiates high priority alerts and allows for messaging with attachments. Alert fatigue in DevOps The traditional setup of IT and DevOps is such that email is the main form of relating issues such as deployment problems or server problems. If software fails to deploy correctly, an email goes to a designated engineer. Similarly, if a server experiences a power surge, an email is sent. Monitoring […] Read more »

Differentiate your MSP from the Competition

How to differentiate your MSP

The habit of highly effective MSPs At OnPage, we spend a lot of time speaking to MSPs and learning about the growth in IT services. There is an increasing realization by businesses that having the right firewall, endpoint security and monitoring tools is not a job that small businesses can or should attempt to accomplish on their own.  These small and medium businesses need the advice and counsel of an experienced technology provider. Important to note as well though is that as the potential for profit increases, so will the potential for competition. Where there is profit, there is competition. And if all MSPs are providing a similar package of goods, how can an MSP differentiate themselves? How to differentiate your MSP OnPage recently visited IT Nation in Orlando. It was a great conference that allowed us the chance to talk to a number of clients as well as the chance […] Read more »

7 steps to creating an actionable IT on-call schedule

How to make sure IT on-call works for you I spent a bit of time on Reddit the other day and thought it interesting just how many posts were focused on IT on-call and on-call scheduling. Some posts were rants on horrible customers – who hasn’t had some of those? Some actually wrote about positive interactions from being on-call – those were rare posts. But many engineers in DevOps and IT posted on their trepidation about being on-call. They wondered: What is the best way for my team to create an IT on-call schedule? How do I ensure I wake up if I am alerted? Should my growing on-call team use an on-call cell phone and hand it off between rotations? How do I manage being on-call and then having to show up at 8 a.m. the next morning? Is it reasonable to expect on-call duty 24/7? The answers to […] Read more »

OnPage Shows Off its ConnectWise integration at IT Nation 2016

How does OnPage connect wisely with ConnectWise? A question many asked us at IT Nation 2016 in Orlando last week was ‘How do you connect with ConnectWise?’ The simple answer is, through a direct API. ConnectWise’s robust ticketing system sends a message when a ticket is created. OnPage turns that message into a loud and audible alert. Many MSPs use RMMs like LabTech to monitor multiple endpoints. When an issue arises such as a downed server or a potential malware infection or one of many other incidents, the RMM sends an alert to ConnectWise. ConnectWise in turn creates a ticket and based on the alerting profile set by the user – such as alert me when the coffee pot is empty or there is a serious client issue –  sends a callback to OnPage. OnPage integration enables your MSP to shine By integrating with OnPage, your ConnectWise system now enables automatic priority alerts based upon ticket […] Read more »

Cygnus Systems Decreases Costs and Improves Response Time

OnPage facilitates Cygnus Systems’ growth of 25% per year through efficiency, cost reduction and improved response to critical alerts By using OnPage’s cloud-based, virtual paging app in conjunction with ConnectWise, Michigan-based MSP Cygnus Systems Inc. has grown faster than Plastic Man. Through OnPage integration, Cygnus has been able to improve its response time to critical IT alerts, grow its profits by over 25% in the last year alone and provide consistent service to its customers. In writing our case study on Cygnus’ successful use of OnPage, we talked to the company’s operations manager Craig Isaacs to learn how our technology has enabled the company to manage over 1100 endpoints and grow significantly. According to Craig, OnPage provides him with the necessary confidence to assure clients that IT downtime will be managed and mitigated through immediate and persistent alerts. ConnectWise’s innate ticketing capabilities are triggered by an alert from LabTech. Upon receiving the alert, ConnectWise […] Read more »

OnPage Ensures Advanced Network Products’ Alerts Are Heard

Incident Management Gets a Voice By using OnPage’s cloud-based, virtual paging app in conjunction with ConnectWise, Pennsylvania-based MSP Advanced Network Products (ANP) has been able to dramatically improve its response time to critical IT alerts and better manage SLAs. ANP’s biggest pain point prior to OnPage was hand-offs of after-hours alerting by the NOC team. OnPage’s technology has provided the solution by providing alerting to its IT on-call policy. NOC team oversight means less burden for IT on-call engineers In writing our  case study on Advanced Network Products, we talked to the company’s Mike Silverman to learn how our technology has enabled the company to solve their pain point. According to Mike, prior to using OnPage, there was little oversight of the NOC team. As such, ConnectWise tickets that needed to be escalated to the on-call engineers weren’t. With OnPage, that oversight is enabled. Where email fails, critical alerting succeeds Previously, all alerts […] Read more »

A Tale of HIPAA Compliance

physician

How a physician’s request violated HIPAA Compliance Earlier this week we received an interesting request from one of our customers. The customer is a physician who faced the following issue in the OR: I am a physician who finds it inconvenient to constantly have to give the room circulator my pin number to see a page. Is there some workaround to this? HIPAA Compliant Smartphone We immediately realized that while this might be a pain point for this physician and many other doctors, we could not comply with the request as it would be a HIPAA violation. Hospitals and clinics comply with HIPAA by having all communication tools with patient information automatically lock after 15 minutes of inactivity. This requires users to log back in. It is understandable that logging back into the doctor’s smartphone after a period of inactivity can be an annoyance and even slow down workflow. However, […] Read more »

Critical Alerting Is the New Black

the new black

WHY CRITICAL ALERTING IS A MUST HAVE FOR INCIDENT RESPONSE TO BE EFFECTIVE At OnPage, we spend lots of time thinking about how to apprise people about the importance of critical IT alerting. So when our CEO Judit Sharon emailed me about an upcoming webinar from HIMSS entitled: Incident Response is the New Black – A Must Have For Your Security Strategy, I was intrigued. I wanted to know how the speaker would entertain alerting in incident response. As it turned out, while the webinar was fascinating, the discussion focused on response dictionaries and procedures and did not mention critical alerting at all. I was left wondering, why would an organization focus solely on incident response to save their IT? How can an organization have an effective incident response platform without a critical alerting platform? In the vein of fashion, wouldn’t the scenario be like having pants but forgetting to button […] Read more »

Why you need to fail to be great at DevOps

devops fail

Seven steps to failure and greatness The more I read and learn about how to succeed in DevOps the more I realize how important failure is to the process. You need to fail to be great at DevOps. Netflix, for example, even takes it a step further by introducing failure into their testing process. In our blog The Seven Deadly Sins of DevOps, we wrote about how you should not do DevOps. Interestingly enough though, failure is not a sin. In fact, failure is something you should strive for. This blog will give you a sense for how you can plan to fail strategically. Embrace DevOps and fail fast. Fail to be great How do you succeed by failing? It sounds like a contradiction. Simply put, it’s by building failure into the testing process. Think of it as ‘controlled failure’ whereby you think strategically about where the system is likely […] Read more »

Feel the burnout

Everyone on your team is feeling the pain.

Eleven practical ways for DevOps engineers to better manage their work environment At OnPage, we know how serious DevOps burnout is and have explored it in other formats such as our e-book and video. The seriousness of the issue is highlighted by the following components: Decreased employee happiness. Employees become less satisfied and content with their work. Decreased productivity. Because employees are fatigued, they are less productive. Frequent job shifts. Throughout the industry, it has become standard for engineers to switch jobs every 2 to 3 years in hopes of finding employment that won’t burn them out. How to recognize DevOps burnout How do you realize that you are suffering from burnout? It’s like the famous description of a frog in boiling water. The frog only knows he’s going to die when it’s too late. Similarly, the engineer only knows they are suffering burnout when they have either burnt bridges or broken friendships […] Read more »

Visit OnPage at the IT Nation Conference 2016 – Booth 202

IT Nation

IT Nation: The Conference Where MSPs Go To Geek Out OnPage is proud to be a Silver Sponsor and exhibitor at IT Nation 2016 in Orlando, Florida from November 9th-11th. Stop by booth 202 to get one of our very cool t-shirts and enter to win one of the many prizes we’re raffling off. Schedule a time to chat with us before or during the conference by signing up for a meeting. We’d love to speak with you. Why OnPage goes to IT Nation In its 12th year, IT Nation is the largest conference of its kind for MSPs and IT solution providers. This is OnPage’s second year attending IT Nation and we are coming back because last year we found it incredibly valuable to meet with so many technology providers and learn from so many experts. Not to mention, it’s great to leave the cold in Boston and arrive in […] Read more »

When tools are out of control you need critical alerting

critical alerting

Tools going Rogue – a story for Halloween We have all heard stories of DevOps woe. Some tales are sad. Some tales describe true misfortune. And some tales just leave you thinking, what the heck were the developers thinking? This story is a tale of the latter. This story will tell the tale of how some developers at a start-up in New England created code which was supposed to live and work on AWS EC2 servers. However, the developers never thought to test what they were spinning up or to put critical alerting in place for when things went wrong. And that is where our tale of woe begins. Tale number 1: Automation destroyed the world What the tool was supposed to do has long since been forgotten but the horror and nightmares it caused will not go away so soon. At the start-up I am referring to here, every protocol […] Read more »

Paging like it’s 1994: Unencrypted pagers create a hackers’ paradise

HACKERS PAGING

Hackers use a $20 Dongle to Hack Messages from Unencrypted Pagers We recently published a blog article that highlighted seven reasons healthcare should end its love affair with pagers. We should have been more inclusive in our recommendation and suggested all industries that use pagers give them the proverbial boot. An article that came out this week in The UK Register reiterated the problems befalling those who choose to remain loyal to pagers.  The article noted how by using pagers, organizations allow amazing amounts of information to be gathered on themselves, their patients, the company and their passwords. $20 spells the end of secure messaging The many industries that still use pagers today such as nuclear power plants, substations, power generation plants, chemical plants, defense contractors, and other industrial environments like semiconductor and commercial manufacturers, and heating, ventilation and air conditioning (HVAC) companies are putting themselves at risk. The only […] Read more »

OnPage releases new version for AppStore

app store

OnPage polishes Apple image with new release OnPage is pleased to announce the release of a new OnPage app. The new app runs on iOS8 and up and has many new features which customers have been clamoring for. These features are also designed to maximize user experience and facilitate the critical alerting process for healthcare and IT. New OnPage app has significant upgrades There are six major updates that users will find in the newest release: Users can now select multiple contacts at a time from the OnPage address book. Now, you can alert multiple contacts with one alert Users can forward attachments to another user. If you receive an attachment with an OnPage alert, you can forward the attachment text, image or voice attachment to another user Users can view the other OnPage IDs in a particular group Sort by Contacts or Groups. Rather than henpecking for a contact […] Read more »

The Seven Deadly Sins of DevOps

7 deadly sins of devops

What to avoid when you start DevOps While there are many ways to do DevOps correctly, there are specific cardinal sins that will run you afoul of the Church of DevOps. From lacking an incident management tool to handle critical alerts to treating DevOps as a job title, there are many ways for you to hurt your status as an A-class DevOps shop. In order to achieve excellence in DevOps, it is key for executives to avoid committing the cardinal sins of DevOps that are discussed below. DevOps sin 1: You treat DevOps as a title, not a philosophy In speaking to directors of engineering at numerous companies, I have heard the phrase: ‘if you have DevOps in your title, you’re doing it wrong’. The point of this statement is that DevOps is a philosophy, not a title. You shouldn’t assume that you can simply put the word ‘DevOps’ in someone’s title and get anywhere […] Read more »

Netflix earnings, DevOps and profitability

netflix

DevOps as the road to profitability Netflix released a great earnings report earlier this week. According to the Wall Street Journal’s page one article on October 18th, “Netflix Inc. blew through its forecast for the subscriber additions in the September quarter…sending its shares soaring 20% in after-hours trading. … The better-than-expected performance came mainly in international markets, where the company has completed a massive, near global expansion this year.” For anyone who reads the DevOps literature, this success doesn’t come as a surprise. Rapid testing and provisioning is the name of the game at Netflix. Puppet’s 2016 State of DevOps Report notes that Netflix is among the top DevOps performers: High [DevOps] performers deploy on demand, with Etsy deploying 80 times per day, and large companies like Amazon or Netflix deploying thousands of times per day. So how tightly are the components of profitability and DevOps excellence intertwined? Perhaps another […] Read more »

Serverless promises and the persistent need for critical alerting

critical alerting and serverless computing

Why serverless computing doesn’t end the need for security or alerts Serverless computing provides the advantage of taking away the problem of managing servers. For many small start-ups, this is a huge advantage as the cost of purchasing, maintaining and scaling servers is a real pain point. Serverless also holds forth the prospect of ending the need for Ops as we know it, ending the need for security worries and ending the need for being on-call. But, while this modern-day DevOps marvel known as serverless might seem like a panacea, serverless computing needs to come with a healthy dose of reality. The reality of serverless In an article I recently posted to DZone entitled How Smart Is Serverless, I question how smart it is to outsource your security concerns to a third party like AWS. As I note in the article, you cannot abstract security without facing some pretty scary consequences. Amichai […] Read more »

NoOps and the Need for Critical Alerting

critical alerting with serverless

NoOps eschews critical alerting at its own peril Many start-ups embrace serverless architectures such as AWS, believing they will be able to adopt NoOps. NoOps means no worries about servers as everything is in the cloud, and if there are no worries about servers then there is no need to worry about critical alerting. The reality is slightly different. No matter how minimized Ops becomes, there will always be a need for strong incident management applications. The emphasis will simply further push monitoring from an Ops-only role to an important role for everyone on the development team. What is NoOps and why is there so much criticism? NoOps defines an IT environment that is so automated and abstracted from the underlying infrastructure that there is no need for a dedicated team to manage Ops in-house. The two main drivers behind NoOps are increasing IT automation and cloud computing. Even among […] Read more »

Seven reasons why we need pager alternatives in healthcare


Pitch the pager or face the consequences While smartphones have become the manner of communication for most professionals, pagers are still used in many professions as the standard mode of communication. This choice for pagers comes with significant consequences and indeed many suggest the need for finding pager alternatives in healthcare. For example, pagers are not a secure method for communicating as pager messages are unencrypted. They leak information. Pager communications can also be hijacked. What is surprising is that while pagers have many significant reliability issues associated with them, many doctors and healthcare organizations continue to use them rather than seek a pager alternative. Stickiness While the overall popularity of pagers has been in decline over the past several years (see Google trends graph), pagers still maintain stickiness in the healthcare industry. In part the reason is historical, as pagers have been used to alert doctors since the 1950s and […] Read more »

OnPage Critical Alerting with Microsoft Outlook

monitoring systems

Let OnPage and Microsoft Office Monitor your Network While it is well known that the Microsoft Outlook client allows you to use rules to forward emails, you might not have ever realized that you can use these rules to enable critical alerting. In fact, critical alerting with Microsoft Outlook is easily enabled no matter what monitoring tools you use. Monitoring tools can send emails to your Ops team in case of an outage or security issue. Your Dev team might get an email based on an APM issue. However, rather than having the emails get buried under a mountain of ever-growing messages on the email client, you could have critical messages alerting for a ‘server crash’ or ‘slow program response time’ forwarded directly to your OnPage application. Critical alerting with Microsoft Outlook and monitoring tools In our previous blog on email integration with Outlook, we showed how based on various rules […] Read more »
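The logic behind such a forwarding rule is simple enough to sketch in a few lines. The snippet below is not Outlook's rule engine or an OnPage API; it only illustrates the decision a rule makes, using the two example phrases from the excerpt. The names `CRITICAL_PHRASES` and `should_forward` are hypothetical.

```python
# Phrases that mark a monitoring email as critical (taken from the excerpt above).
CRITICAL_PHRASES = ("server crash", "slow program response time")


def should_forward(subject: str) -> bool:
    """Return True when the subject contains any critical phrase (case-insensitive)."""
    lowered = subject.lower()
    return any(phrase in lowered for phrase in CRITICAL_PHRASES)
```

A subject such as "ALERT: Server crash on web-03" would be forwarded, while "Weekly uptime report" would stay in the inbox.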

OnPage Ensures Compatibility with Apple iOS10

ios10

OnPage functionality complemented by Apple’s iOS10 OnPage has ensured compatibility with Apple’s iOS10 operating system. Apple’s newest operating system has only been around since mid-September but the OnPage team has ensured there is no pause in compatibility. The iOS10 update provides functionality that many in the OnPage world will find helpful. Specifically, the new operating system allows users to add callback codes such as *67 before pasting a phone number into the keypad. Dialing *67 before a phone number deactivates caller identification, or caller ID, and the system shows “private number” instead of an incoming phone number. Practitioners who receive OnPage alerts and need to call back their patient or customer can now copy the number from the alert to the keypad, add *67 as a prefix and call back the patron without having to reveal their cell number. Additionally, iOS10 enables Universal Clipboard so you can copy content across Apple devices. This […] Read more »

OnPage Integration with Microsoft Outlook Email

send an alert from any device that sends email

How to play by the rules Email integration has been our bread and butter since we started the company in 2011. As such, we think it important to explain how the powerful, dependable logic inherent in Microsoft Office’s email platform allows you to easily set up a framework to get notified in OnPage in case of an alert. First rules first According to a 2015 report, Microsoft Office 365 products are used by 4 out of 5 Fortune 500 companies. That’s a huge number of companies using Outlook. So with every monitoring system on the planet having the ability to send email, your ability to send emails to your Outlook account is ensured. Assume you are an Ops engineer tasked with monitoring your company’s servers. You want to know when a server is experiencing an error. Do you want to sit around and wait and watch? Of course […] Read more »

Do you know Continuous Delivery?

blackbelt

Are you a Continuous Delivery blackbelt? OnPage is a cloud-based incident alerting and management platform that elevates notifications on your smartphone so they continue to alert until read. Incidents can be programmed to arrive to the person on-call and can be escalated if they are not attended to promptly. Schedule a demonstration today! Read more »

Constant Vigilance in Continuous Delivery – How to do DevOps right

Alastor Moody's mantra of  "constant vigilance" is applicable to DevOps.

The importance of monitoring and alerting in the continuous delivery cycle In the 2016 State of DevOps report, Puppet reported that the top DevOps shops like Amazon or Etsy deploy new software releases multiple times per day. The next tier of companies deploy on a weekly or monthly basis. What is the difference between High and Medium IT performers? More than just tools, it is mindset. In addition to building code, the top DevOps teams have adopted a mindset of “constant vigilance”, as Mad-Eye Moody said to Harry Potter.  Not only should teams always be building but they should also always be testing. Through testing they should remain constantly vigilant – attuned to their software and its performance. And equally important in this process should be incident alerting to let both Dev and Ops know when things go awry. A cautionary tale: What happens without constant vigilance While many DevOps […] Read more »

How well do you know ChatOps? Is your ChatOps knowledge world-class?

blackbelt

Are you a ChatOps blackbelt? OnPage is a cloud-based incident alerting and management platform that elevates notifications on your smartphone so they continue to alert until read. Incidents can be programmed to arrive to the person on-call and can be escalated if they are not attended to promptly. Schedule a demonstration today! Read more »

ChatOps – Secret to great incident management in DevOps teams of 20 or 2000

ChatOps

Chat your way to excellence DevOps is constantly trying to improve production through automation, collaboration and tools. ChatOps is often the paradigm which brings these tasks together into a single conversation. In ChatOps, “chat applications and tools for real-time communication and task execution [are distributed] among members of development and IT operations teams”. Yet oftentimes the proponents of ChatOps don’t pay sufficient attention to the incident management component of the operation, preferring instead to look at the bots and chat room tools. However, as James Fryman noted in his talk on ChatOps: Technology and Philosophy at Geekdom San Francisco, “the shared context [of chat rooms] allows everybody to see and collaborate around things that happen. This is super amazing [for] the incident management space.” Specifically, though, when high priority or critical alerts occur, notifications need to be used to broadcast the incident beyond the chat room and ensure the conversations […] Read more »

How well do you know DevOps? Is your DevOps knowledge world-class?

Are you a DevOps blackbelt? OnPage is a cloud-based incident alerting and management platform that elevates notifications on your smartphone so they continue to alert until read. Incidents can be programmed to arrive to the person on-call and can be escalated if they are not attended to promptly. Schedule a demonstration today! Read more »

OnPage ensures compatibility with Android 7.0

nougat - android

Maintaining its customer and technology focus OnPage, the leader in incident alerting and management for IT, healthcare and IOT, released the updated version of its application to ensure compatibility with Google’s Android 7.0 N operating system. Affectionately known as Nougat, Android’s newest operating system was only around for a few weeks before the OnPage update was made. Support engineer Alex Berkson indicated that creating the update was important for ensuring that OnPage users using OS N can receive critical notifications without interruptions. The OnPage update continues to support the capabilities the app has historically demonstrated such as: Secure alerting: All messaging is encrypted Persistent messaging for up to 8 hours: Also known as alert until read, alerts will continue for up to 8 hours before expiring Split Screen on Samsung devices Real time communication: Instantaneous messaging with individuals or team members Forwarding: Send messages to your team members to bring […] Read more »

The Secret to Making Your DevOps Team World Class


Continuous deployment is key to world class DevOps With their State of DevOps report released at the beginning of the summer, Puppet clearly defined the characteristics of world class DevOps organizations and the make-up of those lagging behind. According to Nigel Kersten, CIO of Puppet, there is a huge gap between organizations that get DevOps and are able to ship software on demand and “organizations that take days, weeks or even years to ship simple upgrades … and the gap is widening”. Where is your company on the spectrum? Is your company deploying 80 times per day like Etsy or thousands of times per day like Amazon? Is your company one of those that spends 50% less time remediating security issues than low performers, and 22% less time on unplanned work? How much time does your team have for building new code? Perhaps you don’t even know the exact answer […] Read more »

OnPage app helps prevent HIPAA violations and ransomware attacks

hospital ransomware attacks

OnPage helps thwart HIPAA violations Avoiding HIPAA violations and combating potential ransomware attacks are top of mind for many healthcare institutions. In theory, avoiding HIPAA violations should be straightforward as HIPAA requirements are very specific in what they do and don’t allow hospitals to do during communications with their staff physicians. HIPAA legislation clearly prohibits the use of devices that do not have: user authentication, data encryption, remote wipe capabilities, delivery and read receipts, date and time stamps, customized messaging retention time frames, and specified contact lists. And yet in spite of these specific requirements meant to help ensure patient privacy, hospitals are the biggest target of ransomware attacks. According to one article, 88 percent of all ransomware attacks target hospitals. Hospitals become vulnerable to ransomware because events as basic as a lost or stolen iPhone expose strategic information. The issue is best summarized by the following paragraph in a recent article from Becker Hospital […] Read more »

MTTR by the Numbers

How MTTR is key to effective business

Ignoring MTTR can cost you millions Peter Drucker was famous for saying that “If you can’t measure it, you can’t improve it.” That’s why knowing your MTTR (mean time to resolution) is so important. If you don’t know how long it takes for you to fix issues, you cannot improve on that time. Effectively, MTTR is an important shorthand for your team to know how well they are doing in acknowledging and responding to software, hardware or deployment failures. Here are 3 key ways your team can improve their MTTR: Identifying the root cause is usually the biggest cause of MTTR variability and the one that has the highest cost associated with it. Ensure that information goes to the right person.  When a monitoring system detects an issue and sends an email, use OnPage to make sure that the correct engineer is alerted. Have escalations enabled so that engineers can reach […] Read more »
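Since the point is that you cannot improve what you do not measure, here is a minimal sketch of the measurement itself: MTTR as the average of (resolved time minus detected time) over a set of incidents. The function name and the two-entry incident log are hypothetical, made up purely for illustration.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one resolved incident")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (time detected, time resolved)
incidents = [
    (datetime(2016, 11, 1, 2, 0), datetime(2016, 11, 1, 2, 30)),    # 30 minutes
    (datetime(2016, 11, 3, 14, 0), datetime(2016, 11, 3, 15, 30)),  # 90 minutes
]
mttr = mean_time_to_resolution(incidents)
```

For this made-up log the result is one hour. Tracking the same number week over week is what shows whether changes to routing and escalation are actually helping.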

What you need to know about MTTR and why IT MaTTeRs


What all engineering teams should know about MTTR In the IT world, performance is everything. So when technology fails, your first thought is how to utilize incident management knowledge to repair the situation and minimize downtime. As both a manager and an engineer, you need to minimize your MTTR (mean time to resolution) in order to comply with your SLAs (service level agreements) and keep your group at the top of its game. This article will highlight the issues impeding effective MTTR management and offer insights on how to improve use of MTTR as a metric. Who cares about MTTR I have put the importance of MTTR out there and have not defined to whom in particular the metric is important. But the truth is that just about everyone in engineering uses MTTR to measure how long it takes their teams to resolve an incident after it has […] Read more »

The secret to blameless post mortems

How your engineering teams can move past finger-pointing to effectively managing mistakes Sidney Dekker’s theory on ‘bad apples’ holds that complex systems would be fine if it were not for the erratic behavior of some unreliable people. According to this theory, when unexpected events are seen in an otherwise safe system, they are typically and conveniently assigned to “human error” and, when they are severe, to “operator carelessness”. Similarly, post mortems often look to define and parcel out blame to engineers. Yet this raises the question of how effective post mortems are if their only purpose is to assign blame. Instead, effective post mortems need to “acknowledge the human tendency to blame, to allow for a productive form of its expression, and constantly refocus the postmortem’s attention past it.” Post mortems vs retrospectives The problem with post mortems begins with the name “post mortem”, which if you ask […] Read more »

OnPage uses Tropo to create bilingual office

Spanish-language phone tree facilitates after-hours at doctor’s office OnPage and Cisco Tropo Back in February of this year, OnPage announced its integration with Cisco’s Tropo technology. Tropo’s cloud-based API platform enables OnPage to embed real-time communications within the OnPage critical alerting application through the use of voice commands. Tropo makes it simple for OnPage’s developers to use the Cisco capabilities to create new experiences. CEO Judit Sharon knew that this was a great ‘a-ha’ moment as it would allow many unexpected and serendipitous integrations. But what Judit did not realize was quite the way in which that serendipity would unfold. What we’ve got here is a failure to communicate At the beginning of the summer, OnPage received a call from the offices of Dr. Hector Lopez in Milwaukee, Wisconsin. Eric Eickhorst, the office manager at Milwaukee Family Practice, called OnPage as he was trying to find a way to help […] Read more »

Bringing Dev and Ops together with on-call groups

Make Dev and Ops better together by building empathy with on-call groups Create Effective Schedules Much has been written on the tension that often exists between Dev and Ops teams in an organization. All too frequently, Devs are focused on rapid prototyping and creating code while Ops are focused on keeping the ship stable and making as few changes as possible. When I was at the DevOps Boston Conference last week, much of the “hallway conference” was devoted to conversations on how to build empathy between these frenemies and reduce the opposition between them. How can Dev and Ops become less siloed? How can management encourage cross-pollination? One important psychological realization was that in order to create empathy between these two groups and ensure an effective group dynamic, the teams need to spend more time walking in one another’s shoes. One strong and significant step that can […] Read more »

Why Pagers Suck!

HIPAA Violations are only the beginning If you’re a healthcare professional, you probably use pagers to communicate with your office and with others in your practice. But did you know that using a pager could cost your office $650K for a HIPAA violation? That seems like a lot of money to spend for the liberty of using a $5 device. HIPAA Requirements According to the HIPAA regulations, healthcare organizations using pagers must: ensure that all communications are encrypted; ensure that a system of message accountability is implemented; enable remote removal of messages from a pager to protect the integrity of PHI in the event of a pager being lost or stolen; enable a process for user identification on each device; and enable an automatic log-out facility to prevent unauthorized access to PHI when a smartphone is left unattended. Unless you have borrowed a pager from the future, your pager doesn’t meet any […] Read more »

Critical Alerts Without a Data Package

When the Wi-Fi magical garden is gone and the message must get through Most of us don’t even think twice about internet connectivity. We get it at work or Starbucks or at home. It’s one of those low-cost goods in our society which can be produced in vast quantities as needed with minimal effort. And while $30 per month might seem pretty close to free, many consumers and business users outside the United States don’t live in our magical Wi-Fi garden. Got tech support in India? So, imagine instead that you are a company with tech support in India, where nearly 900 million people do not have access to the internet. Or perhaps you have support in the Philippines, which has the third lowest average connection speed in the region. Further, imagine that you have a critical alert you need to get out to your engineers about a downed server […] Read more »

No Data Plan? No Problem. OnPage’s Page to Phone Feature Ensures You’re Never Out of Touch

Maintain Your On-Call Schedule No Matter What Today, OnPage announced the release of its Page to Phone feature. This feature provides customers and IT management with phone redundancy for the existing OnPage alert. The OnPage alert already provides continuous and prominent notification for up to 8 hours until the message is read. With Page to Phone, OnPage provides customers who don’t have access to the data channel an additional way to get notified. The feature allows customers to receive a phone call that they can acknowledge via smartphone and mark as ‘read’ in the audit trail. How Page to Phone Works The Page to Phone redundancy feature is easily initiated and enabled with a checkbox in the OnPage enterprise console, where you can also choose Email and SMS redundancy messaging. In the interface, you provide the OnPage ID, name, phone number and email address to which the alert should be sent. You […] Read more »

See You at Boston DevOps Days 2016

OnPage is proud to be a Gold Sponsor at DevOps Days Boston, taking place at the Boston Park Plaza on August 25-26. If you’re attending, please stop by and get one of our very cool t-shirts and enter to win one of the many prizes we’re raffling off. If you want to schedule a time to chat or see some of our newest features, please sign up for a meeting with us. We’d love to speak with you. Uniqueness of Boston DevOps Days – where to start When Patrick Debois and Andrew Shafer coined the term DevOps back in 2009, they probably did not expect their concept for bringing together agile development and code deployment to become a worldwide phenomenon. However, today cities around the world host DevOps conferences and each conference has its own flavor. Boston’s attendees are, not surprisingly, from the high tech and biotech communities. Attendees from […] Read more »

50 Ways to leave your pager: Why it’s time to think pager replacement.

Why is it time to think about pager replacement? Perhaps the need for a pager alternative is clear to the many who realize the inefficiency and wasted time caused by using the device. These facts are highlighted in a 2014 article entitled Why Are Hospitals Still Using Pagers?, in which Cliff McClintick writes: “If you read the marketing copy of nearly any large hospital or health system, you’ll see references to “state-of-the art equipment” and “cutting-edge technology.” … [C]onsumers have a right to expect the claim to be true. But if they or their family members experience delays in care because a doctor hasn’t returned a page, their faith may be shaken.” Moreover, the beep on a pager doesn’t rise above the constant alarms and sounds of other machines in hospitals or data centers. And often pager holders can sleep through their alarms. At OnPage, we have thought long and hard about how […] Read more »

Why proper incident management is key to proper IT management

Proper IT management requires proper incident management. Otherwise, you court Murphy’s law at your peril. In the IT world, if a server can fail, a cache can overflow, or traffic can overload the network, it will. And the consequences are significant. Many IT organizations face database, hardware, and software downtime, ranging from brief outages to shutdowns that halt the business for days. According to a January 2016 article in Network Computing on the high price of IT downtime, organizations face: “an average of five downtime events each month, with each downtime event being expensive indeed: from $1 million a year for a typical midsize company to more than $60 million for a large enterprise.” The major cause of this downtime is equipment failure, which accounts for almost 40% of downtime. The second most frequent cause of downtime is human error, which accounts for 25% of downtime. Cybersecurity accounts for only about 10% of […] Read more »