Friday, September 28, 2018

How We Improved Production Alerts with Webhooks and Microsoft Teams

Microsoft Teams is a workplace collaboration app that combines chat, task planning, notes, meetings, and attachments. We use Teams in many areas of our workplace, from UX to HR.

Recently, we set out to use Teams to alert production engineers to P0 development issues, also known as “blockers.” Specifically, we wanted alerts that would tell developers when the Azure VM environment was running critically low on memory.

In the past, we relied on email alerts. If a team member missed the email, however, the project stalled. We needed a system that would notify team members on their phones as soon as a P0 issue arose.

To create the alert system, we used Teams' built-in webhook integration. A webhook is an event-triggered HTTP callback: when something happens in one app, that app sends a request to a URL exposed by another, so the two can exchange information in real time.
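
As a rough illustration of what a webhook call looks like in practice, the Python sketch below builds the HTTP request that would post a plain-text message to a Teams incoming webhook. The URL is a placeholder and the function names are hypothetical; the only assumption about Teams itself is that an incoming webhook accepts a JSON body with a "text" field.

```python
import json
import urllib.request

# Placeholder: Teams generates the real URL when an incoming webhook
# is added to a channel.
WEBHOOK_URL = "https://example.invalid/webhook/placeholder"


def build_webhook_request(url, message):
    """Build (but do not send) a POST request carrying a plain-text message."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_webhook(url, message, timeout=10):
    """Send the message and return the HTTP status code."""
    request = build_webhook_request(url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the request-building step separate from the sending step makes the payload easy to inspect without touching the network.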

To integrate webhooks with Teams, we first created a separate Teams channel for production alerts. Next, we wrote a script that monitored server memory usage. The script scanned server performance logs and generated an alert when available memory fell below 10 GB. The alert triggered an immediate webhook warning in the Teams channel, which notified all team members of the issue via mobile push notification.
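
The monitoring step can be sketched roughly as follows. This is not the script shown in Figure 1; it is a minimal Python illustration that assumes a hypothetical log format of one "timestamp,available_mb" entry per line and applies the 10 GB threshold mentioned above. The function name and the message wording are made up for the example.

```python
THRESHOLD_GB = 10.0  # alert when available memory falls below this


def low_memory_alerts(log_lines, threshold_gb=THRESHOLD_GB):
    """Return one alert message per log entry below the threshold.

    Assumes each line looks like "2018-09-28T10:15:00,8192", where the
    second field is available memory in MB (a hypothetical format).
    """
    alerts = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        timestamp, available_mb = line.split(",", 1)
        available_gb = float(available_mb) / 1024
        if available_gb < threshold_gb:
            alerts.append(
                f"P0: available memory is {available_gb:.1f} GB at {timestamp} "
                f"(threshold {threshold_gb:.0f} GB)"
            )
    return alerts


sample_log = [
    "2018-09-28T10:00:00,16384",  # 16 GB, above the threshold
    "2018-09-28T10:05:00,8192",   # 8 GB, below the threshold
]
for alert in low_memory_alerts(sample_log):
    print(alert)
```

Each message returned this way could then be posted to the alerts channel through the webhook.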

Figure 1: Memory usage script
Figure 2: Webhook warning

We subsequently developed additional alerts for server failures and disk usage. Webhook integration with Teams can be expanded in many other ways; alerts can even be custom-formatted with integrated action buttons. We’ve found the alerts to be an invaluable tool that lets us start resolving P0 issues almost as soon as they arise.
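
To make the custom formatting concrete, here is a hedged Python sketch of an alert payload with one action button. It assumes the legacy MessageCard format accepted by Office 365 connectors; the function name, title, text, color, and dashboard URL are invented for illustration, and newer setups may need Adaptive Cards instead.

```python
import json


def build_alert_card(title, text, button_label, button_url):
    """Build a MessageCard payload with a single "open link" button."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "summary": title,
        "themeColor": "D70000",  # red accent bar; any hex color works
        "title": title,
        "text": text,
        "potentialAction": [
            {
                "@type": "OpenUri",
                "name": button_label,
                "targets": [{"os": "default", "uri": button_url}],
            }
        ],
    }


card = build_alert_card(
    title="P0: low memory",
    text="Available memory has fallen below 10 GB.",
    button_label="Open dashboard",
    button_url="https://example.invalid/dashboard",  # hypothetical URL
)
print(json.dumps(card, indent=2))
```

The resulting dictionary would be serialized to JSON and posted to the channel's webhook URL in place of a plain-text message.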