Tuesday, June 30, 2020

A Unified Interface that Monitors and Tracks Data Operations


Business Case:

IT project maintenance and continuance processes comprise multiple tasks that require regular monitoring. Monitoring and tracking are especially important when projects deal with real-time data.

A typical DataOps project might require tracking job statuses, database health, database access, server health, CPU usage, team effort, open issues, and much more.

Monitoring becomes even more complicated when multiple development teams work on the same project, handling different streams and phases. Their data, resource, and deployment-level interdependencies necessitate a high degree of collaboration. Tracking and monitoring these details manually is exhausting, inefficient, and costly. Factors like refresh frequencies, requisite SLAs, and shifting stakeholder priorities complicate project structures further.

Solution:

We integrated our production infrastructure with Power BI and PowerApps to create a single view of all these parameters. Our goals were to:

1.    Simplify the monitoring and tracking interface
2.    Reduce manual intervention points
3.    Reduce process complexity
4.    Improve collaboration among dependent teams
5.    Fix recurrent issues and streamline data flow

Solution Design:

We used Microsoft Power BI to create reports using real-time data from SQL Server applications to track:

  Job running status for on-premises and cloud services (completed, in progress, failed, scheduled, disabled)
  Job history 
  Database status (health, stale tables, job trends, data latency, database access) 
  Server status (health, memory usage, CPU usage) 

Figure 1: Solution Design
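
As a rough illustration of the kind of roll-up such a report performs, the sketch below counts jobs per status and flags stale tables. It is plain Python, not Power BI or SQL Server code. The status names come from the list above; the function names, the record layout, and the 24-hour staleness threshold in the usage note are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Status values taken from the job-status list in this article.
JOB_STATUSES = ("completed", "in progress", "failed", "scheduled", "disabled")


def summarize_jobs(jobs):
    """Count jobs per status. `jobs` is a list of dicts with a "status" key
    (a hypothetical record layout, not the actual report schema)."""
    counts = {status: 0 for status in JOB_STATUSES}
    for job in jobs:
        status = job["status"]
        if status not in counts:
            raise ValueError(f"unknown job status: {status!r}")
        counts[status] += 1
    return counts


def stale_tables(last_refreshed, now, max_age):
    """Return the sorted names of tables whose last refresh is older than max_age.
    `last_refreshed` maps table name -> datetime of the last refresh."""
    return sorted(
        name for name, refreshed_at in last_refreshed.items()
        if now - refreshed_at > max_age
    )
```

For example, calling `stale_tables(last_refreshed, datetime.now(), timedelta(hours=24))` would list every table that has not been refreshed in the past day. The right threshold depends on each stream's refresh frequency and SLA.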

We created a PowerApps-backed application to register issues and track their status (open vs. resolved). Our solution lets users enter the cause of an issue and the action taken to resolve it. Users can also track the effort team members put into resolving those issues.
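
The shape of an issue record described above can be sketched as a small data class. This is a hypothetical Python model for illustration only; it does not reflect how PowerApps stores data, and the class name, field names, and the choice of hours as the unit of effort are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Hypothetical issue record: status, cause, action taken, and effort."""
    title: str
    status: str = "open"          # "open" or "resolved", as in the article
    cause: str = ""
    action_taken: str = ""
    effort_hours: dict = field(default_factory=dict)  # team member -> hours

    def log_effort(self, member, hours):
        """Add hours spent by a team member on this issue."""
        self.effort_hours[member] = self.effort_hours.get(member, 0.0) + hours

    def resolve(self, cause, action_taken):
        """Record the cause and the fix, then mark the issue resolved."""
        self.cause = cause
        self.action_taken = action_taken
        self.status = "resolved"

    def total_effort(self):
        """Total hours logged across all team members."""
        return sum(self.effort_hours.values())
```

Keeping cause and action on the same record as the effort log is what makes later root-cause review possible: recurrent issues can be grouped by cause and weighed by the effort they consumed.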

Business Outcome:

A single-point view reduced the time and complexity of tracking and monitoring 40+ streams, 250+ jobs, 150+ databases, and 70 servers in real time. It also enabled better collaboration across the dependent teams.

Our solution lets us review past incidents and analyze, identify, and isolate the root causes of recurrent issues and process bottlenecks, so we can fix them with less manual effort. By limiting manual touchpoints, we improved our efficiency by over 60%. Our solution also optimized resource utilization, lowered the chance of error, and improved quality control.

Highlights:

1.    Our unified interface improved overall efficiency by over 60%, with optimized resource utilization, lower chances of error, and better quality control.
2.    The single-point view reduced the time and complexity of tracking and monitoring streams, jobs, databases, and servers.