Thursday, September 24, 2020

Reduce Cloud Costs with Proactive Anomaly Detection


Business Case:

Our client, a leading software company, employs a worldwide team of cloud solution consultants. The consultants offer scalable cloud packages to customers of varying business size. To deliver cost-effective packages to customers, the consultants needed real-time cloud consumption insight.

Key Challenges:

  Visualize real-time customer cloud consumption activity and trends  
  Proactively detect critical spikes and reductions in cloud usage 
  Report usage changes to consultants 
  Ensure data insights are easily accessible

Our Solution:

We built an anomaly detection model that identifies spikes and reductions in cloud usage and reports the fluctuations to consultants.

Figure 1: Solution Architecture

Anomaly Detection Model Criteria:


  Removed accounts with insignificant usage.
  Performed exploratory statistical techniques.
  To account for variance, flagged accounts with high standard deviation.
  Set logic to define usage spike value.
  Performed trend analysis on last 12 months’ data and identified a threshold value for each service within overall cloud usage.
  Set a spike notification to occur when the usage value is greater than the threshold value of the previous week’s usage.
  Implemented seasonality check to avoid email spam.

When the model detects a spike, the model assigns a support ticket to the consultant in Power Apps. Geographical spike data is featured in a Power BI dashboard. Consultants and managers can respond in Power Apps using email or built-in messaging.

To automate email notifications, we used Microsoft Azure technologies: Azure Data Factory, Azure Databricks, Azure Data Lake Storage, Azure SQL DB, MS Flow, and Outlook. The business logic is handled using Python notebook on Azure Databricks and reporting logic is handled using dynamic HTML code.

Figure 2: Email Framework

Business Outcomes:

Case 1: One of our client’s new customers, a massive telecommunication company, was paying high costs for premium cloud capabilities. Our solution detected low activity, alerting the consultant that the customer had been overprovisioned. The consultant scaled back the customer’s package, saving the customer 50% in costs.

Case 2: Throughout the COVID-19 crisis, our solution detected significant changes in customer consumption trends as remote work increased. Consultants were able to proactively deliver solutions to new customers with remote needs.

Case 3: Our tool detected activity spikes in previously unmonitored accounts, alerting consultants to new customer engagement and revenue opportunities.

Highlights:

  Built an anomaly detection model that alerted cloud consultants of spikes and reductions in customer cloud usage
  Enabled cloud consultants to proactively support customers, reduce unexpected customer costs, and increase customer satisfaction
  Developed a one-stop solution for cloud consumption insight and communication