Friday, October 23, 2020

Modernize Data Reporting and Management with Azure Databricks


Business Case:

Our client, a global software giant, relies on Power BI reports to compare its product performance against competitors. The report data is highly confidential and is accessed by only 100 members of our client’s international marketing and operations teams. Because of the industry value of this data, security is a high-priority concern.

Our client wanted to reduce data processing costs and avoid data replication. Our client previously used SQL to store and process data, but had recently moved most upstream data sets and operational systems to Azure-based solutions. To ensure continuity, we proposed a new system architecture based on Azure Databricks (ADB) to process the data. ADB allows developers to mount and transform data from the source, meaning data is processed without being replicated.

In partnering with us, our client’s goal was to migrate data processing from SQL to ADB without interrupting day-to-day business operations.

Key Challenges:

  Migrate 10+ data sources from SQL to ADB while maintaining business continuity
  Implement high-level security that restricts access based on business category and location

Our Solution:

We migrated our client from an on-premises SQL data processing system to a scalable Azure Databricks environment that can access numerous Azure-based data sources. 

Figure 1: Solution Architecture

We built the new ADB solution on top of the existing system. To maintain business continuity, we migrated our client in phases. During phase one, we established a basic system with all data streams. Then, we migrated our client’s eight Power BI reports one by one. All reports use an optimized star schema that reduces dataset size and report load time.
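To illustrate why a star schema keeps datasets small, here is a minimal Python sketch. It is not our client's actual model: the table names, columns, and values are all hypothetical. The idea is that descriptive attributes live once in small dimension tables, while the large fact table holds only keys and measures.

```python
from collections import defaultdict

# Hypothetical dimension tables: one entry per entity, keyed by a surrogate id.
dim_product = {
    1: {"name": "Product A", "vendor": "client"},
    2: {"name": "Product B", "vendor": "competitor"},
}
dim_region = {
    10: {"region": "EMEA"},
    20: {"region": "APAC"},
}

# Hypothetical fact table: narrow rows that carry only keys and a measure.
fact_sales = [
    {"product_id": 1, "region_id": 10, "units": 120},
    {"product_id": 2, "region_id": 10, "units": 95},
    {"product_id": 1, "region_id": 20, "units": 80},
    {"product_id": 1, "region_id": 10, "units": 30},
]


def units_by_vendor_and_region(facts, products, regions):
    """Resolve each fact row's keys against the dimensions and sum the units."""
    totals = defaultdict(int)
    for row in facts:
        vendor = products[row["product_id"]]["vendor"]
        region = regions[row["region_id"]]["region"]
        totals[(vendor, region)] += row["units"]
    return dict(totals)


if __name__ == "__main__":
    for key, units in units_by_vendor_and_region(
        fact_sales, dim_product, dim_region
    ).items():
        print(key, units)
```

Because the product and region names are stored once rather than on every fact row, the dataset stays compact as the fact table grows.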

The ADB platform is scalable and offers a data management interface that can connect to various Azure data sources. Reports are updated daily using an import model that refreshes back-end data on a regular schedule. The ADB environment shuts down automatically when it is not in use, which keeps the daily refresh easy and cost-effective.
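As a rough sketch of how automated shutdown can be configured, the Python snippet below builds a cluster definition of the kind accepted by the Databricks Clusters API, where the `autotermination_minutes` field sets how long a cluster may sit idle before it is terminated. The cluster name, runtime version, node type, worker count, and idle threshold are assumptions chosen for illustration, not the values used in this project.

```python
import json


def build_cluster_spec(idle_minutes: int) -> dict:
    """Return a cluster definition that terminates after `idle_minutes` of inactivity."""
    # In the Clusters API, 0 disables auto-termination, so this sketch rejects
    # non-positive values to guarantee that shutdown stays enabled.
    if idle_minutes <= 0:
        raise ValueError("idle_minutes must be positive to enable auto-termination")
    return {
        "cluster_name": "daily-report-refresh",  # hypothetical name
        "spark_version": "7.3.x-scala2.12",      # assumed runtime version
        "node_type_id": "Standard_DS3_v2",       # assumed Azure VM size
        "num_workers": 2,                        # assumed cluster size
        "autotermination_minutes": idle_minutes,
    }


if __name__ == "__main__":
    # Print the JSON payload that would be sent when creating the cluster.
    print(json.dumps(build_cluster_spec(20), indent=2))
```

With a setting like this, a cluster started for the daily refresh stops on its own once the job has finished and the idle period has elapsed.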

The reports offer high-level, role-based security. We enabled Power BI’s row-level security (RLS) on all reports to ensure users can only access data within their own geography and business area.
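In Power BI, RLS is defined as role filters on the data model. The Python sketch below is not that implementation; it only illustrates the underlying logic under assumed names. The user addresses, geographies, business areas, and rows are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Access(NamedTuple):
    """One (geography, business area) combination a user may see."""
    geography: str
    business_area: str


# Hypothetical mapping from user to permitted combinations.
USER_ACCESS: Dict[str, List[Access]] = {
    "ana@example.com": [Access("EMEA", "Marketing")],
    "raj@example.com": [
        Access("APAC", "Operations"),
        Access("APAC", "Marketing"),
    ],
}

# Hypothetical report rows.
REPORT_ROWS = [
    {"geography": "EMEA", "business_area": "Marketing", "pipeline": 40},
    {"geography": "EMEA", "business_area": "Operations", "pipeline": 25},
    {"geography": "APAC", "business_area": "Marketing", "pipeline": 30},
    {"geography": "APAC", "business_area": "Operations", "pipeline": 15},
]


def visible_rows(user: str, rows: List[dict],
                 access_map: Dict[str, List[Access]]) -> List[dict]:
    """Return only the rows whose geography and business area the user may see."""
    # Users missing from the map get an empty set, so access is denied by default.
    allowed = set(access_map.get(user, []))
    return [
        row for row in rows
        if Access(row["geography"], row["business_area"]) in allowed
    ]


if __name__ == "__main__":
    for row in visible_rows("ana@example.com", REPORT_ROWS, USER_ACCESS):
        print(row)
```

The deny-by-default behavior matters for confidential data: a user with no assigned role sees no rows rather than all of them.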

Business Outcomes:

Migrating our client from SQL to Azure Databricks (ADB) enabled faster, more cost-efficient access to critical marketing and operations data. Our client’s previous system used multiple expensive virtual machines (VMs). By migrating to an ADB solution, we reduced our client’s costs by 50%.

ADB is optimized for our client’s Azure-based upstream data sources, enabling seamless data management throughout the pipeline. ADB is built to last and can be easily scaled to meet growing business needs.

We migrated eight Power BI reports to our client’s new system. These reports offer key insights into the business pipeline, opportunities, and competitors. Power BI’s row-level security (RLS) capabilities ensure role-based access to confidential information. In addition, our solution reduced report refresh time by 50%, from 40 minutes to 20 minutes, giving business leaders quicker access to key insights.

Highlights:

    Migrated eight Power BI reports and 10+ sources from SQL to Azure Databricks (ADB)
    Reduced costs by 50%
    Reduced report refresh time by 50% (40 minutes to 20 minutes)
    Secured confidential information with row-level security (RLS)