Friday, June 28, 2019

Case Study: Migrating a Partner Sales Attribution System to the Cloud



Key Challenges

   Improve data processing time.
   Increase scalability and on-demand availability.
   Improve infrastructure management.

Moving to the Cloud

We recently implemented a partner sales attribution system for a large software supplier. The system accommodated over 20,000 daily visits from 3,000 daily users and consolidated data from over 50 sources. However, it faced challenges in four areas:

   Slow data processes and refreshes.
   Poor server performance.
   Inefficient infrastructure management.
   Inefficient infrastructure patches.

To overcome the challenges, we decided to transition the sales attribution system to the cloud.

Creating the Cloud Architecture

To solve the four challenges, we studied cloud assets and executed a series of proofs of concept. We explored Azure Data Lake Storage, Azure Data Lake Analytics, Azure Analysis Services, Azure Data Warehouse, and Azure SQL Database. The proofs of concept evaluated how well each asset met our requirements. We benchmarked the results against the existing on-premises setup. We then conducted a cost-benefit analysis of the proposed cloud migration.


Following the proof-of-concept stage, we began the platform migration. We staged the data in Azure Data Lake Storage (ADLS) using Azure Data Factory. We processed the data using Azure Databricks. The data then moved to Azure Data Warehouse (ADW) for downstream users. We applied internal processing requirements with security-based access principles to the dataset. We also implemented a just-in-time (JIT) processing framework to process tabular models on top of ADLS and ADW. The framework enabled the independent integration, processing, and publishing of the assets to the reporting layer. The reporting (mart and tabular) landscape supported testing by data scientists and met power user requirements via Power BI and Excel. Users could also export data automatically to PowerPoint presentations, Outlook, and Excel reports.
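The internals of the JIT processing framework are not described here. One plausible reading is that a tabular model is processed only once every upstream asset it depends on has refreshed since the model's last run. The Python sketch below illustrates that idea only; the asset names, model names, and the models_ready helper are hypothetical and are not taken from the actual system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class TabularModel:
    name: str
    depends_on: List[str]       # upstream asset names (hypothetical)
    last_processed: datetime    # when this model was last processed


def models_ready(models: List[TabularModel],
                 asset_refreshed_at: Dict[str, datetime]) -> List[str]:
    """Return the names of models whose upstream assets have all
    refreshed since the model was last processed."""
    ready = []
    for model in models:
        timestamps = [asset_refreshed_at.get(a) for a in model.depends_on]
        if all(ts is not None and ts > model.last_processed for ts in timestamps):
            ready.append(model.name)
    return ready


# Illustrative data only: two assets and two models with made-up timestamps.
refreshed = {
    "adls_partner_sales": datetime(2019, 6, 28, 6, 0),
    "adw_attribution": datetime(2019, 6, 28, 7, 30),
}
models = [
    TabularModel("sales_mart", ["adls_partner_sales", "adw_attribution"],
                 datetime(2019, 6, 28, 5, 0)),
    TabularModel("referral_mart", ["adw_attribution"],
                 datetime(2019, 6, 28, 8, 0)),
]
print(models_ready(models, refreshed))  # ['sales_mart']
```

In this sketch, referral_mart is skipped because its only upstream asset has not refreshed since the model was last processed, so no work is done until fresh data arrives.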

Our cloud architecture saved money and surpassed client expectations. The distributed processing power of the cloud enabled faster processing. The serverless architecture eliminated server dependency among the streams. The cloud’s serverless, scalable, and distributed architecture reduced infrastructure management costs. The distributed architecture also made infrastructure patching more efficient.

Migration Benefits

Moving to the cloud resulted in four advantages:

   Reduced data latency.
   Reduced maintenance, execution, and mounting costs.
   Near real-time reporting.
   Comprehensive cloud analytics.

Reduced Data Latency: Our client critically needed reduced data latency. To respond to changing market conditions, our client needed fresh data. Low data latency allows organizations to execute strategic business decisions more quickly. Low data latency also allows employees to generate ad hoc reports with up-to-date data.

Our client’s partner sales attribution system serves over 60,000 partners every day. Each day, 35 terabytes of data pass through the system. With the on-premises sales attribution system, data required anywhere from 18 to 26 hours to update. After we introduced three major technical improvements (a master pipeline architecture, the JIT processing framework, and in-house Azure Cloud best practices), data now requires only 4 to 12 hours to update. The master pipeline architecture enables modular and independent data refreshes. The JIT processing framework improves data processing speed and latency. The in-house Azure Cloud best practices refined the architecture and implementation. The best practices also improved the DataOps team’s monitoring and debugging capabilities.
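To make "modular and independent data refreshes" concrete, the following Python sketch shows one way a master pipeline could run each refresh module on its own, so that a failure in one module is recorded without blocking the rest. This is an illustration under assumptions, not the actual implementation: the run_master_pipeline function and the three refresh modules are hypothetical names.

```python
from typing import Callable, Dict


def run_master_pipeline(modules: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run each refresh module independently. A failure in one module
    is recorded in the result but does not stop the remaining modules."""
    results = {}
    for name, refresh in modules.items():
        try:
            refresh()
            results[name] = "succeeded"
        except Exception as exc:
            results[name] = f"failed: {exc}"
    return results


# Hypothetical refresh modules standing in for real data refresh steps.
def refresh_sales() -> None:
    pass  # placeholder: a real module would load and transform data


def refresh_referrals() -> None:
    raise RuntimeError("source unavailable")  # simulated failure


def refresh_partners() -> None:
    pass  # placeholder


status = run_master_pipeline({
    "sales": refresh_sales,
    "referrals": refresh_referrals,
    "partners": refresh_partners,
})
print(status)
```

Here the simulated failure in the referrals module is reported in the status dictionary while the sales and partners modules still complete, which is also the kind of per-module status a monitoring team could inspect.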

Optimized Infrastructure: Moving to the cloud resulted in significant cost savings. The client’s former on-premises system incurred significant maintenance, mounting, and execution costs. The new cloud architecture adopted a pay-as-you-go structure. Instead of provisioning unnecessary resources, we implemented an efficient architecture using minimum resources. Our client reduced infrastructure costs by 15% and improved data latency by 60%.
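As a rough cross-check of the latency figure, the short Python calculation below works out the reduction implied by the refresh windows quoted earlier (18 to 26 hours before, 4 to 12 hours after). How the reported 60% was actually measured is not stated, so pairing the shortest windows together, the longest windows together, and the midpoints together is an assumption made only for illustration.

```python
before_hours = (18, 26)  # on-premises refresh window, from the text
after_hours = (4, 12)    # cloud refresh window, from the text


def reduction(before: float, after: float) -> float:
    """Fractional reduction in refresh time."""
    return (before - after) / before


# Assumed pairings: shortest with shortest, longest with longest, midpoints.
best_case = reduction(before_hours[0], after_hours[0])    # 18 h -> 4 h
worst_case = reduction(before_hours[1], after_hours[1])   # 26 h -> 12 h
midpoint = reduction(sum(before_hours) / 2, sum(after_hours) / 2)  # 22 h -> 8 h

print(f"{worst_case:.0%} to {best_case:.0%}, midpoint {midpoint:.0%}")
# prints: 54% to 78%, midpoint 64%
```

Under these pairings, the reported 60% improvement falls inside the implied range of 54% to 78%.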



Near Real-Time Reporting: Cloud assets allowed our client to selectively implement near real-time reporting for time-sensitive reports. These reports included a suite of sales performance reports, which gave our client’s executive team an up-to-the-minute understanding of sales performance across the company’s major sales divisions.

Comprehensive Cloud Analytics: Currently, we are exploring analytics options for the partner sales attribution system. Cloud analytics offers our client three advantages. First, cloud analytics facilitates strategic business decisions. It categorizes vast quantities of data and makes that data available to users via their web browsers, so users can drive their businesses with insights derived from the data. Second, cloud analytics provides a landscape for data scientists to derive additional insights. Cloud assets already deliver data, but cloud analytics enables data scientists to exhaustively test their own hypotheses. Third, cloud analytics enables platform-independent support for BI tools, including the Power BI and Excel reporting that power users rely on.

The cloud migration of our client’s partner sales attribution system concluded with significant gains for our client. Refresh cycles and data availability efficiency improved by a factor of 2.3. Data consistency benefited from the newly created single source of truth for all the reporting layers. Near real-time data achieved a 15- to 20-minute publishing cycle for referral data. Infrastructure costs fell after we automated maintenance, security compliance, alert monitoring, and updates.