Wednesday, July 17, 2019

Case Study: Sales Platform Improves Sales Manager Productivity



Key Challenges

   Automate product categorization in profit and loss hierarchy.
   Integrate sales and budget platforms.
   Reduce data processing time.

High Sales Volume Drives Need for Automation

Our client, a large software company, runs a chain of successful online stores and retail outlets. The stores generate millions of dollars in revenue each day. Because of the enormous sales volume, sales data was difficult to visualize and track. Without an easy way to interpret the data, the client could not resolve problem areas or improve customer relationships.

In 2014, our client approached us with a straightforward request: the client needed an easier way to track sales data, and sales managers needed a platform for sales projections and forecasting. Sales managers spent significant time on four tasks. First, sales managers manually categorized products into a profit and loss hierarchy. Manually creating the hierarchy meant the chances of data discrepancies were high. Second, sales managers manually tabulated cost calculations (such as cost of goods sold) because the client’s upstream data source did not provide the current cost for most products. Third, sales managers reported sales on one platform and budgeted on another. As a result, the field budget for sales managers’ stores often differed from the number provided on the budgetary platform. Finally, sales managers could not track daily progress toward targets.

Automating Sales Manager Tasks

We devised multiple approaches to solve the sales managers’ dilemmas. We completed several proofs of concept and tracked results. We then approached each of the client’s four business problems individually.

To alleviate the need to manually create profit and loss hierarchies, we identified a better method to manage and store the data. We used Excel as a configuration model, so users could independently manage configuration changes. We then identified and selected the best scalable design. With our client’s continually expanding operations, scalability was critical. Lastly, we created scripts to automatically create the profit and loss hierarchies. The scripts automatically registered and categorized new products as needed, eliminating the risk of data discrepancies.
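
The case study does not include the scripts themselves. As a minimal sketch of the approach, the Python snippet below categorizes a sales feed against an Excel configuration workbook; the file names and column names are hypothetical, not the client’s actual schema.

```python
import pandas as pd

# Hypothetical file names and column layouts; the actual configuration
# workbook and sales feed from the engagement are not public.
CONFIG_PATH = "pl_hierarchy_config.xlsx"   # columns: product_name, pl_level1, pl_level2
SALES_PATH = "daily_sales.csv"             # columns: product_id, product_name, revenue

def build_hierarchy(config_path: str, sales_path: str) -> pd.DataFrame:
    """Assign every product in the sales feed to a profit and loss node."""
    config = pd.read_excel(config_path)
    sales = pd.read_csv(sales_path)

    # Left-join so brand-new products are kept even when no rule matches yet.
    categorized = sales.merge(config, how="left", on="product_name")

    # Flag unmatched products for registration instead of silently dropping
    # them, which is what removes the risk of data discrepancies.
    unregistered = categorized[categorized["pl_level1"].isna()]
    if not unregistered.empty:
        unregistered.to_csv("unregistered_products.csv", index=False)

    return categorized

if __name__ == "__main__":
    hierarchy = build_hierarchy(CONFIG_PATH, SALES_PATH)
    print(hierarchy.groupby(["pl_level1", "pl_level2"])["revenue"].sum())
```

Because the hierarchy lives in an Excel workbook, business users can add or rename categories without touching the script, which is what allowed configuration changes to be managed independently.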

To ensure sales managers no longer needed to calculate costs, we created automatic cost rules. We created different cost rules for each level of the product hierarchy. To prevent duplication, we associated each product with only one cost categorization. To store the data, we selected a scalable design.
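
A rule-per-hierarchy-level lookup can be sketched roughly as follows; the category names, formulas, and rates here are purely illustrative, not the client’s actual cost rules.

```python
import pandas as pd

# Illustrative only: the category names, formulas, and rates are hypothetical.
COST_RULES = {
    "Hardware": lambda row: row["units_sold"] * row["unit_cost"],
    "Software": lambda row: row["revenue"] * 0.15,      # assumed flat COGS rate
    "Services": lambda row: row["labor_hours"] * 42.0,  # assumed hourly cost
}

def apply_cost_rules(products: pd.DataFrame) -> pd.DataFrame:
    """Apply exactly one cost rule per product, keyed on its hierarchy level."""
    def cost_for(row: pd.Series) -> float:
        rule = COST_RULES.get(row["pl_level1"])
        if rule is None:
            raise ValueError(f"No cost rule defined for {row['pl_level1']!r}")
        return rule(row)

    products = products.copy()
    products["cost_of_goods_sold"] = products.apply(cost_for, axis=1)
    return products
```

Keying every product to a single rule mirrors the one-categorization-per-product constraint that prevents duplicated costs.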

Next, we addressed the sales managers’ budgetary problems. We implemented a breakout view for budgeting and forecasting customized for each store. We first identified several metrics shared between the sales platform and the budgetary platform. Then, we used our refined understanding of the two platforms to determine how to split the margin between them. To formalize the split, we created multiple algorithms and shared them with our client for review. Following the review, we implemented our client’s chosen algorithm, eliminating the budgetary discrepancy.
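
The algorithms themselves are not described in the case study. As a hedged illustration only, one candidate approach is a proportional split over a metric both platforms report:

```python
def split_margin(total_margin: float, sales_metric: float,
                 budget_metric: float) -> tuple[float, float]:
    """Split a store's margin between the sales and budgetary platforms in
    proportion to a metric both platforms report (an illustrative rule only)."""
    combined = sales_metric + budget_metric
    if combined == 0:
        return 0.0, 0.0
    sales_share = total_margin * sales_metric / combined
    return sales_share, total_margin - sales_share

# Example: a $10,000 margin split on a shared metric reported as 6,000 vs. 4,000.
print(split_margin(10_000, 6_000, 4_000))  # (6000.0, 4000.0)
```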

Finally, to improve daily planning, we implemented a daily breakout view for budgeting and forecasting. We first identified several metrics to measure daily split data. We used machine learning algorithms to ensure we accurately tracked the data. We then shared the algorithms with our client for review. After our client’s review, we applied the best algorithm. The completed daily breakout view enabled sales managers to better track daily progress.

Building a Data Pipeline

After we completed the data forecasting and sales projection platform, our client asked us to create a data pipeline for store analysis. We created the data pipeline to address three key problems. First, we needed to reduce overall data processing time. Slow refresh times meant sales managers didn’t receive timely reports. Without timely reports, the managers couldn’t quickly address critical customer issues. We needed to transition our client’s tabular models to a scalable solution with improved processing. Second, we needed to introduce a custom item category mapping. With our client’s existing solution, individual products were selected for every metric. As a result, the likelihood of data discrepancies was high. Third, we needed to consolidate various data points (such as revenue, inventory, and labor).

To reduce the overall data processing time, we analyzed our client’s system. We identified the scope of optimization in data staging, data warehouse processing, and cube processing. We then removed redundant metrics from Power BI reports. After removing the redundant metrics, we formalized approaches to solve other parallel issues and monitored their effectiveness. Our systematic testing reduced overall processing time significantly.

To introduce custom item category mapping, we used an Excel configuration. We expanded the configured categories at the product level. By expanding the configured categories, we began to automatically track data related to new products, eliminating the need for manual input.

To consolidate the disparate data points, we reduced the number of data sources. Staging all the data at one source ensured our client’s data pipeline was scalable even as new streams were added.

Integrated Platform Improves Store Performance

By the conclusion of our project, we had integrated a data forecasting and sales projection platform with a store analysis platform. The integrated platform automated critical sales manager activities. Sales managers now spend less time gathering insights and more time serving customers. The integrated platform also greatly improved store performance: it reduced refresh times, increased report reliability, automated manual tasks, and improved scalability. Sales managers now have access to reliable data, enabling them to immediately address sales and budgetary concerns.

Wednesday, July 10, 2019

Case Study: Improve Power BI Premium Report Performance and Eliminate Data Refresh Failures



Key Challenges

   Improve report performance.
   Eliminate data refresh failures.
   Improve architecture design.

Achieving More with Power BI Premium

Our client, an internationally recognized humanitarian organization, needed to improve report load times and data refreshes. One critical report was not loading quickly, and data refreshes were not working properly. We worked with our client’s IT department and business users to better utilize Power BI Premium capabilities.

Power BI Premium provides dedicated capacity, dedicated capacity nodes, and viewing access without the need for additional user licensing. Dedicated capacity allows users to avoid the published restrictions of Power BI. Dedicated capacity nodes ensure provisioned computing power is customized to client needs. Viewing access without Power BI user licenses allows business users to view customized reports without incurring additional costs. 

Improving Report Performance

To improve report load time, we first analyzed our client’s report. We identified the report elements responsible for the slow load time. We then provided our client with recommendations to improve report performance. First, we removed the dates from dimension date tables for which data was not present in fact tables. We also removed all unnecessary filter interactions. Finally, we turned off the auto date/time feature, which reduced the database size by 40%. The smaller database significantly improved the report’s performance.

Eliminating Data Refresh Failures

To eliminate data refresh failures, we first investigated our client’s data refresh schedule. We discovered the data refresh failures were due to parallel refreshes and a high number of queries. To separate the parallel refreshes and improve data refresh times, we changed the refresh schedule.
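
The case study does not say how the new schedule was implemented. One common way to stagger refreshes, shown here only as an illustration, is to trigger them sequentially through the Power BI REST API; the workspace and dataset IDs and the access token below are placeholders.

```python
import time
import requests

# Placeholders: real IDs come from the tenant, and the token from Azure AD.
ACCESS_TOKEN = "<azure-ad-access-token>"
WORKSPACE_ID = "<workspace-guid>"
DATASET_IDS = ["<dataset-guid-1>", "<dataset-guid-2>", "<dataset-guid-3>"]
STAGGER_SECONDS = 30 * 60  # leave 30 minutes between refreshes so they never overlap

def trigger_refresh(workspace_id: str, dataset_id: str) -> None:
    """Ask the Power BI service to start a refresh for one dataset."""
    url = (f"https://api.powerbi.com/v1.0/myorg/groups/{workspace_id}"
           f"/datasets/{dataset_id}/refreshes")
    response = requests.post(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
    response.raise_for_status()

if __name__ == "__main__":
    # Refresh the datasets one after another instead of in parallel.
    for dataset_id in DATASET_IDS:
        trigger_refresh(WORKSPACE_ID, dataset_id)
        time.sleep(STAGGER_SECONDS)
```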

Improving Architecture Design

As we investigated the data refresh failures, we also discovered the size of the database was partly responsible for the failures. Our client’s database was sizable, and refreshing the database required a substantial amount of our client’s capacity. With our client’s permission, we scaled out our client’s architecture, increasing capacity. We also improved gateway performance and provided gateway troubleshooting documents.

Understanding Power BI Premium

As the engagement progressed, our client turned to us for additional guidance about how they could better utilize Power BI Premium. To provide our client’s business users with a technical understanding of Power BI Premium, we provided a walk-through of the following capacity metrics:

   System Usage
   Refreshes
   Evictions
   Query Duration
   Query Waits
   Model Size

Our client’s improved understanding of capacity metrics enabled further learning opportunities. Following the walk-through, we explained how to use capacity metrics to move databases across capacities for load balancing. We also shared and explained capacity planning suggestions. To ensure our capacity plan suggestions met client needs, we analyzed our client’s Power BI audit logs. The audit logs allowed us to identify the reports accessed during peaks in capacity metrics. We then discussed the findings of the audit logs to ensure our client understood the significance of their usage patterns. The engagement ended when our client was confident they could maximize performance using the recommended capacity.

Concluding the Engagement

The Power BI Premium engagement was a great success. Through the engagement, our client reduced the page load time of a critical report and eliminated data refresh failures. More importantly, our client’s business users gained a technical understanding of Power BI Premium. The business users’ improved understanding of capacity metrics allowed them to approve Premium decisions with confidence.

Monday, July 8, 2019

Case Study: Over 50% Faster Load Time with Power BI Premium Optimization



Key Challenges

   Optimize report design.
   Load test Power BI reports.
   Implement DAX optimizations.

Leveraging Power BI Premium

As part of our Power BI Premium engagements, we work with our clients’ IT departments to better leverage Power BI Premium capabilities. We optimize report design, provide modeling suggestions, implement DAX optimizations, review Premium configuration, review architecture design, and provide recommendations on capacity planning.

Reducing Page Load Times

Our client, a global provider of water, hygiene, and energy services, needed to improve their Power BI Premium performance. Before we optimized their report design, the default Any Measure view required 10 seconds to load. Drilldown also required significant amounts of time. Drilling down to the Any Measure subarea required 15 seconds. Drilling down to the Sold To view required one minute. Drilling down to the Program view required two and a half minutes, and the Solution view required four and a half minutes.

To reduce page load times, we optimized our client’s report design. We removed redundant visuals from pages. We replaced the custom visuals with out-of-box visuals. We also implemented drill through and drilldown options. To eliminate unnecessary interactions between visuals, we removed interactions between slicers. We used bookmarks to hide visuals during the default load. Lastly, we used the New Filter Experience. New Filter Experience moved multi-select slicers to filters and enabled the apply filters option.

Our report design optimizations substantially reduced page load times. Following the optimizations, the Any Measure default view loaded instantly. Drilldown load times also improved. Overall, our report design optimizations reduced page load times by 57%.

Stress Testing Power BI Reports

Our Power BI Premium engagement did not end with report design optimizations. To improve data modeling, we removed the bidirectional filter. We also implemented DAX optimizations for variable caching and the CALCULATE and IF functions. Finally, we performed load testing using a PowerShell script and Visual Studio.

Load testing our client’s Power BI reports presented a unique challenge. Because powerbi.com is load balanced, we could not directly stress test our client’s Power BI reports. To evaluate how the reports would perform under stress, we load tested our client’s backend. We fired 1,500 queries at the backend server. The test provided our client with a general understanding of report performance.
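
The engagement itself used a PowerShell script and Visual Studio for the load test. The Python sketch below conveys the same idea, firing a large batch of concurrent queries at a backend endpoint and summarizing response times; the endpoint, payload, and concurrency settings are assumptions.

```python
import concurrent.futures
import time
import requests

# Assumptions: the backend endpoint, payload, and concurrency level are placeholders.
BACKEND_URL = "https://backend.example.com/api/query"
QUERY_PAYLOAD = {"query": "SELECT ..."}
TOTAL_QUERIES = 1500
CONCURRENCY = 50

def fire_query(_: int) -> float:
    """Send one query and return its response time in seconds."""
    start = time.perf_counter()
    response = requests.post(BACKEND_URL, json=QUERY_PAYLOAD, timeout=120)
    response.raise_for_status()
    return time.perf_counter() - start

if __name__ == "__main__":
    with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        durations = sorted(pool.map(fire_query, range(TOTAL_QUERIES)))
    print(f"median: {durations[len(durations) // 2]:.2f}s, "
          f"p95: {durations[int(len(durations) * 0.95)]:.2f}s")
```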

Results of a Successful Engagement

Our Power BI Premium engagement resulted in multiple benefits for our client. Our report design optimizations reduced page load times by 57%. The backend load test demonstrated how Power BI reports performed under stress. The entire engagement provided our client’s IT department with further insight into the capabilities of Power BI Premium.

Wednesday, July 3, 2019

Case Study: Azure Real-Time Reporting Improves Managed Partner Performance



Key Challenges

   Reduce data refresh time.
   Reduce partner verification time.
   Improve managed partner performance.

Managed Partners Drive Need for Real-Time Reporting

We recently transferred a large software supplier’s partner sales attribution system to the cloud. As part of the transition to the cloud, we implemented Azure real-time reporting. Real-time reporting improved managed partner efficiency and powered a seller dashboard.

Our client derives over 90% of its business revenue from partners. Managed partners form less than 10% of our client’s partner network but account for an outsized share of that revenue. Before we implemented real-time reporting, 24-hour data refresh cycles slowed business decision-making and reduced sales opportunities. Partner status updates also required a 24-hour cycle, yet managed partners needed the ability to verify updates immediately. To improve opportunities for managed partners, our client needed real-time reporting.

Improving Data Processing Times

Our client’s original managed partner setup relied on on-premise servers. The servers required long refresh cycles. The system processed data from a CRM portal into an Azure SQL database. Managed partners viewed the processed data in Power BI. The entire process required roughly 11 hours.

The original seller dashboard stored sales data using Azure Data Lake Storage. The system pulled the data from on-premise servers and processed data into a unified format. The system then published the data to the Power BI seller dashboard. The entire process required 40-45 minutes each day.

Figure 1: Managed partners on-premise reporting

To achieve real-time reporting, we moved our client’s managed partner data to the cloud. With the cloud implementation, the reporting process now requires only two minutes. In the cloud reporting implementation, upstream data flows from the CRM portal to a staging layer. An Azure Data Factory (ADF) pipeline continuously checks for new data. If the pipeline detects a change, the changed data is compared to the entire data set. The changed data is then posted from an Azure SQL Database to Power BI reports using DirectQuery, which decreases the time required to load data onto the managed partners’ Power BI reports.
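
In production, Azure Data Factory orchestrates this change detection. As a rough approximation of the comparison step only, the Python sketch below diff-checks a staging load against the published table; the connection string, table names, and key columns are hypothetical.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string, table names, and key columns.
engine = create_engine(
    "mssql+pyodbc://user:password@azure-sql-server/partnerdb"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

def detect_changes() -> pd.DataFrame:
    """Return only the staged rows that are new or differ from the published set."""
    staged = pd.read_sql("SELECT * FROM staging.partner_updates", engine)
    published = pd.read_sql("SELECT * FROM dbo.partner_status", engine)

    merged = staged.merge(published, how="left", on="partner_id",
                          suffixes=("", "_published"), indicator=True)
    is_new = merged["_merge"] == "left_only"
    is_changed = (merged["_merge"] == "both") & (merged["status"] != merged["status_published"])
    return merged.loc[is_new | is_changed, staged.columns.tolist()]
```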

Moving the seller dashboard to the cloud reduced the data refresh and publish time by 66%. The seller dashboard cloud implementation pulls cloud-based data to an Azure cloud server. Machine learning algorithms run in Azure Databricks notebooks to process the data. Finally, the system publishes the data to the Power BI seller dashboard. The entire process requires 15 minutes. The cloud-based seller dashboard saves time and resources with cloud-to-cloud data transfers and Azure Databricks ML processing.
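
The notebooks and models used in the engagement are not described in the case study. The PySpark sketch below only illustrates the general shape of such a Databricks job, with placeholder storage paths, column names, and an arbitrary regression model standing in for the client’s algorithms.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

# Paths, columns, and the model choice are placeholders; on Databricks the
# `spark` session already exists, so getOrCreate() simply reuses it.
spark = SparkSession.builder.appName("seller-dashboard-scoring").getOrCreate()

sales = spark.read.parquet("abfss://sales@storageaccount.dfs.core.windows.net/daily/")

# Assemble a simple feature vector and fit an illustrative forecasting model.
assembler = VectorAssembler(inputCols=["units_sold", "discount", "traffic"],
                            outputCol="features")
training = assembler.transform(sales).select("features", "revenue")
model = LinearRegression(labelCol="revenue").fit(training)

# Score the feed and write the results where the Power BI dashboard reads them.
scored = model.transform(assembler.transform(sales)).drop("features")
scored.write.mode("overwrite").parquet(
    "abfss://curated@storageaccount.dfs.core.windows.net/seller_dashboard/")
```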

Figure 2: Managed partners real-time reporting

Benefits of Real-Time Reporting

Azure real-time reporting resulted in numerous benefits for our client. We reduced reporting time to two minutes for managed partners and 15 minutes for seller dashboards. As a result, our client can now provide real-time analysis to managed partners. We also improved processing speed. Certain code segments now run ten times faster. Improved processing speeds directly improved business performance. Managed partners now rely on up-to-date data, improving decision-making, ultimately leading to better sales opportunities.

Tuesday, July 2, 2019

Case Study: Reducing Costs by Automating Privacy Matters



Key Challenges

   Automatically direct privacy requests to the correct regional team.
   Automatically store privacy requests in chronological case files.
   Enable new team members to review stored privacy request records.

Protecting Private Information

With laws such as the GDPR, privacy matters are increasingly critical for large corporations. Companies with thousands of team members raise hundreds of privacy issues daily. For example, a privacy issue arises with a task as simple as listing a name and phone number on a public document. To avoid liability, privacy teams must evaluate any shared personal or company details. Small privacy teams (fewer than 20 people) must evaluate these requests efficiently at a large scale.

Annual audits are also a crucial matter for privacy teams. Auditors randomly select privacy cases from the previous year. The privacy team must provide details regarding their responses to the matters. It is crucial that privacy teams maintain chronological records of privacy requests and responses.

Automating with Dynamics 365 and SharePoint

We recently implemented an automated privacy tracking system for a large software company. Our client previously relied on an Outlook-based system. When team members had a privacy concern, they sent an email directly to the privacy team. A privacy team member received the email and manually added topic and privacy region (United States, Europe, Asia-Pacific, etc.) tags. The privacy team then manually grouped the emails into case files. The client’s Outlook-based system was cumbersome and inconsistent.

Our client asked us to automate privacy requests using Dynamics 365 and SharePoint. We agreed on several objectives at the start of the project:

   Simplify the request process by accepting requests via email and via an online portal.
   Improve efficiency by automatically directing privacy requests to the correct regional privacy team.
   Reduce workload by automatically generating case numbers for privacy requests.
   Ease team member onboarding by creating a chronological communications record for each privacy request.
   Improve annual audits by tracking privacy case resolutions.

Web Portal and Email Submission

After determining the privacy tracking system objectives, we began implementation. We designed two methods to submit requests: a web portal and an email submission system. First, we created the web portal. The web portal consists of a structured form that gathers all the information the privacy team requires. The form requires details including:

   Requester name
   The applicable regional privacy team
   Users to copy on the request
   Request title
   Request description
   Attachments

The form’s structured interface lets users easily include all the applicable information for the privacy team. When a user submits the form, Dynamics 365 business logic determines the appropriate team to direct the request to. At the same time, the system creates a support ticket and case number. The system stores attachments and supporting documents in a correlating SharePoint record. The system tracks all privacy team replies and subsequent communications in Dynamics 365 automatically.

The privacy web portal also allows users to tag a request as a “privacy incident.” A privacy incident is a matter that requires immediate attention, such as a privacy breach. The privacy team prioritizes privacy incidents above all other matters.

Handling privacy requests by email is more complex. Email does not allow the same customized structure as a web form. Our team compensated for the lack of structure by aligning email fields with the web form structure. The “From” line indicates the requester name. The “To” line indicates the applicable regional privacy team. The “CC” line indicates who to copy on the privacy request ticket. The “Subject” line indicates the request title. And the email body indicates the request description. Just as with the web form, Dynamics 365 business logic determines where to direct the submitted email request.

Users can include attachments with the email request, which the system tracks in a correlating SharePoint record. The email requests also generate a support ticket and case number, similarly to the web form requests.
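
The production routing is implemented as Dynamics 365 business logic. The Python sketch below only illustrates how the email headers map onto the same structure the web form collects, using hypothetical regional mailboxes and a fallback triage queue.

```python
from dataclasses import dataclass, field
from email.message import EmailMessage

# Hypothetical mapping from regional mailbox to privacy team queue.
REGIONAL_QUEUES = {
    "privacy-us@example.com": "United States Privacy Team",
    "privacy-eu@example.com": "Europe Privacy Team",
    "privacy-apac@example.com": "Asia-Pacific Privacy Team",
}

@dataclass
class PrivacyRequest:
    requester: str
    regional_team: str
    copied_users: list[str]
    title: str
    description: str
    attachments: list[str] = field(default_factory=list)

def request_from_email(message: EmailMessage) -> PrivacyRequest:
    """Map email headers onto the same structure the web form collects."""
    to_address = str(message["To"]).strip().lower()
    body = message.get_body(preferencelist=("plain",))
    return PrivacyRequest(
        requester=str(message["From"]),
        regional_team=REGIONAL_QUEUES.get(to_address, "Triage Queue"),
        copied_users=[a.strip() for a in str(message["Cc"] or "").split(",") if a.strip()],
        title=str(message["Subject"]),
        description=body.get_content() if body is not None else "",
        attachments=[part.get_filename() for part in message.iter_attachments()],
    )
```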

Immediate Automation Benefits

The privacy request system yielded three immediate benefits. First, chronologically logged case files eased audit concerns. Each case file now shows all the information related to the matter and how the team resolved it. The teams no longer worry about manually sorting information. Second, new privacy team members can get up to speed on cases immediately. With all the case information automatically organized, team members pick up existing case files with ease. Finally, effort is reduced because team members no longer need to manually tag incoming privacy requests. The privacy request system’s structured forms always route requests to the correct team.

Additional benefits include:

   Automatic responses and notifications for ticket creation and closure.
   Secure access to content from mobile devices.
   Centralized deployment and upgrades.
   Support ticket statistic tracking with Power BI.
   Automatic SLA tracking for privacy request cases.

Teams began using the automated privacy request system immediately. Within two weeks of implementation, teams submitted over 800 privacy request tickets. The system continues to improve efficiency as the client rolls it out to more teams.

Friday, June 28, 2019

Case Study: Migrating a Partner Sales Attribution System to the Cloud



Key Challenges

   Improve data processing time.
   Increase scalability and on-demand availability.
   Improve infrastructure management.

Moving to the Cloud

We recently implemented a partner sales attribution system for a large software supplier. The partner sales attribution system successfully accommodated over 20,000 daily visits (3,000 users daily) and consolidated data from over 50 sources. But the system faced challenges in four areas:

   Slow data processes and refreshes.
   Poor server performance.
   Inefficient infrastructure management.
   Inefficient infrastructure patches.

To overcome the challenges, we decided to transition the sales attribution system to the cloud.

Creating the Cloud Architecture

To solve the four challenges, we studied cloud assets and executed a series of proofs of concept. We explored Azure Data Lake Storage, Azure Data Lake Analytics, Azure Analysis Services, Azure Data Warehouse, and Azure SQL Database. We conducted the proofs of concept to evaluate the effectiveness of the assets for our requirements. We benchmarked the results against the existing on-premise setup. We then conducted a cost-benefit analysis of the proposed cloud migration.


Following the proof of concept stage, we began the platform migration. We staged the data in Azure Data Lake Storage (ADLS) using Azure Data Factory. We processed the data using Azure Databricks. The data then moved to Azure Data Warehouse (ADW) for downstream users. We applied internal processing requirements with security-based access principles to the dataset. We also implemented a just-in-time (JIT) processing framework to process tabular models on top of ADLS and ADW. The framework enabled the independent integration, processing, and publishing of the assets to the reporting layer. The reporting (mart and tabular) landscape enabled data scientist testing and catered to Power User requirements via Power BI and Excel. Users could also export data automatically to PowerPoint presentations, Outlook, and Excel reports.

Our cloud architecture saved money and surpassed client expectations. The distributed processing power of the cloud enabled faster processing. The serverless architecture eliminated server dependency among the streams. The cloud’s serverless, scalable, and distributed architecture reduced infrastructure management costs. The distributed architecture also simplified infrastructure patching.

Migration Benefits

Moving to the cloud resulted in four advantages:

   Reduced data latency.
   Reduced maintenance, execution, and mounting costs.
   Near real-time reporting.
   Comprehensive cloud analytics.

Reduced Data Latency: Our client critically needed reduced data latency. To respond to changing market conditions, our client needed fresh data. Low data latency allows organizations to execute strategic business decisions more quickly. Low data latency also allows employees to generate ad hoc reports with up-to-date data.

Our client’s partner sales attribution system serves over 60,000 partners every day. 35 terabytes of data pass through the system each day. With the on-premise sales attribution system, data required anywhere from 18 to 26 hours to update. By introducing three major technical improvements—a master pipeline architecture, JIT processing framework, and in-house Azure Cloud best practices—data now requires only 4 to 12 hours to update. The master pipeline architecture enables modular and independent data refreshes. The JIT processing framework improves data processing speed and latency. The in-house Azure Cloud best practices refined the architecture and implementation. The best practices also improved the DataOps team’s monitoring and debugging capabilities.

Optimized Infrastructure: Moving to the cloud resulted in significant cost savings. The client’s former on-premise system incurred significant maintenance, mounting, and execution costs. The new cloud architecture adopted a pay-as-you-go structure. Instead of provisioning unnecessary resources, we implemented an efficient architecture using minimum resources. Our client reduced infrastructure costs by 15% and improved data latency by 60%.



Real-Time Reporting: Cloud assets allowed our client to selectively implement near real-time reporting for time-sensitive reports. These reports included a suite of sales performance reports. The sales performance reports provided our client’s executive team with an up-to-the-minute understanding of sales performance across the company’s major sales divisions.

Comprehensive Cloud Analytics: Currently, we are exploring analytics options for the partner sales attribution system. Cloud analytics offers our client three advantages. First, cloud analytics facilitates strategic business decisions. Cloud analytics neatly categorizes vast quantities of data, which is then available to users via their web browsers. Users drive their businesses using insights derived from the data. Second, cloud analytics provides a landscape for data scientists to derive additional insights. Cloud assets already deliver data, but cloud analytics enables data scientists to exhaustively test their own hypotheses. Third, cloud analytics enables platform-independent support for BI tools. Cloud analytics supports Power BI, Excel, and power users.

The cloud migration of our client’s partner sales attribution system concluded with significant gains for our client. Refresh cycles and data availability efficiency improved 2.3 times. Data consistency benefited from the newly created single source of truth for all the reporting layers. Near real-time data achieved a 15 to 20-minute publishing cycle for referral data. Infrastructure costs fell after automating maintenance, security compliance, alert monitoring, and updates.

Thursday, June 27, 2019

Case Study: Data Validation Framework Reduces Support Tickets



Key Challenges

   Improve data quality.
   Reduce number of support tickets.
   Automate data validation and email alerts.

Validating Data

When using large amounts of data from multiple sources, data discrepancies and data loss can occur. Data discrepancies and data loss decrease the effectiveness and accuracy of data systems. To combat data discrepancies and data loss, users can submit support tickets for mismatched or missing data. Resolving support tickets for mismatched and missing data requires significant time.

Our client needed to reduce the number of support tickets regarding mismatched or missing data for their partner sales attribution system. Issues regarding mismatched or missing data accounted for over 20% of our client’s support tickets. With their previous solution, our client validated data during the last stage of their architecture. Our client validated the data using build verification tests (BVTs) and manual data validation. Because data validation only occurred during the last stage of the architecture, data problems from early stages compounded throughout the entire system. Our solution needed to improve the data validation process.

Our Approach

We implemented data validation mechanisms at each stage of the architecture, including staging, the mart, cube comparison, and so on. Validating the data at each stage ensured developers could track anomalies, and tracking anomalies enabled earlier corrective action. We anticipated that earlier corrective action would lead to fewer support tickets.
  
To validate the data, we import data from multiple repositories into our system. The system processes the data using three parameters: BVTs, KPIs, and row counts.

Our data validation approach using BVTs includes four steps. First, we gather a list of all test cases from the corresponding database. Second, we run the test case queries using multi-threading. Then, we compile the results into a single table. Finally, we send the results to the developers in an alert email.
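
The sketch below outlines those four steps in Python. The test-case repository, connection details, and mail server are placeholders, and it assumes each BVT query is written to return zero rows when the check passes.

```python
import smtplib
from concurrent.futures import ThreadPoolExecutor
from email.message import EmailMessage

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection details, table names, and mail settings.
engine = create_engine(
    "mssql+pyodbc://user:password@validation-server/bvt"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

def run_test_case(test_case: pd.Series) -> dict:
    """Execute one BVT query; each query is written to return zero rows on success."""
    failures = pd.read_sql(test_case["query_text"], engine)
    return {"test_case": test_case["test_name"], "passed": failures.empty,
            "failing_rows": len(failures)}

def run_all_bvts() -> pd.DataFrame:
    test_cases = pd.read_sql("SELECT test_name, query_text FROM dbo.bvt_test_cases", engine)
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(run_test_case, (row for _, row in test_cases.iterrows())))
    return pd.DataFrame(results)

def email_results(results: pd.DataFrame) -> None:
    message = EmailMessage()
    message["Subject"] = "Daily BVT results"
    message["From"] = "dataops@example.com"
    message["To"] = "developers@example.com"
    message.set_content(results.to_string(index=False))
    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(message)

if __name__ == "__main__":
    email_results(run_all_bvts())
```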

To validate row counts, we use a different approach. We fix a threshold value for the number of rows. The threshold value ensures the number of rows does not change dramatically during data processing. The threshold value is dynamic: the process generates different threshold values for different tables depending on historical trends and the amount of data. For example, tables with fewer than 1,000 rows cannot exceed a five percent threshold, while tables with more than 1,000,000 rows cannot exceed a one percent threshold.
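
A simplified version of the threshold check, using only the two tiers quoted above plus an assumed middle tier, might look like this:

```python
def row_count_threshold(row_count: int) -> float:
    """Pick a tolerance for row count changes based on table size. The two outer
    tiers follow the examples in the text; the middle tier is an assumption."""
    if row_count < 1_000:
        return 0.05   # small tables may move up to five percent
    if row_count > 1_000_000:
        return 0.01   # very large tables may move only one percent
    return 0.03       # assumed tolerance for mid-size tables

def validate_row_count(table: str, previous: int, current: int) -> bool:
    """Return True when the change stays inside the table's dynamic threshold."""
    if previous == 0:
        return current == 0
    change = abs(current - previous) / previous
    allowed = row_count_threshold(previous)
    if change > allowed:
        print(f"{table}: row count moved {change:.1%}, above the {allowed:.0%} threshold")
        return False
    return True

# Example: a 50,000-row table that grew by 2% passes the assumed 3% tolerance.
print(validate_row_count("fact_sales", 50_000, 51_000))  # True
```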

Validating data against project KPIs required another distinct approach. At the beginning of the project, we defined KPIs with our client to measure the project’s success. We execute the KPIs at multiple stages of the architecture and validate the data against them, ensuring the data remains aligned with the project’s goals.

Results: Improved Reporting Time, Better Data Quality, and Reduced Support Tickets

The data validation project concluded with great results. Previously, one of the streams required 24 hours to publish reports after receiving upstream data. Now, the stream publishes reports in only six hours—four times faster! Other streams saw similarly impressive results. In addition, data quality has significantly improved. The total percentage of user tickets raised for mismatched or missing data has declined from 20% to 15%.