November 18, 2024

Enabling self-service for supply chain analytics with Microsoft Fabric and Direct Lake

Effective supply chain analytics is critical to business success. For retail companies especially, an efficient supply chain helps meet customer demand, optimize inventory, and ensure timely product delivery. It also directly impacts cost management, operational efficiency, and customer satisfaction. A well-managed supply chain enables faster response to market trends, helping companies stay competitive and adaptable in a dynamic retail landscape.

Many technologies enable advanced supply chain analytics, each offering unique capabilities in data integration, processing, and real-time insights. Among these, Microsoft Fabric stands out for its powerful data-sharing capabilities, high-performance computing, and seamless integration with BI tools, all of which make it valuable for optimizing and scaling supply chain operations.


 

About the scenario

A grocery retailer undergoing digital transformation aimed to empower its supply chain team with data-driven decision-making. The company selected Microsoft Fabric, a robust data platform designed to deliver efficient self-service analytics. Using Microsoft Fabric’s capabilities and AI (artificial intelligence) tools like Microsoft Copilot, the retailer gained faster data access and additional scalability. This approach fosters operational efficiency, providing a low-maintenance solution that delivers reliable insights and supports seamless data processing.


Key business requirements

The migration had to meet the following key requirements:

·        Self-service analytics: Empowers non-technical users to generate insights.

·        Comprehensive data coverage: Includes data from the last three years at highly granular levels (~5 TB).

·        User support: Accommodates 250+ users with tailored data access and permissions for different roles.

·        Generative AI integration (Copilot): Simplifies insight discovery with tools like Microsoft Copilot.

·        Scalability and performance: Scales to meet increasing demands without compromising performance.

·        Security and data access control: Enforces role-based access to protect sensitive data.

·        Integration with multiple data sources: Unifies data from over 20 sources for comprehensive analysis.



Unfolding the solution

Choosing Direct Lake

Selecting the right storage mode was essential for self-service analytics. Direct Lake was selected because it provides high-speed, real-time data access from Delta tables in Microsoft OneLake. This choice eliminated data imports and frequent refreshes, giving analysts instant access to large datasets. Direct Lake’s unrestricted dataset size in Microsoft Fabric SKUs (stock keeping units) supported anticipated data growth. Integration with Power BI, Excel, and other tools was another benefit, offering immediate insights.
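To make the refresh-free access concrete, below is a minimal sketch of querying a Direct Lake semantic model from a Microsoft Fabric notebook using the semantic-link (sempy) library. The model, measure, and column names are hypothetical placeholders rather than the client's actual model.

```python
# Minimal sketch: query a Direct Lake semantic model from a Fabric notebook
# with semantic-link (sempy). Names below are hypothetical placeholders.
import sempy.fabric as fabric

# List semantic models visible in the current workspace
print(fabric.list_datasets().head())

# Evaluate a DAX measure grouped by a column. Because Direct Lake reads the
# Delta tables in OneLake directly, no import refresh is needed first.
result = fabric.evaluate_measure(
    dataset="Supply Chain Semantic Model",  # hypothetical model name
    measure="Total Inventory",              # hypothetical measure
    groupby_columns=["Store[Region]"],      # hypothetical table/column
)
print(result.head())
```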


Conducting proof of concepts (POCs)

A proof of concept (POC) evaluated the feasibility of migrating the client to Microsoft Fabric’s Direct Lake model. Essential elements such as calculated columns, static tables, row-level security (RLS), and time intelligence were replicated. This phase involved mapping approximately 180 relationships and defining around 400 measures. Azure Databricks (ADB) notebooks handled calculated tables and columns, while Azure Data Lake Storage (ADLS) shortcuts were used to build the Lakehouse. Direct Lake semantic models supported the Power BI reports. As part of this implementation, we also evaluated security, time intelligence, and refresh latency, and conducted a load test with 50+ concurrent users.

The POC successfully met all requirements and passed all tests.
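One building block worth showing is the shortcut step: the hedged sketch below creates an ADLS Gen2 shortcut in a Lakehouse through the OneLake shortcuts REST API, which lets Direct Lake read external Delta data without copying it. All GUIDs, the storage location, and the token are placeholders; verify the payload shape against the Fabric REST API reference.

```python
# Hedged sketch: create an ADLS Gen2 shortcut in a Fabric Lakehouse via the
# OneLake shortcuts REST API. Every ID, URL, and token below is a placeholder.
import requests

WORKSPACE_ID = "<workspace-guid>"
LAKEHOUSE_ID = "<lakehouse-item-guid>"
TOKEN = "<aad-access-token>"  # e.g., acquired via MSAL or workspace identity

url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{LAKEHOUSE_ID}/shortcuts"
)
payload = {
    "path": "Tables",        # create the shortcut under the Tables folder
    "name": "SalesHistory",  # hypothetical shortcut name
    "target": {
        "adlsGen2": {
            "location": "https://<account>.dfs.core.windows.net",  # placeholder
            "subpath": "/supplychain/sales",                        # placeholder
            "connectionId": "<connection-guid>",                    # placeholder
        }
    },
}
resp = requests.post(url, json=payload,
                     headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()
print("Shortcut created:", resp.json())
```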

Figure 1: Architecture


Solution design: A lake-centric approach

Following the POC’s success, a solution was built using a lake-centric architecture, centered on Microsoft Fabric and Direct Lake. Key components included:

1.     Capacity planning and SKU selection: The F256 SKU was chosen to support three years of data while meeting performance needs. Direct Lake’s lack of dataset restrictions made it more cost-effective than import mode, which would require a higher SKU.

2.     Improved data access with self-service and AI integration: The solution enabled users to access insights and information faster with Smart Narratives and Copilot chat. Users are now eager to start using AI Skills.

3.     Automated operations and cost efficiency: Azure DevOps CI/CD ensured reliable deployments, while Microsoft Fabric notebooks automated dataset caching. Scheduling optimized Microsoft Fabric capacity, scaling up to F256 during business hours and down to F64 during off-peak times (a scaling sketch follows this list).

4.     Data storage: Azure Data Lake Storage Gen2 serves as the primary data store. ADLS shortcuts were chosen for their high performance and fast processing times.

5.     Semantic model mode: Direct Lake mode was selected for the semantic model. This is because Direct Lake mode supports scalability without dataset restrictions, accommodating future data growth needs.

6.     Security and access control: Using 'workspace identity' for ADLS shortcuts eliminated key management, while RLS provided fine-grained control over data access.

7.     Performance optimization: Pre-warming techniques reduced first-load times. Testing demonstrated improved performance with ADLS shortcuts over managed tables. 
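To show what the scheduled scaling from item 3 can look like, here is a hedged sketch that resizes a Fabric capacity through the Azure Resource Manager API. The subscription, resource group, capacity name, and api-version are assumptions to verify against current Azure documentation.

```python
# Hedged sketch: resize a Fabric capacity between F256 (business hours) and
# F64 (off-peak) via Azure Resource Manager. All identifiers are placeholders.
import requests

SUBSCRIPTION = "<subscription-guid>"
RESOURCE_GROUP = "<resource-group>"
CAPACITY = "<capacity-name>"
TOKEN = "<arm-access-token>"  # e.g., obtained with azure-identity

def scale_capacity(sku_name: str) -> None:
    """PATCH the capacity resource to the given SKU, e.g. 'F64' or 'F256'."""
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.Fabric/capacities/{CAPACITY}"
        "?api-version=2023-11-01"  # assumed version; check the ARM reference
    )
    body = {"sku": {"name": sku_name, "tier": "Fabric"}}
    resp = requests.patch(url, json=body,
                          headers={"Authorization": f"Bearer {TOKEN}"})
    resp.raise_for_status()

# Example: a scheduler (e.g., an off-peak pipeline run) scales down at night
scale_capacity("F64")
```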



Solution impact 

This solution enables both technical and non-technical users to make data-driven decisions with ease. Users access operational data through Excel, Power BI, and Copilot’s chat-based interface, building models for demand forecasting, inventory optimization, and delivery route planning. Users can work with granular data efficiently, experiencing reduced refresh latency and faster load times. The solution eliminated manual tasks, enabling seamless work with live data. All in all, the solution boosts supply chain responsiveness and agility.



Interested in learning more?

As a Microsoft Fabric Featured Partner, MAQ Software brings deep expertise in helping organizations unlock the full potential of Microsoft Fabric. Whether you're looking for guidance on implementing data solutions or optimizing your existing platform, we’re here to support you every step of the way. Contact us at Sales@MAQSoftware.com or explore our apps or consulting services on Microsoft Azure Marketplace:

Real-Time Intelligence in Microsoft Fabric for ISVs

Access to accurate, real-time data drives better decisions and provides a competitive advantage, especially in today’s dynamic business environment. Microsoft Fabric, with its Real-Time Intelligence service, offers a solution that processes, analyzes, and acts on data as it flows, delivering actionable insights instantly. It is transforming how businesses manage data by simplifying architectures, improving scalability, and greatly reducing costs.

In this case study, we explore how Real-Time Intelligence within Microsoft Fabric transformed a client’s data ecosystem in the construction industry. It overcame challenges related to performance, cost, and scalability. By transitioning to Microsoft Fabric’s unified platform, the client achieved faster decision-making and greater efficiency. They now have a scalable solution for future growth, demonstrating the power of Microsoft Fabric’s Real-Time Intelligence.


 

About the client

The client, an independent software vendor (ISV), provides solutions for every stage of the construction project lifecycle. This includes everything from the design stage to post-construction analysis. Known for bridging the gap between physical and digital workflows, the client relies on on-premises and cloud-based technologies. For their customers, having immediate, accurate data is critical to ensuring that projects remain on schedule and within budget. However, their existing system was fragmented, costly, and lacked scalability.



Overview of existing data model

The client’s existing system was a multi-tenant solution comprising:

·        A SQL Server that captures all updates in real time.

·        Debezium (Change Data Capture) and Kafka for streaming real-time changes from the SQL Server database.

·        Snowflake, which ingests data from Kafka and implements the business logic and data processing.

·        Power BI, configured with DirectQuery on the reporting view layer in Snowflake, enabling data reporting capabilities.

While functional, this setup created inefficiencies that impacted the client’s ability to deliver fast and cost-effective solutions to customers.

Figure 1: Existing architecture



The challenges

The existing system faced several operational, scalability, and performance challenges:

·        High total cost of ownership (TCO)

o   Platform and server maintenance: The multi-tenant SQL Server + Debezium + Kafka setup was replicated for each customer. This resulted in high operational and ownership costs for platform maintenance, updating, and managing these separate services.

o   License expenses: Separate licenses for SQL Server, Debezium (virtual machine hosting), and Kafka were also required for each customer. Power BI Pro licenses were required for every user, increasing costs further.

·        Slow performance and data latency

o   Delayed insights: Processing an average of 180K–200K records per minute caused increasing latency in real-time updates. This delayed access to critical data for leadership and decision-makers. Peak record volumes worsened this issue, increasing delays and costs.

o   Report latency: The complexity of business logic and growing data volumes degraded the performance of DirectQuery (DQ) reporting.

·        Data processing

o   Delta merging challenges: Integrating the latest streamed records (delta) with historical data for analytical purposes was not performing as expected.

o   Managing DML operations: While updating records in Snowflake, certain INSERTs and DELETEs failed to execute, potentially due to the high data volume.

·        Need for a unified data hub

o   Isolated real-time streaming: The real-time streaming implementation operated in silos and was not available as a unified data hub accessible to all teams.

o   Dependency complexities: The system consisted of numerous scripts and tables for data orchestration and business logic, which made debugging difficult due to interdependencies across layers.



Key business requirements

To address current limitations and support future growth, the client needed the solution to meet the following key requirements:

·    High throughput requirement: The proposed solution should be scalable enough to handle around 5,000 records per second.

·    Report page load times: Page load time for reports should be minimized to 5–8 seconds, with up to 10 seconds for drill-down views.

·    Resilient processing: An error-handling mechanism is needed to simplify debugging and allow processes to resume smoothly after failures.

·    Role-based access control: Row-level security (RLS) needs to be applied to Power BI reports at the enterprise level to ensure secure data access.

·    Cost-efficient solution: The proposed system should incur lower costs compared to the existing Snowflake setup.



The core of our solution: Microsoft Fabric Real-Time Intelligence

To address these challenges, the client migrated to Microsoft Fabric. They used its Real-Time Intelligence service to unify and optimize their data pipeline. With Microsoft Fabric’s SQL Server Change Data Capture (CDC) connector and Eventhouse, the client achieved seamless real-time data processing.

This integration greatly increased data processing speeds, reduced operational complexity, and lowered costs. The Real-Time Intelligence service within Microsoft Fabric provided unparalleled real-time insights. It empowered the client to make proactive decisions while ensuring scalability for future growth.



Diving into the details

Figure 2: Solution architecture


The solution was designed with key components to optimize performance:

·    High throughput management: Real-time ingestion with the SQL Server CDC connector eliminated the need for Debezium and Kafka. This significantly reduced TCO. (Step #1)

·    Real-time data streaming: Changes are streamed in real time through an eventstream, ensuring continuous data flow and timely updates. (Step #2)

·    Real-time analysis with Eventhouse (Kusto Query Language): Eventhouse uses Kusto Query Language (KQL) for fast and reliable real-time querying and business transformations, delivering immediate insights to end users (see the query sketch after this list). (Step #3)

·    Real-time alerting and notifications: Reflex provided alerts and notifications based on specified conditions in real-time data. (Step #4)

·    Scalability and ease of analytics: One-click data availability in OneLake supported analytical workloads. Direct Lake integration with Power BI enabled near-instant visualizations. (Step #5)

·    Unified platform: All data is now stored/processed in a unified environment, eliminating the need for separate Snowflake and Power BI licenses. (Step #6/7)
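To make the Eventhouse step concrete, the following hedged sketch runs a KQL query against an Eventhouse database from Python using the azure-kusto-data package. The query URI, database, table, and columns are hypothetical; an Eventhouse exposes its query URI in the Fabric portal.

```python
# Hedged sketch: query an Eventhouse KQL database with azure-kusto-data.
# The URI, database, table, and column names are hypothetical placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

QUERY_URI = "https://<eventhouse>.kusto.fabric.microsoft.com"  # placeholder
DATABASE = "ConstructionTelemetry"                             # hypothetical

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(QUERY_URI)
client = KustoClient(kcsb)

# Hypothetical KQL: summarize the last hour of streamed CDC records
query = """
ProjectEvents
| where ingestion_time() > ago(1h)
| summarize Records = count(), LastSeen = max(EventTime) by ProjectId
| top 10 by Records
"""
response = client.execute(DATABASE, query)
for row in response.primary_results[0]:
    print(row["ProjectId"], row["Records"], row["LastSeen"])
```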



Measurable business impact

Figure 3: Solution benefits

The migration to Microsoft Fabric delivered measurable improvements:

·    Data transformations

o   Real-time data transformations via Eventhouse: Processing time is reduced to ~1 minute after applying required business transformations.

o   Near real-time scenarios via OneLake shortcuts: Transformation and merging of real-time data with years of historical records are completed in as little as ~4 minutes.

·    Page load time in Power BI: Optimized to 7–9 seconds with Direct Lake access for summary and drill-down views, providing a faster user experience.



In conclusion

By consolidating systems into a single platform with Microsoft Fabric Real-Time Intelligence, the client achieved a transformative solution. They reduced costs, eliminated scalability limitations, and minimized data latency. Microsoft Fabric’s Real-Time Intelligence service empowered them to harness real-time processing and deliver faster insights. The unified platform also allows effortless customer onboarding and business scaling. All in all, this solution is well suited for ISVs with multi-tenant architectures and customer-specific application databases.



Interested in learning more?

As a Microsoft Fabric Featured Partner, MAQ Software brings deep expertise in helping organizations unlock the full potential of Microsoft Fabric. Whether you're looking for guidance on implementing data solutions or optimizing your existing platform, we’re here to support you every step of the way. Contact us at Sales@MAQSoftware.com or explore our apps or consulting services on Microsoft Azure Marketplace:

Transforming data operations with Microsoft Fabric and CI/CD for ISVs

In today’s fast-paced tech landscape, data drives decision-making and innovation. Migrating organizational data to Microsoft Fabric is a transformative move. Microsoft Fabric provides a robust platform that integrates seamlessly with existing data sources, creating a unified, scalable data environment. Using Microsoft Fabric brings multiple advantages, including unified Delta data format, data democratization, integrated AI (artificial intelligence), and simplified security management.

 


About CI/CD

Building software goes beyond writing code. Continuous Integration and Continuous Deployment (or Delivery)—known as CI/CD—form essential practices for modern tech stacks, providing a superior alternative to manual deployments. CI/CD enables a streamlined process, delivering faster, cleaner, and more reliable software releases.

Advantages of CI/CD

·        Accelerates release cycles and time-to-market: CI/CD automates integration, testing, and deployment, enabling rapid response to customer needs.

·        Ensures a consistent and reliable process: Automated testing and deployment improve software stability and consistency, reducing errors.

·        Scales to manage environments effectively: CI/CD simplifies managing multiple environments, allowing seamless scaling as projects grow.

·        Boosts developer productivity: By reducing manual tasks, CI/CD lets developers focus on coding, leading to faster development and more efficient teams.



How Microsoft Fabric enables CI/CD

·        Built-in Git integration: Sync Microsoft Fabric workspaces with a Git branch (Azure DevOps or GitHub) for streamlined version control and collaboration.

·        Microsoft Fabric deployment pipelines: Out-of-the-box deployment pipelines support 2 to 10 stages, enabling flexible deployment across multiple workspaces. Sync code forward or backward across stages to ensure consistency.

·        Microsoft Fabric REST APIs: Automate tasks using REST APIs (application programming interface) that support user accounts, managed identities, or service principals (SPNs) for seamless, secure operations.
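As a minimal illustration of the REST API route, the sketch below acquires a service principal token with MSAL and lists workspaces, a common smoke test before automating deployments. The tenant ID, client ID, and secret are placeholders.

```python
# Minimal sketch: authenticate as a service principal (SPN) with MSAL and
# call the Fabric REST API. Tenant/client IDs and the secret are placeholders.
import msal
import requests

TENANT_ID = "<tenant-guid>"
CLIENT_ID = "<spn-app-guid>"
CLIENT_SECRET = "<spn-secret>"  # prefer a certificate or Key Vault in practice

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(
    scopes=["https://api.fabric.microsoft.com/.default"]
)

# Smoke test: list the workspaces visible to the SPN
resp = requests.get(
    "https://api.fabric.microsoft.com/v1/workspaces",
    headers={"Authorization": f"Bearer {token['access_token']}"},
)
resp.raise_for_status()
for ws in resp.json().get("value", []):
    print(ws["displayName"], ws["id"])
```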



Implementation for ISVs

Promoting solutions and managing deployments for ISVs

ISVs (independent software vendors) have a few options to promote their solutions to customers:

·        Deploying multi-tenant applications within the ISV tenant for customer use.

·        Publishing Microsoft Fabric Apps.

·        Publishing Microsoft Fabric workloads.


In this case, an ISV deployed a multi-tenant application using Microsoft Fabric’s Platform as a Service (PaaS) capabilities, delivering Software as a Service (SaaS) to customers. Microsoft Fabric allows the ISV to provide reliable software, with CI/CD enabling smooth updates across hundreds of customer workspaces. Beyond version control, automated testing, and Infrastructure as Code (IaC), ISVs require:

·        Multiple environments and configurations.

·        Scalable pipelines.

·        Robust monitoring and observability.

·        Granular control over artifacts deployed to each customer workspace.


Microsoft Fabric REST APIs addressed these needs, supporting streamlined, controlled deployments.



Infrastructure as code

Treating workspace items as building blocks is essential to the solution. Each customer workspace is configured based on its data profile, optimizing resource use. Workspace provisioning uses a template and a configuration table stored centrally in a Lakehouse. The infrastructure team provisions workspaces through REST API calls, creating custom pools for customers and deploying environment objects as needed.
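Below is a hedged sketch of what this template-driven provisioning can look like with the Fabric REST API: read per-customer rows from the configuration table, create each workspace on its assigned capacity, and deploy a blank Lakehouse into it. The customer rows, names, and IDs are illustrative, and some item creations return long-running operations in practice.

```python
# Hedged sketch: template-driven workspace provisioning via the Fabric REST
# API. Customer rows, names, IDs, and the token are illustrative placeholders.
import requests

TOKEN = "<fabric-access-token>"  # e.g., an SPN token as in the earlier sketch
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
BASE = "https://api.fabric.microsoft.com/v1"

# Hypothetical rows pulled from the central configuration table
customers = [
    {"name": "CustomerA", "capacity_id": "<capacity-guid>"},
    {"name": "CustomerB", "capacity_id": "<capacity-guid>"},
]

for cfg in customers:
    # 1. Create the customer workspace on its assigned capacity
    ws = requests.post(
        f"{BASE}/workspaces",
        json={"displayName": f"{cfg['name']}-prod",
              "capacityId": cfg["capacity_id"]},
        headers=HEADERS,
    )
    ws.raise_for_status()
    ws_id = ws.json()["id"]

    # 2. Deploy a blank Lakehouse item into the new workspace
    lh = requests.post(
        f"{BASE}/workspaces/{ws_id}/items",
        json={"displayName": "CustomerLakehouse", "type": "Lakehouse"},
        headers=HEADERS,
    )
    lh.raise_for_status()
    print("Provisioned", cfg["name"], "->", ws_id)
```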



Code deployments

Figure 1: Architecture


The deployment approach follows a trunk-based branching strategy. For ISVs that require granular control over items, Microsoft Fabric REST APIs are the best way to manage deployments. The deployment process includes:


1.     Merging pull requests from feature branches to the main branch.

2.     Building, testing, and triggering a release pipeline to deploy to the Dev workspace. After deployment, a post-deployment call updates item configurations in Dev.

3.     Repeating the process in QA (quality assurance) workspaces to simulate production.

4.     Deploying to production workspaces after testing. Each customer has a unique workspace, allowing parallel deployments. Post-deployment API calls handle resource mapping for artifacts.


Notes:

1.     Lakehouses are created blank, with tables and shortcuts generated automatically via a notebook execution API (see the sketch after these notes).

2.     Parameterization, as available in Azure Data Factory, is currently unavailable in Microsoft Fabric. Connections are resolved through autobinding if Lakehouses are provisioned before pipeline deployment.

3.     Connections are currently not Microsoft Fabric objects, requiring manual setup during deployment.
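For note 1, the sketch below shows a hedged version of that post-deployment step: triggering the setup notebook through the on-demand item job API and polling the job instance URL it returns. The workspace and notebook IDs and the token are placeholders.

```python
# Hedged sketch: run the Lakehouse setup notebook on demand via the item job
# API (jobType=RunNotebook). All IDs and the token are placeholders.
import requests

TOKEN = "<fabric-access-token>"
WORKSPACE_ID = "<customer-workspace-guid>"
NOTEBOOK_ID = "<setup-notebook-guid>"

url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{NOTEBOOK_ID}/jobs/instances?jobType=RunNotebook"
)
resp = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()  # a long-running job typically returns 202 Accepted

# The Location header points at the job instance; poll it until completion
status_url = resp.headers.get("Location")
print("Notebook job submitted; poll:", status_url)
```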



What’s next

While the CI/CD processes are implemented as outlined, further improvements could streamline them:

1.      Configuration variables as Microsoft Fabric items: Adding configuration values as Microsoft Fabric items would simplify and standardize settings.

2.      Azure Data Factory-style templatization: Managing connections as Microsoft Fabric items would improve flexibility.

3.      SPN support for all APIs: Adding SPN authentication across all APIs would align with production best practices.

4.      Support for additional Git providers: Expanding to GitLab and Bitbucket would increase Microsoft Fabric’s adoption by aligning with existing workflows.



Interested in learning more?

As a Microsoft Fabric Featured Partner, MAQ Software brings deep expertise in helping organizations unlock the full potential of Microsoft Fabric. Whether you're looking for guidance on implementing data solutions or optimizing your existing platform, we’re here to support you every step of the way. Contact us at Sales@MAQSoftware.com or explore our apps or consulting services on Microsoft Azure Marketplace: