Thursday, June 16, 2022

Enhancing fintech analytics to provide millions of borrowers with better loan options

Business Case

Our client, a leading fintech company, enables thousands of financial institutions to engage millions of borrowers with better loan options. Our client was on a mission to expand their analytics platform when they hit a critical roadblock: their existing platform architecture was at maximum data capacity. To onboard new customers, our client needed a more scalable analytics solution. In addition, our client wanted to enhance their platform’s reporting experience. Existing reporting was limited and required users to export data to Excel for manual analysis, delaying insights. To increase their product value and onboard more customers, our client needed a scalable architecture with embedded reporting.

Key Challenges

  Enable analytics platform to scale to 1000+ customers 
  Enable self-serve, near real-time analytics 
  Enable AI/ML capabilities for future innovation 
  Improve security of financial data 

Our Solution

We rebuilt our client’s analytics platform using Azure Synapse, Azure Data Lake Storage, Azure Data Factory, Azure Databricks, and Power BI. To ensure operational and technical excellence throughout the build, we followed the five pillars of the Azure Well-Architected Framework and leveraged migration strategies from Microsoft’s Cloud Adoption Framework.

Reliability: Implemented query replicas within Azure Analysis Services (AAS) to ensure resource-intensive queries do not impact ETL processing. Configured secondary and backup resources to ensure 100% resource availability.

Security: Enabled role-based access and disabled public access to storage accounts containing PII data to ensure partner data is isolated within the ecosystem. In doing so, we greatly reduced the risk of security threats.

Cost Optimization: Implemented auto-scaling in lower environments, enabled Databricks to scale down when inactive, and deployed Power BI report on cost monitoring to scale services as needed.

Operational Excellence: Created Terraform automated scripts for Azure resources deployment. Implemented proactive monitoring for pipeline bottlenecks, ETL execution, and failures.

Performance Efficiency: Implemented parallel processing and concurrent querying of underlying data model for 1000+ customers using Azure Databricks.

In addition, our automated deployment framework uses continuous integration/continuous delivery (CI/CD) pipelines to create Azure landing zones by focusing on identity, network, and resource management. To deploy Azure landing zones, we used a proprietary approach that combines the benefits of both the “start small and expand” and the “enterprise-scale” approaches. Using industry-standard best practices and our center of excellence for Azure infrastructure setup, we ensured the right configuration to build a strong foundation and avoid rework in the future. This approach reassures our customers about our capabilities while creating a secure and reliable environment that is built to last.

Business Outcomes

With our Azure Synapse-based solution, our client’s platform now offers powerful self-service, near real-time analytics, enabling their customers to reach millions of borrowers faster. The platform now has the capacity to scale and support over 1000 customers. With Azure Synapse, our client can easily integrate machine learning models like fraud detection and recommendation engines without major architecture changes. To accelerate onboarding, we developed an automated deployment framework that onboards new customers in a single click, reducing setup time from days to hours.  

Friday, December 31, 2021

Accurately Forecast Customer Sales with Machine Learning (ML)

Business Case:

Our client, a multinational food and beverage chain, operates thousands of retail stores and generates billions of dollars of annual revenue. Our client needed to understand the impact of weather, promotions, discounts, product launches, holidays, and other events on sales. The client’s existing predictive sales model routinely underestimated sales volume at both the aggregated and daily level. Our client also needed to better understand the causes of seasonal and daily spikes in sales.

Key Challenges:

  Improve the accuracy of future sales predictions. 
  Identify and analyze nonlinear patterns in the data and predict future sales from historical data. 
  Examine the correlation between weather data (precipitation, temperature, pressure, wind speed, cloudiness, and so on) and sales at a specific longitude and latitude. 
  Analyze the impact of factors such as product launches, promotions, discounts, and holidays on predicted sales. 
  Include seasonality variables to explain seasonal fluctuations in the sales time series. 

Our Solution:

We built a Sales Forecasting Engine on Microsoft Azure Databricks that allowed our client to quickly and accurately predict sales.

Solution Design:

We worked with the client’s marketing operations and finance teams to collect and analyze their sales data, promotion and discount data, and store events data. We also used National Oceanic and Atmospheric Administration (NOAA) historical weather data from the US government to develop the weather model. We extrapolated the historical data and used application programming interfaces (APIs) to connect the data to our machine learning (ML) model to predict weather.


  Used R libraries and custom functions to cleanse and preprocess the data. 
  Used descriptive statistical analysis to tackle skewness and kurtosis for the features. 
  Performed Fourier transforms to decompose sales, analyze trends, and remove noise from the sales time series. 
  Applied logarithmic, exponential, and S-curve transformations to features to introduce nonlinearity as per real scenarios. 
  Developed hybrid regression models to predict future sales using nonlinear, multiplicative, probabilistic, regularized, and deep learning approaches. 
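The Fourier decomposition step above can be illustrated with a minimal sketch. This is in Python (the project itself used R) and runs on synthetic data, not the client’s: it keeps only the low-frequency FFT components of a daily sales series, which preserves the trend and weekly seasonality while stripping high-frequency noise.

```python
import numpy as np

def fft_denoise(sales, keep_fraction=0.3):
    """Zero out high-frequency FFT components of a sales series,
    keeping the low-frequency trend and seasonal structure."""
    coeffs = np.fft.rfft(sales)
    cutoff = max(1, int(len(coeffs) * keep_fraction))
    coeffs[cutoff:] = 0  # drop high-frequency (noisy) components
    return np.fft.irfft(coeffs, n=len(sales))

# Synthetic daily sales: baseline + weekly seasonality + random noise
rng = np.random.default_rng(0)
days = np.arange(365)
sales = 100 + 20 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 5, 365)
smoothed = fft_denoise(sales)
```

With `keep_fraction=0.3`, the weekly cycle (about 52 cycles per year) survives the cutoff while most of the noise is removed; the residual `sales - smoothed` isolates the noise component.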
Figure 1: Architecture of Forecasting Engine

Business Outcomes:

Our supervised ML predictive model empowered our client to analyze the impact of weather, promotions, discounts, product launches, holidays, and daily events on sales and execute business decisions accordingly. The model also identified the delay between an event and the seasonal spike, which enabled our client to maximize sales following an event. 

Our hybrid ML model is far more accurate than the previous ML model. The prediction runs on an aggregated and daily basis, and the model retrains itself once actual sales figures are injected into the model.

Our model’s Mean Absolute Percentage Error (MAPE) value was 0.09, compared to the previous model’s MAPE value of 0.13 (a lower value indicates greater accuracy). 
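For reference, MAPE is straightforward to compute. A minimal sketch in Python, using made-up sales figures rather than the client’s data:

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, as a fraction of actual sales."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)))

# Illustrative numbers only
actual_sales = [120, 150, 130, 170]
predicted_sales = [110, 160, 125, 180]
print(round(mape(actual_sales, predicted_sales), 3))  # → 0.062
```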


    Forecasted sales based on weather variations for the client’s store at a specific longitude and latitude.
    Analyzed the positive and negative impacts of daily events such as discounts, promotions, launch events, and holidays on predicted and actual sales.
    Statistically identified and explained seasonal spikes in sales time series.
    Identified the lag period for daily events to explain the behavior in time series.

Wednesday, November 10, 2021

Power BI performance factors: what impacts report performance?

There is little more frustrating than a slow-loading Power BI report. When you’re working with billions of rows and columns, it can feel like improving performance is impossible. At MAQ Software, we’ve worked with over 8,000 Power BI reports across a number of industries. In our experience, it is never impossible to improve your report performance. In fact, our goal for all report pages is to load within 8 seconds (at most).

So, how do we do it? By taking a structured approach. The first step is identifying the main areas that impact report performance. After all, to diagnose the problem, you need to know where to look. Typically, there are four major factors that affect Power BI performance: the data set, data sources, report design, and network issues.

The Data Set

Performance win: Reduce your data set size

The size and characteristics of your data set can drastically impact your final dashboard. Ask yourself: what can you consolidate or eliminate? If you’re working with a lot of rows, do you need them all? What information is your business audience actually using?

Taking a good hard look at your data set size doesn’t mean you can’t work with big data sets – Power BI is absolutely designed to handle large volumes of real-time data. It’s about carefully identifying what you can keep and what needs to go. A few of the most common performance detractors within the data set include:

  • Whitespace
  • Null values
  • High column cardinality (i.e., columns with values that are very uncommon or unique, such as user names or user IDs)
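High-cardinality columns are easy to detect before data ever reaches Power BI. Here is a quick sketch in Python with pandas; the table and the 0.9 threshold are illustrative assumptions, not a Power BI rule:

```python
import pandas as pd

# Hypothetical sales table; column names are illustrative only
df = pd.DataFrame({
    "Region": ["West", "East", "West", "South"],
    "UserID": ["u001", "u002", "u003", "u004"],  # unique per row
    "Amount": [250.0, 120.5, 310.0, 99.9],
})

# Flag columns whose ratio of distinct values to total rows is high
cardinality_ratio = df.nunique() / len(df)
high_cardinality = cardinality_ratio[cardinality_ratio > 0.9].index.tolist()
print(high_cardinality)  # → ['UserID', 'Amount']
```

Flagged columns are candidates to drop, hash, or bucket before loading them into your model.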

Performance win: Optimize your data set model

You’re also going to want to look at your data model. When it comes to optimizing report performance, reducing the size of your model offers the best possible return on investment. The smaller your model, the faster it will run in the report. While different data sets require different models, there are a couple of best practices that provide quick performance wins:

  1. Use star schema
     Star schema is, by far, the best model to use in Power BI. In a star schema, dimension tables connect directly to a central fact table (giving the schema its eponymous star shape). Its alternative, the snowflake schema, normalizes dimensions into sub-tables, each of which adds a join to your queries. In Power BI, joins translate to slow loading. The fewer you have, the better.

  2. Turn off Auto Date/Time
     Auto Date/Time enables users to easily drill down into calendar time periods without having to implement a date dimension. However, this means that for every date column in your data, there is a hidden date/time table in your model. In large data sets, this adds up; your data model could end up massive and sluggish.

  3. Summarize metrics (where possible)
     Your raw data may pull information for daily, or even hourly, sales, but do your end users actually need reporting at this level? If they only need overall monthly sales, you can significantly reduce your model size just by summarizing data by month rather than by hour.

  4. Select the right dataset model for your data
     In general, the Import model offers the fastest performance thanks to in-memory querying: it loads a copy of your data into Power BI’s in-memory engine, so query results are extremely fast (as long as the data fits fully in memory). However, data models are not one-size-fits-all. If you need to work with data volumes that are too large to load completely into the model, or need to deliver real-time data, you should consider using DirectQuery.

Performance win: Optimize your measures

The efficiency of your DAX directly impacts the amount of time it takes a query to render data in a chart. Your best bet? Follow DAX best practices. Some quick wins you can implement today include:

  • Reducing the number of operations within your DAX
    • Before: Max Value:=IF(A>B, 1.1 * A, 1.1 * B)
    • After: Max Value:=1.1 * MAX(A,B)
  • Avoiding both-directional relationships in the data model (where both tables in a relationship cross-filter each other)
  • Moving row-level logic to Power Query (using M to calculate instead of DAX)
  • Avoiding floating point data types
  • Using DIVIDE instead of the division operator (/)
  • Creating a flag in the table instead of having multiple values in a single IN clause (we improved performance by ~2 seconds with this change)
    • Before: Measure 1:= IF (Status IN { "Open", "Closed", "In-Progress"},[Actual], [Target])
    • After: Measure 1:=IF (Status = 1,[Actual], [Target])

Data Sources

Performance win: Consider the cloud

The type of data source you connect to your reports affects report performance. One of today’s hottest topics, especially with the advent of hybrid work, is the cloud. More and more businesses are relying on cloud reporting to share insights across the world. While each organization has to customize their system to fit their needs, we’ve seen some incredible performance wins through cloud-based reporting. In one scenario, we reduced a client’s data processing time from half an hour to two minutes by migrating them to the cloud.

Performance win: Track your actual needs instead of your assumed needs

Ask yourself:

  • Are you using a Tabular model or cube and why?
  • Do you have Geo replication enabled (and do you need to)?
  • Are you using load balancer (and should you be)?
  • Is the data configuration correct?

One of the most important things to consider is the end user’s experience. You don’t necessarily need everything to load fast. You just need to ensure users can quickly access the information they regularly rely upon.

For example: Power BI maintains a cache for dashboard tiles. Pulling data from the cache is faster and more reliable than querying the data source itself. If your users primarily need at-a-glance information, you can make your dashboards the user landing page and pin the most-used visuals. This way, you’ll deliver a better user experience at a fraction of the performance cost.

Report Design

Performance win: Filter your data

We all know that report design impacts user experience, but it also has a noticeable effect on report performance. After all, the more data each visual needs to display, the slower the visual will load. Design-wise, there are a couple of big-ticket items to watch out for.

Avoid using unfiltered data. Usually, users don’t need every single row and column of every table every time they open a report. Use Top N filter to reduce the maximum number of items displayed in the table. This reduces the load on the report, improving performance.

You should also be careful when it comes to slicers. Slicers are a great way to help users navigate data, but they can tank your report performance. This is because each slicer generates two queries: one to fetch data and one to fetch selection details. If you absolutely need to include slicers, use the Filter pane to evaluate which slicers are used most often, and implement only those.

Performance win: Limit your visuals

Using too many visuals in a single report turns report performance into a slog (and makes your reports difficult to read). Be mindful about which visuals you implement. In general, you should use the following Power BI performance guidelines:

  • Maximum number of widgets: 8
  • Maximum number of grids: 1
  • Maximum number of tiles: 10

Not all visual types perform the same. Grids, for example, are a massive drain on resources, while cards are a much more efficient information delivery system. To optimize the performance of your reports, you should limit each report page to a maximum of 30 total points using the following scoring system:

  • Cards: 1 point each
  • Gauges: 2 points each
  • Charts: 3 points each
  • Maps: 3 points each
  • Grids: 5 points each
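This scoring system is easy to automate as a pre-publish check. A small sketch in Python (the page inventory below is hypothetical):

```python
# Point values taken from the scoring system above
POINTS = {"card": 1, "gauge": 2, "chart": 3, "map": 3, "grid": 5}
MAX_PAGE_POINTS = 30

def page_score(visuals):
    """Sum the point cost of every visual on a report page."""
    return sum(POINTS[v] for v in visuals)

# Hypothetical report page: one grid, two charts, a map, a gauge, two cards
page = ["grid", "chart", "chart", "map", "gauge", "card", "card"]
score = page_score(page)
print(score, "OK" if score <= MAX_PAGE_POINTS else "over budget")  # → 18 OK
```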

Some visuals are also more efficient than others. Out-of-the-box visuals are traditionally faster than custom ones, as they’ve been vetted and created by the Power BI team. However, custom visuals have their merits, especially if you’re looking for something niche. When using custom visuals, prioritize visuals that have been certified. Certified visuals have the yellow checkmark next to them on AppSource, which means they have been certified by the Power BI team for performance and security.

Performance win: Limit interactivity

A final design element to watch out for is interactivity. The interactivity of your report is going to impact performance. The more interactive, the slower the report, as Power BI needs to process several requests before displaying the final result. By default, all visuals on a report page are set to interact with one another. Usually, this level of interactivity isn’t needed for end users, and results in several unnecessary queries in the back end. By reducing interactivity to only the scenarios needed by users, you can drastically improve report performance.

Network Issues

Of course, the final reason why your reports may be failing or loading slowly is your actual network. If this is an immediate issue (like your entire family arriving home and jumping onto the Wi-Fi at the exact same time), then you can wait a while and try again, or consider finding a location with better access. If you’re using a cloud-based Power BI report, you may also be experiencing network latency because of client machine resources or noisy neighbors.

Performance win: Make sure your report regions align

There are some network latency issues that you unfortunately don’t have much control over. Luckily, there are also several issues you can immediately address. Network latency affects the time it takes for requests to go back and forth from the Power BI service. Different tenants in Power BI are assigned to different regions. Ideally, you want your reports, tenant, and data sources in the same region. This reduces network latency by increasing the speed of data transfer and query execution.

Performance win: Configure your Power BI workloads

Network latency may be a result of unoptimized Power BI capacity settings. Optimize your capacity settings to your actual usage metrics, identifying when you should invest in advanced Power BI capacities such as Power BI Premium, Premium Per User (PPU), and Power BI Embedded. Overinvesting in Power BI can result in wasted expenses, but underinvesting can hamper the performance of key reports and dashboards.

Performance win: Manage your gateways

Whenever Power BI needs to access data that isn’t accessible over the Internet, it uses a gateway. Depending on your workload and gateway size, you need to evaluate whether you want to install an on-premises/enterprise data gateway, personal gateway, or VM-hosted infrastructure-as-a-service.

As a rule of thumb, if you’re working with larger databases, it’s better to go with an enterprise gateway rather than personal gateway. Enterprise gateways import no data into Power BI, making them more efficient for big data. You can also create a data cluster for any high-demand queries. This enables you to effectively load balance your gateway traffic and avoid single points of failure.

Finally, make sure you use separate gateways for Power BI service live connections and scheduled data refresh. If you’re using a single gateway for both, your live connection performance will suffer during the scheduled refresh.


Okay, so now you know how to diagnose the problem. You might be asking yourself: what comes next? If you want to learn more about specific steps you can take to improve your Power BI setup, check out our Power BI best practices guide. If optimizing your Power BI all on your own sounds daunting, get in touch with us. We’d be happy to help!


Wednesday, October 27, 2021

Is Cloud Security Part of Cybersecurity?

The short answer: yes. Cloud security is a category of cybersecurity the way an apple is a category of fruit. All apples are fruit; not all fruit is an apple. The definition of cloud security is generally something along the lines of:

Cloud security is a branch of cybersecurity dedicated to securing cloud systems from both internal and external threats.

If you were looking for the short answer, there you have it. Cloud security is a part of cybersecurity. If you want to know why, or how the approach to cloud security differs, read on.

What does cloud security cover?

Cloud security covers a wide range of processes and technologies used to secure cloud systems. This includes:

    Identity management
    Network security
    Infrastructure-level security
    Application-level security
    Data security
    Governance and threat protection

Identity management refers to the process of authenticating and authorizing identities. It’s about verifying who can access your data and how. Popular forms of identity management include multi-factor authentication and directory services such as Azure Active Directory. According to Microsoft, “Many consider identity to be the primary perimeter for [cloud] security. This is a shift from the traditional focus on network security.”

Network security refers to the process of protecting your resources from unauthorized access via network traffic controls. With network security, your aim is to ensure you only allow legitimate traffic. In cloud security, your focus is on limiting connectivity between virtual networks when possible.

Infrastructure-level security refers to security measures taken to protect your entire cloud infrastructure, including policies, applications, technologies, and controls. One of the key areas here revolves around implementing antimalware software and virtual machine (VM) best practice.

Application-level security refers to the protective measures surrounding information exchanged in collaborative cloud environments, such as Microsoft Teams, Office 365, or shared Power BI reports.

Data security refers to cloud admins’ ability to secure data through encryption or virtual private networks (VPNs). Encryption is one of the best ways for enterprises to secure their data, while VPNs are extremely popular among consumers and remote workers.

Governance and threat protection refer to how you identify and mitigate incoming threats. This covers one of the most important elements of cloud security: user awareness and training. To secure your cloud, you need to ensure your cloud users are up to date on the latest security protocols and org-wide policies. After all, user error accounts for up to 95% of cloud breaches.

What does cybersecurity cover?

Cybersecurity covers all activities focused on defending computers, servers, mobile devices, digital systems, networks, and data from malicious attacks. It’s a much larger area of digital security that includes:

    Data security
    Identity management
    Network security
    App and software security
    Data retention
    User education

You’ll notice that cloud security and cybersecurity tread a lot of the same ground. That said, cloud security is not just “the same thing as cybersecurity, but with cloud.” Cloud security is a unique area of cybersecurity that has grown exponentially in the last decade. There are some key differences that affect the way security admins approach their role.

What’s the difference between cloud security and cybersecurity?

Cloud security inherently requires buy-in from both the cloud vendor and the cloud buyer to ensure the system is secure. On the buyer end, this means defining your organization’s relationship to the cloud. Buyer considerations include whether you’re using a cloud-native or hybrid system and whether you want to invest in Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), or Software-as-a-Service (SaaS). You also need to ensure you trust the vendor provisioning the cloud system. After all, because the storage is off-prem, it’s up to the vendor to secure their system.

Cloud systems are centralized and rapidly scalable in a way that requires you to regularly ensure you are implementing security best practices. All your data is in a single location, from harmless files like handover documentation to business-halting information about sales and projections, accessible 24/7 from anywhere. This means you need to invest in data security systems to avoid unauthorized or accidental access.

Cloud systems also scale more quickly than previous data storage systems. This emphasizes the importance of training not only for users but for your network admins. They need to ensure they are up to date on the latest security protocols. It’s why we at MAQ Software regularly reassess our security principles and are certified with ISO/IEC 27001 for information security management and ISO/IEC 27018 for protecting personal data in the cloud.


According to Forbes, by 2025, half of all the world’s data will reside in a public cloud. Cloud security is here to stay, a permanent part of the cybersecurity landscape. If you want to learn more about how we can help you secure your cloud while still reducing your average monthly costs, check out our cloud migration and modernization hub.


Monday, October 11, 2021

What is ‘Tech Intensity’ and Why Won’t the Tech Industry Stop Talking About it?

If you work in tech, you’ve probably come across the term ‘tech intensity.’ It’s been the hot-button topic of the last couple years, this decade’s version of “digital transformation.” But what is it? Buzz word or business term? Hype or strategy?

Starting at the beginning: where did ‘tech intensity’ come from?

The phrase “tech intensity” was first introduced by Microsoft CEO Satya Nadella in his keynote speech for Microsoft Ignite 2018. He based the concept on research by Dartmouth economist Diego Comin, who analyzed the role of rapid industrialization in determining a country’s economic success. According to Comin, speed of adoption is not the only important success factor. What’s key is the intensity of technological adoption: the ability to train people to actually use the technology.

Digital transformation has been the hallmark of a modernized company for decades. In his speech, Nadella suggested that digital is now the norm. To consider themselves innovators, companies need to go above and beyond the norm. They need to invest in tech intensity.

So, what is tech intensity?

Tech intensity is a company’s ability to adopt and fully integrate emerging technologies. Through tech intensity, companies build a strong digital foundation. In a blog post, Nadella explains, “I think of tech intensity as being an equation: (tech adoption) ^ tech capabilities.”

Tech intensity is your ability to adjust your tech capability to your adoption rate of new technology. That is: it’s not just about moving to the cloud or migrating to Power BI. Tech intensity is about using the cloud and Power BI to build custom solutions for your company or your industry.

There are three main pillars to tech intensity:
1.    Integrating technology as quickly as possible
2.    Investing in digital intellectual property
3.    Achieving trust in technology

When it comes to integrating modern tools, you don’t need to recreate the wheel. Using open-source and low code/no code (LC/NC) solutions, it’s easier than ever to improve the way you run your business. Of course, for a globally distributed company implementing a new digital initiative, keeping everyone on track can be challenging. Here, the cloud is your greatest ally. By using cloud services such as Microsoft Azure, you centralize your insights, updates, and data. With a single source of truth, you leverage one of the most important elements of tech intensity: speed. 

Tech intensity goes beyond using new technologies. You need to adapt and create solutions where you identify gaps. At MAQ Software, our expertise lies in the Microsoft stack. We build solutions in Power BI, Azure Synapse, Power Apps, Dynamics 365, and more. However, we also created several custom tools we use to strengthen our solutions. For example, our Azure migrations are faster than standard cloud migrations because our cloud assessment tool shows a complete overview of all cloud assets within an environment in less than half an hour. 

Ultimately, it’s not enough to just be fast in adopting tech. In today’s security landscape, a company’s relationship to technology must be built on trust. This means regularly updating your privacy, security, and compliance protocols, investing in cybersecurity, and creating solutions that are accessible across your organization. Small changes can make a big difference. For example, our implementations include data dictionaries to ensure business audiences understand the meaning of metrics.

Why is tech intensity important? The world is changing fast.

“Technology moves faster than our imaginations can keep up with. We invent one breakthrough technology today and then tomorrow’s inventors transform it into another we never imagined possible.” - The Atlantic 

Go back twenty-ish years and we don’t have Facebook, OneDrive, YouTube, iPhones, or USB flash drives. The world has changed astronomically in the last two decades. According to projections, this isn’t a trend that will slow down anytime soon. This year, the United States federal government spent over $5 billion on the civilian IT budget, which will lead to better technology, which will lead to faster changes. Tech intensity is how companies can not only keep up but stay ahead of the ever-accelerating curve.

Why is tech intensity important? Data is more important now than ever.

According to several projections, the world will have 50 billion connected devices by 2030 and 175 ZB of data by 2025. That is a volume of data that is almost impossible to comprehend. Picture every song, movie, book, poem, short story, painting, photograph, and YouTube video ever created in the history of humanity. That adds up (roughly) to 100 exabytes. If you saved 100 back-up copies of everything humanity has ever created, you’d still only have about 10 ZB worth. 

Tech intensity enables companies to use and analyze the previously unthinkable volumes of data they work with every day. Usually, size and speed are opposites. Big data means slower processing, collection, and visualization times because, well, there’s just more to work with, right? Not anymore. This is where tech intensity really shines. Of those 175 ZB, Forbes estimates that almost half will reside in cloud environments. The technology to respond to big data is there; it’s just a matter of integrating. According to digital analyst and business strategist Brian Solis, “We are looking at a future in which companies will indulge in digital Darwinism, using IoT, AI and machine learning to rapidly evolve in a way we’ve never seen before."

Why is tech intensity important? A changing world needs a changing mindset.

Tech intensity is not just about the tech. It’s about a mindset shift towards adoption and innovation. “Simply put, technology intensity is a critical part of business strategy today.” Nadella explains, “In my experience, high-performance companies invest the most in digital capabilities and skillsets.” 

Tech intensity means democratizing tools, technologies, and business data across your organization. There is a delicate balance between ensuring key data is available and ensuring only authorized people have access to it. When you achieve that balance, you enable team members to build their own solutions to problems they’ve identified, and make critical business decisions using real-time data. Enabling and supporting democratization efforts ensures you can adopt tech quickly and build your custom solution portfolio. Tech intensity begets tech intensity. 

On the process side of things, companies that want to invest in tech intensity also need to invest in Agile. A modern mindset needs a modern process. One of the challenges that comes with the speed of tech intensity is a disconnect between company initiatives and individual skillsets. However, if you’ve done the work to refine your process, you can ensure that no one gets left behind. 

For example, when we help large-enterprise companies migrate to Power BI, it’s never just about the technical process of moving from one system to another. Yes, there are many technical parts of migration, from analyzing the existing data architecture to optimizing DAX. But one of the most important parts of our custom migration process is our Center of Excellence trainings. During our CoEs, we train team members in Power BI capabilities and best practices, ensuring they are familiar and comfortable with the new platform. These trainings speed up adoption time frames from a matter of years to a matter of months. After all, tech intensity is not a mindset reserved exclusively for leadership; it needs buy-in from every person at the company.

Why is tech intensity important? COVID-19.

If there’s anything the business world has learned from the last year and a half, it’s that systemic change that might have once seemed impossible can occur in the blink of an eye. For some, the transition to work-from-home took place over a weekend. The key differentiator between the businesses that struggled and businesses that flourished? Tech intensity. 

“While the pandemic has taught us that no business is 100 percent resilient, those fortified by digital technology are more resilient, more capable of transforming when faced with these secular structural changes in the marketplace. We call this tech intensity, and every organization and every industry will increasingly need to embrace it in order to be successful and grow.” - Satya Nadella, Microsoft Inspire 2020 keynote


Want to invest in tech intensity for your organization and not sure where to get started? Check out our consulting offers in a wide range of technologies, from Power Platform to Dynamics 365, or reach out to

Up Next

Tuesday, May 25, 2021

Millions of Arizona Citizens Receive Benefits With the Help of an AI-powered Chatbot

Key Challenges

   Improve Program Service Evaluator training
   Enable Program Service Evaluators to obtain policy information without searching the entire policy manual
   Deliver conversational responses to Program Service Evaluators

Policies Prompt a Need for More Efficient Training

Our client, the Arizona Department of Economic Security (DES), needed to improve its Program Service Evaluator (PSE) training. PSEs are responsible for administering benefits and guiding applicants through the application process. During the application process, PSEs refer to an online policy manual covering various state benefit programs. To ensure that qualified Arizona residents receive benefits, the manual includes specific guidelines and procedures, which PSEs search using keywords before communicating with benefit recipients.

PSEs can search the policy manual using keywords, but their search terms may not match the manual's specific language. To find information, PSEs regularly contact experienced coworkers. Because PSEs often need this help, senior members of the DES policy team realized they could save time by providing ready answers to common questions. The policy team asked us to propose solutions that would use advances in artificial intelligence to save time for their senior staff members. After a detailed analysis, we proposed an innovative solution using Microsoft Azure Cognitive Services with a chatbot interface.

A chatbot offered several advantages over the PSEs' previous methods of gathering information. It would reduce demands on senior employees' time by answering common questions with stored replies. It would also allow PSEs to obtain information from the DES policy manual without relying on the manual's search function: PSEs could ask questions in everyday language, and the chatbot would return information validated against the manual. A chatbot that understood the contents of the policy manual would reduce the time spent searching. Ultimately, the chatbot would enable PSEs to evaluate benefits applications more efficiently.

Incremental Improvements with Agile Approach

We divided the chatbot development into four stages: Preview, MVP, MVP+, and Pilot. (See Figure 1). We released the first preview build of the chatbot within three weeks of starting the project. The initial build allowed us to get early feedback.

Figure 1: Project Stages

The Preview build of the chatbot responded to PSE questions using a knowledge base of stored questions and responses. Early testing showed us we still needed to refine the chatbot.

The first challenge with the Preview build was the build didn’t adequately address the size and details of the policy manual. The Preview build’s knowledge base covered 500 of the most common PSE questions and responses. Still, the knowledge base did not contain enough information to address the intricacies of the manual. During the Preview stage, PSEs frequently asked common questions the chatbot was unable to answer. The chatbot often returned answers unrelated to the PSEs’ queries.

A second challenge was that the initial build's search function did not meet client requirements. The old policy manual search engine prioritized the frequency of typed keywords, while our chatbot's search function prioritized question stems. As a result, experienced PSEs did not find our search function intuitive. To improve it, the chatbot needed to return a field of results when given a single keyword and a specific result when given multiple keywords.
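The target search behavior can be sketched as a tiny scoring function. This is purely illustrative (the production chatbot was built on Azure Cognitive Services), and the manual entries and scoring rule here are hypothetical:

```python
# Sketch of the desired search behavior: a single keyword returns a
# broad field of results, while multiple keywords narrow to a specific
# entry. Illustrative only; entry titles and text are hypothetical.
MANUAL = {
    "SNAP eligibility": "income limits household size snap benefits",
    "SNAP interview": "interview scheduling snap applicant",
    "Cash assistance": "cash assistance income limits household",
}

def search(query: str) -> list[str]:
    keywords = query.lower().split()
    scored = []
    for title, text in MANUAL.items():
        # Score = number of distinct query keywords found in the entry.
        score = sum(1 for kw in keywords if kw in text)
        if score > 0:
            scored.append((score, title))
    scored.sort(reverse=True)
    if not scored:
        return []
    if len(keywords) > 1:
        # Multiple keywords: return only the best-scoring match(es).
        best = scored[0][0]
        return [title for score, title in scored if score == best]
    # Single keyword: return the full field of matching entries.
    return [title for _, title in scored]
```

With this logic, "snap" alone surfaces every SNAP-related entry, while "snap interview" converges on the single most relevant one.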

The challenges we encountered in the Preview stage defined the MVP build. Our chatbot needed to mimic the behavior of modern search engines. Our chatbot also needed to provide conversational responses to questions phrased in natural language. Finally, our chatbot's knowledge base also needed to grow. We expanded the chatbot's knowledge base from 500 question and response pairs to 5,000.

Producing Refined Results

During the MVP phase, we automated the generation of questions and responses by having the chatbot crawl policy content. Whenever content was added to, revised in, or deleted from the manual, the chatbot automatically crawled the manual's pages and updated the knowledge base. The new build also allowed users to narrow search categories to further refine results.

Results were further refined through continuous user feedback. If users struggled to find the information they needed, a question and response pair was automatically generated with help from the policy team. The automatic generation of questions and responses offered a substantial advantage over the previous build. Earlier, the addition of question and response pairs was unstructured. Providing a structured approach for question and response pairs also significantly improved the speed at which the bot learned.
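The crawl-and-feedback loop described above can be sketched as follows. This is an illustration only, assuming hypothetical page content and helper names; the production system generated and stored its pairs through Azure Cognitive Services:

```python
# Sketch of the knowledge-base update loop: crawl policy pages to
# (re)generate question/response pairs, and add a new pair when user
# feedback shows a question went unanswered. Purely illustrative; the
# page content and function names here are hypothetical.
def crawl_pages(pages: list[dict]) -> dict:
    """Generate one question/response pair per policy section."""
    kb = {}
    for page in pages:
        for heading, body in page["sections"].items():
            kb[f"What is the policy on {heading.lower()}?"] = body
    return kb

def record_feedback(kb: dict, unanswered: str, response: str) -> dict:
    """Structured feedback: an unanswered question becomes a new pair."""
    kb[unanswered] = response
    return kb

pages = [{"sections": {"Income limits": "See section 4.1.",
                       "Interviews": "See section 7.2."}}]
kb = crawl_pages(pages)
kb = record_feedback(kb, "How do I reschedule an interview?",
                     "Follow the steps in section 7.3.")
```

The structured loop is what made the bot's learning fast: every crawl and every piece of feedback lands in the same knowledge base in the same shape.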

We conducted weekly review meetings and progressively increased the audience size (Figure 1). In these meetings, we examined specific chatbot queries and identified mismatched keywords. As we addressed concerns raised by PSEs, they took ownership of the outcome, and PSEs and supervisors became champions for the chatbot. Through this extensive training, the chatbot learned quickly, eventually returning results with greater than 90 percent accuracy.

Considering Users First

The chatbot is currently used by over 1,800 PSEs with varying degrees of expertise. The PSEs access the chatbot through a web interface and Skype for Business. Administrators can view the question and response database, manually edit the questions and responses if needed, and manually trigger the crawl function if the database needs updating.

We designed a web interface featuring a welcoming, friendly avatar named Sean. The interface provides options to track case numbers, resize the bot, and export conversations. When users type an ambiguous question, the chatbot offers multiple possible responses (with references).

PSEs can also use our chatbot via Skype for Business. Skype interactivity posed significant challenges, as the interface had to be entirely text-based. We created intuitive menu options that users selected via number input. The completed Skype interface possessed all the functionality of the web interface.

We also created an easy-to-use admin portal. The admin portal allows users to customize chatbot responses, manually trigger policy database crawls, track case numbers, and view response metrics.

The chatbot interface and admin portal resulted in a user-friendly solution. PSEs unfamiliar with the implementation can interact with it, understand it, and use it proficiently within minutes. As the DES project director observed, the chatbot integrated seamlessly into the PSEs’ workflow.

Going Live: Distributing Benefits with AI-driven Technology

The DES chatbot has increased evaluation efficiency for over 1,800 PSEs and improved processing time for millions of Arizona benefits recipients. The chatbot provides PSEs with speedy responses, successfully answering hundreds of queries per day.

Reflecting on the project, our team lead identified four significant factors that differentiated the DES chatbot from others. First, the policy manual was large and dense; words and sentences in the manual resembled legal statutes. The bot simplifies references to the manual with results that are 90+ percent accurate. Second, unlike most chatbots, our chatbot auto-trains from site content, and we enhanced that content further through user feedback loop training. Third, the intuitive user interface offers multiple responses to ambiguous questions, drastically reducing the number of interactions required to find the desired result. Finally, the incremental review cycle allowed us to tailor the chatbot to client requirements and drove user acceptance and adoption.

Feedback from DES has been overwhelmingly positive. DES Chief Information Officer Sanjiv Rastogi is optimistic. He anticipates the chatbot’s role will expand to suit the department’s future needs: “MAQ Software helped us decide on and implement a solution built on Azure with cognitive services, which provides us the grow-as-you-go infrastructure, platform, SaaS, and AI integration DES needs. The level of confidence we have in this solution allows us to build, not just for today, but as an evergreen platform that will bring DES into the future."

Monday, December 21, 2020

Azure Services Product Comparison

With over 50 Azure services out there, deciding which service is right for your project can be challenging. Weighing the pros and cons of each option for numerous business requirements is a recipe for decision paralysis. When it comes to optimizing your Azure architecture, picking the right tool for the job is key.

Over the last decade, MAQ Software has migrated hundreds of clients to Azure. Along the way, we’ve picked up Azure tips and tricks that enable us to develop systems that are faster, more reliable, and operate at a lower cost. In this article, we’ll explore the differences between Azure services so you can pick the one that’s right for you.

Table of Contents

Which Azure cloud storage service should I use?

Azure Data Lake Storage (ADLS) Gen 1 vs. ADLS Gen 2 vs. Blob Storage

When to use Blob storage instead of ADLS Gen 2:
When your project involves minimal transactions, use Blob storage to minimize infrastructure costs. For example: production backups.

When to use ADLS Gen 2 instead of Blob storage:
When your project has multiple developers or involves analytics, and you need to securely share data with external users, use ADLS to leverage Azure Active Directory authentication. This prevents unauthorized users from accessing sensitive data. For example: global sales dashboards.

When to use ADLS Gen 1 instead of ADLS Gen 2:
When your project executes U-SQL within Azure Data Lake Analytics (ADLA) on top of storage services, use ADLS Gen 1. This is useful for projects that need analytics and storage performed from a single platform. For example: low-budget implementations.

When to use geo-replication: Geo-replication is available on all Azure cloud storage services. As a rule of thumb, avoid geo-replicating the development environment to keep infrastructure costs down. Only implement geo-replication for production.

Which Blob storage access tier to use: Picking the optimal access tier is important to achieve your desired performance at minimal storage costs:

Hot
Key differentiator: Optimized for frequent access to objects in the storage account.
When to use it: For projects that require daily refresh and frequent data transactions. Example: cloud data with daily refresh.

Cool
Key differentiator: Optimized for storing large volumes of data that is infrequently accessed and stored for at least 30 days.
When to use it: For projects that require monthly refresh with limited transactions. Example: monthly snapshots.

Archive
Key differentiator: Optimized for storing large volumes of data that is infrequently accessed and stored for at least 180 days.
When to use it: For projects with static data, snapshots, and/or yearly-refresh storage with almost no transactions. Example: yearly snapshots.

Table 1: Blob Storage Access Tier Comparison
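The rules in Table 1 can be condensed into a small decision helper. This is a sketch of the decision logic only, using the 30- and 180-day retention thresholds from the table; it is not an Azure API:

```python
# Sketch of the access-tier rules in Table 1: pick a Blob storage tier
# from expected access frequency and minimum retention period. The
# thresholds (30 and 180 days) come from the table; the function itself
# only illustrates the decision logic and is not part of any Azure SDK.
def pick_blob_tier(accessed_frequently: bool, retention_days: int) -> str:
    if accessed_frequently:
        return "Hot"        # daily refresh, frequent transactions
    if retention_days >= 180:
        return "Archive"    # static or yearly-refresh data, almost no reads
    if retention_days >= 30:
        return "Cool"       # monthly refresh, limited transactions
    return "Hot"            # short-lived data still belongs in Hot
```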

Which Azure cloud processing service should I use?

Azure Databricks vs. Azure Synapse Analytics

When to use Azure Databricks:
When your project involves machine learning, deep learning, or real-time data transformations, use Azure Databricks:

    Deep learning models: Azure Databricks reduces ML execution time by optimizing code and using some of the most popular libraries (e.g., TensorFlow, PyTorch, Keras) and GPU-enabled clusters.

    Real-time transformations: Databricks Runtime supports Spark's structured streaming and autoloader functionality, meaning it can process stream data such as Twitter feeds.

When to use Azure Synapse Analytics:
When your project centers on SQL analyses and data warehousing, or reporting and self-service BI, and you need to process large volumes of data without an ML model, use Azure Synapse Analytics:

    SQL analyses and data warehousing: Synapse is a developer-friendly environment, supporting full relational data models and stored procedures, and providing full standard T-SQL features. When migrating data from on-premises to the cloud, use Synapse for a seamless transition.

    Reporting and self-service BI: Power BI is integrated directly into the Synapse studio, which reduces data latency.

Which Azure Databricks pricing tier to pick: Choosing the right pricing tier is important to reduce infrastructure costs while achieving the desired performance:

Premium
Key differentiator: The premium tier supports role-based access to notebooks, clusters, and jobs.
When to use it: For projects that involve multiple stakeholders in a shared development environment. Example: one platform accessed by multiple vendors.

Standard
Key differentiator: The standard tier costs ~20% less than the premium tier.
When to use it: For projects that involve a single stakeholder and limited development teams. Example: one platform run and developed by a single vendor.

Table 2: Databricks Pricing Tier Comparison

Which Azure Databricks workload to use: Choosing the right workload type is important to ensure you achieve the desired performance at a reduced cost:

Data Analytics
Key differentiator: More flexible, supporting interactive clusters and an analytics ecosystem.
When to use it: Suited for dev environments, which require higher collaboration and multi-user sharing.

Data Engineering
Key differentiator: Costs ~30% less than Data Analytics.
When to use it: Suited for UAT/prod environments, which generally require less collaboration and can be run by a single developer.

Table 3: Databricks Workload Comparison

How to choose Databricks cluster size: For development purposes, start with a smaller Databricks cluster (i.e., general purpose) and enable auto-scaling to optimize costs; based on specific needs, you can opt for larger clusters.
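To see how the tier and workload discounts compound, here is a rough sketch assuming only the approximate figures quoted above (~20% and ~30%); actual Databricks pricing varies by region, instance type, and DBU rates:

```python
# Rough cost sketch combining the approximate discounts quoted above:
# the standard tier costs ~20% less than premium, and the Data
# Engineering workload ~30% less than Data Analytics. The baseline is
# arbitrary; real Databricks pricing depends on region, instance type,
# and DBU consumption.
def relative_cost(tier: str, workload: str, baseline: float = 1.0) -> float:
    cost = baseline
    if tier == "standard":
        cost *= 0.8   # ~20% less than premium
    if workload == "data_engineering":
        cost *= 0.7   # ~30% less than data analytics
    return cost
```

Under these assumptions, a standard-tier Data Engineering workload runs at roughly 56% of the cost of a premium-tier Data Analytics workload.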

Which Azure data serving service should I use?

Azure SQL database vs. Azure Synapse (formerly Azure Data Warehouse)

When to use Azure Synapse:
When projects deal with large volumes of data (>1 terabyte) and a small number of users, or use OLAP data, use Azure Synapse. For example: global sales data.

When to use Azure SQL database:
When projects work with real-time data (max 1 terabyte) and many users, or use OLTP data (application databases), use Azure SQL Database. For example: ATM transactions.
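The rule of thumb above can be sketched as a minimal chooser, assuming only the stated thresholds (data volume in terabytes and workload type); borderline cases warrant a fuller capacity assessment:

```python
# Sketch of the rule of thumb above: Azure Synapse for large (>1 TB)
# OLAP workloads with few users, Azure SQL Database for OLTP workloads
# up to ~1 TB with many concurrent users. Illustration only; this is
# not an Azure API, and edge cases need a real capacity assessment.
def pick_data_store(data_tb: float, workload: str) -> str:
    if workload == "olap" or data_tb > 1:
        return "Azure Synapse"
    return "Azure SQL Database"
```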

Still have questions? Feel free to reach out:

Do you want to know more about our data management best practices, projects, and initiatives? Visit our Data Management Hub to learn more.