Wednesday, November 10, 2021

Power BI performance factors: what impacts report performance?



There is little more frustrating than a slow-loading Power BI report. When you’re working with billions of rows and columns, it can feel like improving performance is impossible. At MAQ Software, we’ve worked with over 8,000 Power BI reports across a number of industries. In our experience, it is never impossible to improve your report performance. In fact, our goal for all report pages is to load within 8 seconds (at most).

So, how do we do it? By taking a structured approach. The first step is identifying the main areas that impact report performance. After all, to diagnose the problem, you need to know where to look. Typically, there are four major factors that affect Power BI performance: the data set, data sources, report design, and network issues.

The Data Set


Performance win: Reduce your data set size


The size and characteristics of your data set can drastically impact your final dashboard. You should ask yourself questions like what can you consolidate or eliminate? If you’re working with a lot of rows, do you need them all? What information is your business audience actually using?

Taking a good hard look at your data set size doesn’t mean you can’t work with big data sets – Power BI is absolutely designed to handle large volumes of real-time data. It’s about carefully identifying what you can keep and what needs to go. A few of the most common performance detractors within the data set include:

    Whitespace
    Null values
    High column cardinality (i.e., columns with values that are very uncommon or unique, such as user names or user IDs)

Performance win: Optimize your data set model


You’re also going to want to look at your data model. When it comes to optimizing report performance, reducing the size of your model offers the best possible return on investment. The smaller your model, the faster it will run in the report. While different data sets require different models, there are a couple best practices you should follow that provide quick wins for performance:

1.    Use star schema

Star schema is, by far, the best model to use in Power BI. In the star schema, dimension tables align with fact tables (giving it its eponymous star shape). Its alternative, the snowflake schema, uses subdivisions that represent an additional join in your queries. In Power BI, joins translate to slow loading. The fewer you have, the better.

2.    Turn off Auto Date/Time

Auto Date/Time enables users to easily drill down into calendar time periods without having to implement a date dimension. However, this means that for every date column in your data, there is a hidden date/time table in your model. In large data sets, this adds up; your data model could end up massive and sluggish.

3.    Summarize metrics (where possible)

Your raw data may pull information for daily, or even hourly sales, but do your end users actually need reporting at this level? If they only need to know overall monthly sales, you can significantly reduce your model size just by summarizing data by month rather than hour.

4.    Select the right dataset model for your data

In general, the Import model offers the fastest performance thanks to its use of in-memory querying. The Import model imports data to a stored disk, so its query results are extremely fast (as long as the data is fully loaded into the Power BI memory). However, data models are not one-size-fits-all. If you need to work with data volumes that are too large to load complete into the model, or need to deliver real-time data, you should consider using DirectQuery.

Performance win: Optimize your measures


The efficiency of your DAX directly impacts the amount of time it takes a query to render data in a chart. Your best bet? Follow DAX best practices. Some quick wins you can implement today include:

    Reducing the number of operations within your DAX

    Before: Max Value:=IF(A>B, 1.1 * A, 1.1 * B)
    After: Max Value:=1.1 * MAX(A,B)

    Avoiding both-directional relationships in the data model (where both tables in a relationship cross-filter each other)
    Moving row-level logic to Power Query (using M to calculate instead of DAX)
    Avoiding floating point data types
    Using Divide instead of \
   Creating a flag in the table instead of having multiple values in a single IN clause (we improved performance by ~2 seconds with this change)

    Before: Measure 1:= IF (Status IN { "Open", "Closed", "In-Progress"},[Actual], [Target])
    After: Measure 1:=IF (Status = 1,[Actual], [Target])

Data Sources


Performance win: Consider the cloud


The type of data source you connect to your reports affects report performance. One of today’s hottest topics, especially with the advent of hybrid work, is the cloud. More and more businesses are relying on cloud reporting to share insights across the world. While each organization has to customize their system to fit their needs, we’ve seen some incredible performance wins through cloud-based reporting. In one scenario, we reduced a client’s data processing time from half an hour to two minutes by migrating them to the cloud.

Performance win: Track your actual needs instead of your assumed needs


Ask yourself:

    Are you using a Tabular model or cube and why?
    Do you have Geo replication enabled (and do you need to)?
    Are you using load balancer (and should you be)?
    Is the data configuration correct?

One of the most important things to consider is the end user’s experience. You don’t necessarily need everything to load fast. You just need to ensure users can quickly access the information they regularly rely upon.

For example: Power BI maintains a cache for dashboard tiles. Pulling data from the cache is faster and more reliable than querying the data source itself. If your users primarily need at-a-glance information, you can make your dashboards the user landing page and pin the most-used visuals. This way, you’ll deliver better user experience at a fraction of the performance cost.

Report Design


Performance win: Filter your data


We all know that report design impacts user experience, but it also has a noticeable effect on report performance. After all, the more data each visual needs to display, the slower the visual will load. Design-wise, there are a couple of big-ticket items to watch out for.

Avoid using unfiltered data. Usually, users don’t need every single row and column of every table every time they open a report. Use Top N filter to reduce the maximum number of items displayed in the table. This reduces the load on the report, improving performance.

You should also be careful when it comes to slicers. Slicers are a great way to help users navigate data, but they tank your report performance. This is because slicers always generate two queries: one to fetch data and one to fetch selection details. If you absolutely need to include slicers, use the Filter pane to evaluate which slicers are used most often, and implement only those.

Performance win: Limit your visuals


Using too many visuals in a single report turns report performance into a slog (and makes your reports difficult to read). Be mindful about which visuals you implement. In general, you should use the following Power BI performance guidelines:

    Maximum number of widgets: 8
    Maximum number of grids: 1
    Maximum number of tiles: 10

Not all visual types perform the same. Grids, for example, are a massive drain on resources, while cards are a much more efficient information delivery system. To optimize the performance of your reports, you should limit each report page to a maximum of 30 total points using the following scoring system:
Using too many visuals in a single report turns report performance into a slog (and makes your reports difficult to read). Be mindful about which visuals you implement. In general, you should use the following Power BI performance guidelines:

    Cards: 1 point each
    Gauges: 2 points each
    Charts: 3 points each
    Maps: 3 points each
    Grids: 5 points each

Some visuals are also more efficient than others. Out-of-the-box visuals are traditionally faster than custom ones, as they’ve been vetted and created by the Power BI team. However, custom visuals have their merits, especially if you’re looking for something niche. When using custom visuals, prioritize visuals that have been certified. Certified visuals have the yellow checkmark next to them on AppSource, which means they have been certified by the Power BI team for performance and security.

Performance win: Limit interactivity


A final design element to watch out for is interactivity. The interactivity of your report is going to impact performance. The more interactive, the slower the report, as Power BI needs to process several requests before displaying the final result. By default, all visuals on a report page are set to interact with one another. Usually, this level of interactivity isn’t needed for end users, and results in several unnecessary queries in the back end. By reducing interactivity to only the scenarios needed by users, you can drastically improve report performance.

Network Issues


Of course, the final reason why your reports may be failing or loading slowly is your actual network. If this is an immediate issue (like your entire family arriving home and jumping onto the Wi-Fi at the exact same time), then you can wait a while and try again, or consider finding a location with better access. If you’re using a cloud-based Power BI report, you may also be experiencing network latency because of client machine resources or noisy neighbors.

Performance win: Make sure your report regions align


There are some network latency issues that you unfortunately don’t have much control over. Luckily, there are also several issues you can immediately address. Network latency affects the time it takes for requests to go back and forth from the Power BI service. Different tenants in Power BI are assigned to different regions. Ideally, you want your reports, tenant, and data sources in the same region. This reduces network latency by increasing the speed of data transfer and query execution.

Performance win: Configure your Power BI workloads


Network latency may be a result of unoptimized Power BI capacity settings. Optimize your capacity settings to your actual usage metrics, identifying when you should invest in advanced Power BI capacities such as Power BI Premium, Premium Per User (PPU), and Power BI Embedded. Overinvesting in Power BI can result in wasted expenses, but underinvesting can hamper the performance of key reports and dashboards.

Performance win: Manage your gateways


Whenever Power BI needs to access data that isn’t accessible over the Internet, it uses a gateway. Depending on your workload and gateway size, you need to evaluate whether you want to install an on-premises/enterprise data gateway, personal gateway, or VM-hosted infrastructure-as-a-service.

As a rule of thumb, if you’re working with larger databases, it’s better to go with an enterprise gateway rather than personal gateway. Enterprise gateways import no data into Power BI, making them more efficient for big data. You can also create a data cluster for any high-demand queries. This enables you to effectively load balance your gateway traffic and avoid single points of failure.

Finally, make sure you use separate gateways for Power BI service live connections and scheduled data refresh. If you’re using a single gateway for both, your live connection performance will suffer during the scheduled refresh.

***

Okay, so now you know how to diagnose the problem. You might be asking yourself: what comes next? If you want to learn more about specific steps you can take to improve your Power BI setup, check out our Power BI best practices guide. If optimizing your Power BI all on your own sounds daunting, get in touch with us at Sales@MAQSoftware.com. We'd be happy to help!

Up Next