Monday, December 21, 2020

How to Use Azure Services: Product Comparison


With over 50 Azure services out there, deciding which service is right for your project can be challenging. Weighing the pros and cons of each option for numerous business requirements is a recipe for decision paralysis. When it comes to optimizing your Azure architecture, picking the right tool for the job is key.

Over the last decade, MAQ Software has migrated hundreds of clients to Azure. Along the way, we’ve picked up Azure tips and tricks that enable us to develop systems that are faster, more reliable, and operate at a lower cost. In this article, we’ll explore the differences between Azure services so you can pick the one that’s right for you.

Which Azure cloud storage service should I use?

Azure Data Lake Storage (ADLS) Gen 1 vs. ADLS Gen 2 vs. Blob Storage


When to use Blob storage instead of ADLS Gen 2: When your project involves minimal transactions, use Blob storage to minimize infrastructure costs. For example: production backups.

When to use ADLS Gen 2 instead of Blob storage: When your project has multiple developers or involves analytics, and you need to securely share data with external users, use ADLS to leverage Azure Active Directory authentication. This prevents unauthorized users from accessing sensitive data. For example: global sales dashboards.

When to use ADLS Gen 1 instead of ADLS Gen 2: When your project executes U-SQL within Azure Data Lake Analytics (ADLA) on top of storage services, use ADLS Gen 1. This is useful for projects that need analytics and storage performed from a single platform. For example: low-budget implementations.
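The three storage rules above can be read as a small decision function. The sketch below is only an illustration: the function name, the two flags, and the order in which the rules are checked are assumptions made for this example and are not part of any Azure SDK.

```python
def choose_storage_service(
    runs_usql_on_adla: bool,
    needs_aad_secured_sharing: bool,
) -> str:
    """Return a storage service name following the rules of thumb above.

    Both parameters are hypothetical flags describing the project.
    """
    # U-SQL in Azure Data Lake Analytics is the one case that calls for Gen 1.
    if runs_usql_on_adla:
        return "ADLS Gen 1"
    # Multiple developers or analytics with secure external sharing: Gen 2.
    if needs_aad_secured_sharing:
        return "ADLS Gen 2"
    # Minimal transactions (for example, production backups): Blob storage.
    return "Blob storage"


# Usage: a production backup job with no analytics or sharing needs.
print(choose_storage_service(runs_usql_on_adla=False, needs_aad_secured_sharing=False))
```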

When to use geo-replication: Geo-replication is available on all Azure cloud storage services. As a rule of thumb, avoid geo-replicating the development environment to keep infrastructure costs down. Only implement geo-replication for production.
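One way to apply this rule of thumb is to derive the redundancy setting from the environment name instead of choosing it by hand. In the minimal sketch below, Standard_LRS (locally redundant) and Standard_GRS (geo-redundant) are storage account SKU names; the environment names and the helper itself are assumptions for illustration.

```python
# Illustrative mapping: only production is geo-replicated.
REDUNDANCY_BY_ENVIRONMENT = {
    "dev": "Standard_LRS",   # locally redundant, lowest cost
    "uat": "Standard_LRS",
    "prod": "Standard_GRS",  # geo-redundant
}


def redundancy_sku(environment: str) -> str:
    """Return the redundancy SKU to request for a (hypothetical) environment name."""
    key = environment.strip().lower()
    if key not in REDUNDANCY_BY_ENVIRONMENT:
        raise ValueError(f"unknown environment: {environment!r}")
    return REDUNDANCY_BY_ENVIRONMENT[key]


# Usage: pick the SKU when provisioning each environment.
for env in ("dev", "prod"):
    print(env, redundancy_sku(env))
```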

Which Blob storage access tier to use: Picking the optimal access tier is important to achieve your desired performance at minimal storage costs:

Tier | Key differentiator | When to use it
Hot | Optimized for frequent access to objects in the storage account | Projects that require daily refresh and frequent data transactions. Example: cloud data with daily refresh
Cool | Optimized for storing large volumes of data that are infrequently accessed and stored for at least 30 days | Projects that require monthly refresh with limited transactions. Example: monthly snapshots
Archive | Optimized for storing large volumes of data that are infrequently accessed and stored for at least 180 days | Projects that require static data, snapshots, and/or yearly refresh with almost no transactions. Example: yearly snapshots

Table 1: Blob Storage Access Tier Comparison
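As a rough illustration of the table, the helper below picks a tier from how often the data is expected to be accessed. Reusing the 30-day and 180-day minimum storage periods as access-frequency cutoffs is an assumption made for this sketch; the function and parameter names are hypothetical.

```python
def choose_access_tier(days_between_accesses: int) -> str:
    """Suggest a Blob storage access tier from the expected access interval.

    The cutoffs reuse the minimum storage periods from the table
    (30 days for Cool, 180 days for Archive) as a simplifying assumption.
    """
    if days_between_accesses < 0:
        raise ValueError("days_between_accesses must be non-negative")
    if days_between_accesses < 30:
        return "Hot"       # for example, data with daily refresh
    if days_between_accesses < 180:
        return "Cool"      # for example, monthly snapshots
    return "Archive"       # for example, yearly snapshots


# Usage: daily, monthly, and yearly refresh patterns.
for interval in (1, 30, 365):
    print(interval, choose_access_tier(interval))
```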

Which Azure cloud processing service should I use?

Azure Databricks vs. Azure Synapse Analytics


When to use Azure Databricks: When your project involves deep learning models or real-time transformations, use Azure Databricks:

    Deep learning models: Azure Databricks reduces ML execution time by optimizing code and using some of the most popular libraries (e.g., TensorFlow, PyTorch, Keras) and GPU-enabled clusters.

    Real-time transformations: Databricks Runtime supports Spark Structured Streaming and Auto Loader, so it can process streaming data such as Twitter feeds.

When to use Azure Synapse Analytics: When your project involves SQL analyses and data warehousing, or reporting and self-service BI, and you need to process large volumes of data without an ML model, use Azure Synapse Analytics:

    SQL analyses and data warehousing: Synapse is a developer-friendly environment, supporting full relational data models and stored procedures, and providing full standard T-SQL features. When migrating data from on-premises to the cloud, use Synapse for a seamless transition.

    Reporting and self-service BI: Power BI is integrated directly into Synapse Studio, which reduces data latency.
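The use cases listed for each service can be summarized as a simple lookup. This is a sketch under stated assumptions: the Workload class, its field names, and the choice to check the Databricks use cases first are all hypothetical and only mirror the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what a project needs from its processing layer."""
    uses_deep_learning: bool = False
    needs_streaming_transforms: bool = False
    needs_sql_warehousing: bool = False
    needs_self_service_bi: bool = False


def choose_processing_service(workload: Workload) -> str:
    # Deep learning models or real-time transformations point to Databricks.
    if workload.uses_deep_learning or workload.needs_streaming_transforms:
        return "Azure Databricks"
    # SQL analyses, data warehousing, reporting, or BI point to Synapse.
    if workload.needs_sql_warehousing or workload.needs_self_service_bi:
        return "Azure Synapse Analytics"
    # None of the listed use cases apply, so the rules above do not decide.
    return "undecided"


# Usage: a reporting project with no ML model.
print(choose_processing_service(Workload(needs_self_service_bi=True)))
```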

Which Azure Databricks pricing tier to pick: Choosing the right pricing tier is important to reduce infrastructure costs while achieving the desired performance:

Pricing tier | Key differentiator | When to use it
Premium | Supports role-based access to notebooks, clusters, and jobs | Projects that involve multiple stakeholders in a shared development environment. Example: one platform accessed by multiple vendors
Standard | Costs ~20% less than the premium tier | Projects that involve a single stakeholder and limited development teams. Example: one platform run and developed by a single vendor

Table 2: Databricks Pricing Tier Comparison

Which Azure Databricks workload to use: Choosing the right workload type is important to ensure you achieve the desired performance at a reduced cost:

Workload | Key differentiator | When to use it
Data Analytics | More flexible, as it supports interactive clusters and an analytics ecosystem | Suited for dev environments, which require higher collaboration and multi-user sharing
Data Engineering | Costs ~30% less than Data Analytics | Suited for UAT/prod environments, which generally require less collaboration and can be run by a single developer

Table 3: Databricks Workload Comparison
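To see how the tier and workload choices interact, the sketch below combines the approximate figures from the two comparisons (Standard ~20% below Premium, Data Engineering ~30% below Data Analytics) into a relative cost. Treating the two discounts as multiplicative is an assumption made for this example, and the dictionary keys are hypothetical labels; actual Databricks pricing may not combine this way.

```python
# Approximate factors taken from the comparisons above (baseline = 1.0).
TIER_FACTOR = {"premium": 1.0, "standard": 0.8}
WORKLOAD_FACTOR = {"data_analytics": 1.0, "data_engineering": 0.7}


def relative_cost(tier: str, workload: str) -> float:
    """Estimate cost relative to Premium + Data Analytics.

    Assumes the tier and workload discounts multiply, which is a
    simplification and not a statement about real pricing.
    """
    return TIER_FACTOR[tier] * WORKLOAD_FACTOR[workload]


# Usage: compare a shared dev setup with a single-vendor prod setup.
dev = relative_cost("premium", "data_analytics")
prod = relative_cost("standard", "data_engineering")
print(f"dev: {dev:.2f}, prod: {prod:.2f}")
```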

How to choose Databricks cluster size: For development, start with a small Databricks cluster (e.g., general-purpose nodes) and enable autoscaling to optimize costs. Move to larger clusters only when specific needs call for it.
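A small autoscaling development cluster could be described with a specification like the one built below. This is a minimal sketch: the field names follow the Databricks Clusters API as commonly documented, while the runtime version, node type, worker counts, and auto-termination value are illustrative assumptions to adjust for your workspace.

```python
import json


def dev_cluster_spec(name: str, min_workers: int = 1, max_workers: int = 4) -> dict:
    """Build a cluster specification for a small autoscaling dev cluster."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "7.3.x-scala2.12",  # illustrative runtime version
        "node_type_id": "Standard_DS3_v2",   # illustrative general-purpose VM size
        # Autoscaling lets the cluster grow only when the workload needs it.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Shut down idle dev clusters to avoid paying for unused compute.
        "autotermination_minutes": 30,
    }


# Usage: print the JSON payload that would describe the cluster.
print(json.dumps(dev_cluster_spec("dev-cluster"), indent=2))
```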

Which Azure service should I use to expose data?

Azure SQL Database vs. Azure Synapse (formerly Azure SQL Data Warehouse)


When to use Azure Synapse: When projects deal with large volumes of data (>1 terabyte) and a small number of users, or use OLAP data. For example: global sales data.

When to use Azure SQL Database: When projects work with real-time data (max 1 terabyte) and many users, or use OLTP data (application database). For example: ATM transactions.
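The two rules can be sketched as a helper that looks at the workload type first and falls back to data volume. The 1-terabyte cutoff comes from the guidance above; the function name, parameters, and the decision to let the workload type take precedence over size are assumptions for illustration.

```python
from typing import Optional


def choose_serving_store(data_size_tb: float, workload: Optional[str] = None) -> str:
    """Suggest a service for exposing data.

    workload is an optional hypothetical label, "OLAP" or "OLTP".
    When it is given it decides the result; otherwise the 1 TB
    rule of thumb from the text is used.
    """
    if data_size_tb < 0:
        raise ValueError("data_size_tb must be non-negative")
    if workload is not None:
        kind = workload.strip().upper()
        if kind == "OLAP":
            return "Azure Synapse"
        if kind == "OLTP":
            return "Azure SQL Database"
        raise ValueError(f"unknown workload: {workload!r}")
    # No workload label: large volumes go to Synapse, the rest to SQL Database.
    return "Azure Synapse" if data_size_tb > 1.0 else "Azure SQL Database"


# Usage: an application database versus a large sales data set.
print(choose_serving_store(0.2, "OLTP"))
print(choose_serving_store(5.0))
```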


Still have questions? Feel free to reach out: Support@MAQSoftware.com

Do you want to know more about our data management best practices, projects, and initiatives? Visit our Data Management Hub to learn more.