Ingesting Snowflake data into Microsoft Fabric is supported as a 1st-party integration by Microsoft, primarily in two ways:

  1. As batch processes that can be scheduled, for example with Fabric Data Pipelines.
  2. In near real-time, using change data capture-backed Mirroring

In this post, we take a look at setting up Mirroring to replicate data from Snowflake into a Fabric Data Warehouse.

💡
To read a about using Data Pipelines in Fabric to ingest Snowflake data, check out this related post.

Mirroring Advantages

The primary advantages of the mirroring approach:

  1. Data is mirrored nearly in real-time, making Mirroring an ideal solution for solutions where up-to-the-minute data is needed to enable decisions.
  2. Relatively easy to set-up. Since Mirroring simply copies entire Snowflake tables (or optionally all tables in a Snowflake database), the setup is mostly point-and-click.

Mirroring Disadvantages

There are some scenarios where mirroring may not be the best data integration choice:

  1. Mirroring requires Snowflake compute CPU cycles to run and could lead to a higher Snowflake compute cost. This is expected, since the processes that detect, query and forward data from Snowflake require compute cycles.
  2. Mirroring isn't designed to incorporate transformation as would be part of an ETL/ELT pattern. Data is mirrored as-is, so if data transformation during loading is desired, a data pipeline may be a better approach.
💡
Note: Mirroring has little or no cost implications on the Microsoft Fabric side. Microsoft doesn't charge for Mirroring compute or storage (there is a limit to free storage that varies by SKU, but generally storage is intended to be without charge within reason).

Video Walk Through

I highly encourage watching the companion YouTube Video for this post. In 6 minutes, it covers the entire configuration and test of mirroring. However, a text version follows the embedded video link.

Enabling The Mirroring Feature

Before creating a Database Mirror, the Fabric Mirror feature needs to be enabled in the Fabric Admin Console.

Enable Mirroring

Create a Mirror

With the Mirroring feature enabled in the Admin console, we can create a new mirror, using the icons added to the Data Warehouse workload landing page.

Create a Mirror Database

Configure the Snowflake Connection

To replicate data from Snowflake, Fabric needs to use a connection which provides connection details and authentication.

In this example, I've used a connection created in a previous video, which I'll embed here in case you'd like to see how these connections are created in Fabric:

How to create a Snowflake connection from Fabric

Select Source Table(s)

After selecting the Snowflake connection to use, we can select either an entire database or specific tables from Snowflake to mirror to the Fabric Data Warehouse.

Select Mirrored Tables

Monitor the Mirror

Once the Mirror database is created, the Mirror runs in the background. As changes are made to any of the remote Snowflake tables selected for the mirror (or all tables, if the entire database is selected), net changes are forwarded to Fabric to keep the Fabric data copy in sync.

Monitoring the Snowflake Mirror

Inspect the Mirrored Tables

Once the mirrored tables are in the Fabric Data Warehouse, they can be queried as usual via the SQL Analytics Endpoint.

Inspecting Mirrored Tables

Mirroring Changes to Snowflake

In the preceding image, the highest reseller key is 702 - "Able Baker". If we return to Snowflake and insert a new row, we'll create a new row, in this case 703 - "Charlie Delta".

New row created in Snowflake

Then, returning to the Fabric Data Warehouse, we can see that the new row was replicated (mirrored) to Fabric almost immediately!

New row replicated to Fabric Data Warehouse Mirror

Summary

Snowflake Mirroring provides a way to replicate data in near real-time from a Snowflake data warehouse to a Fabric Data Warehouse.

The Mirrored data in Fabric is stored in Delta Parquet tables, and can be accessed by any Fabric technology, and can be queried via the mirrored data warehouse SQL Analytics endpoint.

Mirroring is included in all Fabric SKUs, and Microsoft doesn't charge for compute or storage (within reasonable limits) for mirrored data. However, mirroring may have an impact on Snowflake compute charges.

However, if near real-time updates in Fabric are desired, Mirroring is a straightforward and relatively easy to implement Fabric feature.