Skip to content

On-premises Data Solutions to Snowflake on Azure

This “end 2 end”, and step by step “quick” Demo should show you:

  • how easy
  • how fast
  • without writing any line of code
  • how secure

it can be to connect an existing on-premises data source to a modern Cloud Data Warehouse Solution such as Snowflake is. On-premises Data Solutions to Snowflake on Azure. Furthermore it demonstrate that Snowflake on Azure is a great combination. Using Azure Data Factory, Azure Functions and Microsoft Power BI among other well known connected Services in the Azure Cloud Ecosystem, including Single Sign on with Azure Active directory. It shows also that it is possible to build a Hyper Scale Enterprise Data Analytics Solution with a Dual Vendor Strategy.

Azure Data Factory can perform both, ELT and ETL Tasks. Currently there is no native Snowflake Connector public available in ADF V2. However, using the Snowflake .net Driver https://docs.snowflake.net/manuals/user-guide/dotnet-driver.html several options for workloads with Azure Functions and Custom Activities together with Azure Data Factory Pipelines are possible.

Scenarios are depending on many factors and circumstances. In our case, the sources are mostly relational database management systems. For several reasons we do not want to extract data from our source system to a csv file. And then transfer the CSV to a cloud blob storage and load them into Snowflake with native connectors. However, this would be also a very valid and common scenario.

We want to go for a scenario were we have directly linked both databases together and use Azure Data Factory for loading the data directly from the source Tables into Snowflake Tables. After loading the data we want to use Microsoft Power BI to analyze the data with a Live connect to that Analytics Model. This has the advantage that we do not need to load the data out of Snowflake again, it is just direct connect to the data warehouse.

Used Tools and Services in that Demo:

  • SQL Server 2017 Database on-premises
  • Azure Runtime Integration on-premises
  • Azure Data Factory V2 Pipeline (Integration Runtime configured)
  • Snowflake Data warehouse
  • Power BI

Background: Architecture on Azure: How we build it:

https://www.snowflake.com/blog/how-we-built-snowflake-on-azure/

Install Snowflake ODBC Driver

  1. https://sfc-repo.snowflakecomputing.com/odbc/win64/2.19.0/index.html

Open ODBC Data Sources (64-bit) and Create  a New Data Source

  • Enter the Details into the Snowflake Connection Dialog

Install and Configure Azure Integration Runtime

  • Create and Configure a Integration Runtime in an Azure Data Factory V2
  • Install the Azure Integration runtime on-premises
  • Register the Runtime and enter Authentication Key from the ADF V2 Runtime created earlier

Create a New ODBC Linked Service in Azure Data Factory

  • Configure the ODBC Linked Service by entering the Data Source Name created earlier, User Name and Password
  • Test the Connection from Azure Data Factory

Create a Table in Snowflake

I do not care about data types in this quick demo.

Grant proper Access Rights in case, for this Demo we do it quick which is normally not recommended:

Copy Data Wizard from Azure Data Factory V2

  • Source SQL Server 2017 on Premises

Create a New Linked Service to SQL Server on-Premises. Any linked Service which supports Azure integration runtime is available here, for example, Oracle, Sap, MySQL, Teradata and many more. All you can connect easily to Snowflake with a direct load through Azure Data Factory.

Enter the Server Name which will be reached through my previously configured Azure Integration runtime and get authenticated. Test the connection.

Select now the recently configured on premises Data Source.

Select your data tables from your on-premises Data Source

Select as Destination the Snowflake linked Service earlier created

Map the selected tables

Map Columns between the tables

Settings

Summary

Let the Pipeline Run

Connect with Power BI Direct Query

Conclusion

  • Connect fast and easy to almost any on-premises data source. Without writing any line of code,  zero coding.
  • While the Warehouse size seems not have to much impact on ETL/ ELT workloads on Reporting Workloads it has. This means the concept of using different Warehouses sizes for different workloads or user groups does absolutely make sense.
  • Currently there is no native Snowflake Connector in ADF V2 available but there is a .net driver which can be used to create your own
  • It shows that Snowflake on Azure is a great combination. Using Azure Data Factory, Azure Functions, Microsoft Power BI among other well known connected Services in the Azure Cloud Ecosystem, including Single Sign on with Azure Active directory. If shows also that it is possible to build a Hyper Scale Enterprise Data Analytics Solution with a Dual Vendor Strategy.
  • Larger amounts of data could take longer to transfer, depending on many factors.

Single Sing on

https://docs.microsoft.com/en-us/azure/active-directory/saas-apps/snowflake-tutorial

Read more

Views: 523