In my first data project I used GitHub workflows in combination with Python modules to extract, transform and load (ETL) multiple data types from multiple data sources.
When Microsoft announced the capability to run Apache Airflow DAGs (Directed Acyclic Graphs) within Azure Data Factory, I thought: why not re-use the Python modules and see if I can get them to work in a Managed Airflow Integration Runtime within Azure Data Factory?
I am currently working on this GitHub project, which is not finished yet. Once it is finished (enough) I will write a blog post about the full project as well, but for now I wanted to do a quick write-up about deploying the Managed Airflow Integration Runtime within Azure Data Factory.
For my ADF + Managed Airflow project I will set up the infrastructure via Infrastructure as Code (IaC). I am using Terraform, but I also want to provide the ARM and Bicep templates as samples.
With the Terraform AzureRM provider we can deploy Azure Data Factory, but you won’t find a resource (yet) for the Managed Airflow Integration Runtime. While going through the Azure documentation I did not find anything on how to deploy it using IaC either.
So I used the Terraform AzApi provider, which can be used for Azure resources that are not yet supported in the AzureRM provider, for example resources or features that are in private or public preview.
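As a minimal sketch of how the two providers can sit side by side in one configuration: AzureRM handles the resources it already supports, and AzApi fills the gaps. The version constraints below are assumptions for illustration, not values taken from the project, so pin whatever versions you have actually tested.

```hcl
terraform {
  required_providers {
    # AzureRM covers the supported resources, such as the Data Factory itself.
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumed constraint
    }
    # AzApi sends requests to the Azure Resource Manager API directly,
    # so it can manage resource types AzureRM does not know about yet.
    azapi = {
      source  = "Azure/azapi"
      version = "~> 1.0" # assumed constraint
    }
  }
}

provider "azurerm" {
  features {}
}

provider "azapi" {}
```

Both providers can share the same Azure authentication (for example an Azure CLI login), so in a simple setup no extra credentials are needed for AzApi.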
Terraform example template
In the snippet below you see the full sample template that you can use to deploy an Azure Data Factory instance together with a Managed Airflow Integration Runtime.
The body is a JSON object that contains the request body used to create or update the Azure resource. To find the right properties, I manually created the Airflow instance in ADF and then exported the ARM template to check the various properties, which I have added to this sample template.
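To make the shape of such a body concrete, here is a trimmed-down sketch rather than the full sample. It assumes the azurerm and azapi providers are already declared in required_providers, and that an azapi 1.x release is used, where body is passed as a jsonencode string. All resource names, the location, and every property under typeProperties are illustrative assumptions modelled on what an exported ARM template tends to contain, so check them against your own export before using them.

```hcl
# Assumes azurerm and azapi (source "Azure/azapi") are declared in required_providers.
# All names and values below are placeholders.

resource "azurerm_resource_group" "example" {
  name     = "rg-adf-airflow-example"
  location = "West Europe"
}

resource "azurerm_data_factory" "example" {
  name                = "adf-airflow-example" # must be globally unique
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azapi_resource" "airflow" {
  # Child resource of the Data Factory: <resource type>@<API version>
  type      = "Microsoft.DataFactory/factories/integrationruntimes@2018-06-01"
  name      = "airflow-example"
  parent_id = azurerm_data_factory.example.id

  # Assumption: the Airflow runtime type may be missing from the provider's
  # embedded schema, so schema validation is switched off here.
  schema_validation_enabled = false

  # Request body; the property names are assumed from an exported ARM template.
  body = jsonencode({
    properties = {
      type = "Airflow"
      typeProperties = {
        computeProperties = {
          location    = "West Europe"
          computeSize = "Small"
          extraNodes  = 0
        }
        airflowProperties = {
          airflowVersion           = "2.4.3"
          enableAADIntegration     = true
          environmentVariables     = {}
          airflowRequiredArguments = []
        }
      }
    }
  })
}
```

As far as I know, newer AzApi releases accept body as a plain HCL object instead of a jsonencode string, so adjust this to the provider version you are running.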
I have added the example to the Azure Terraform AzApi provider repository. Please contribute to the repository above if you have any updates!