Airflow
Airflow is a platform to programmatically author, schedule, and monitor workflows.
Source: Airflow
Airflow can be used to:
- run tasks on a regular schedule
- run memory-intensive workloads
- run end-to-end processing workflows involving multiple steps and dependencies
- monitor the performance of workflows and identify issues
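To make the "multiple steps and dependencies" point concrete, here is a minimal sketch of an Airflow DAG with three chained tasks. This is an illustrative example, not a pipeline from the repo: the `dag_id`, task names, callables, and daily schedule are all placeholders, and the `schedule` parameter assumes Airflow 2.4+ (older 2.x versions use `schedule_interval`).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Illustrative task callables; a real pipeline would do its
# extraction/processing work here.
def extract():
    print("extracting data")

def transform():
    print("transforming data")

def load():
    print("loading data")

# A DAG that runs once per day; dag_id and schedule are placeholders.
with DAG(
    dag_id="example_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)

    # Declare dependencies: extract -> transform -> load
    t1 >> t2 >> t3
```

Once a file like this is merged into the Airflow repo, the DAG appears in the Airflow UI, where each run and each task's status can be monitored.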
Important links
Airflow dev UI: for running and monitoring development and training workflows
Airflow prod UI: for running and monitoring production workflows
Airflow Repo: GitHub repository storing Airflow DAGs and roles
Airflow template for Python: GitHub template repository for creating a Python image to run an Airflow pipeline
Airflow template for R: GitHub template repository for creating an R image to run an Airflow pipeline
Airflow Pipeline Instructions: step-by-step guide to creating an example Airflow pipeline and its related resources
Support: contact the Data Engineering team on the #ask-data-engineering Slack channel