Orchestration

This article explains how to configure orchestration and monitor data refreshes.

Managing data workflows and keeping refreshes on schedule is difficult without an orchestration tool. FastLake addresses this by integrating with Apache Airflow, which provides orchestration for Lakehouse environments.

Orchestration with Airflow

The platform uses Airflow to orchestrate data pipelines. DAGs (Directed Acyclic Graphs) are generated and deployed automatically whenever changes are published in the Lakehouse. Users can start a DAG run either by clicking the "Run DAG" button in the user interface or by opening the Airflow UI directly.
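For scripted runs, a stock Airflow 2.x deployment also offers a stable REST API with a `POST /api/v1/dags/{dag_id}/dagRuns` endpoint. The sketch below only builds such a request and does not send it. Whether a FastLake deployment exposes this API is an assumption, as are the host `airflow.example.com` and the DAG id `lakehouse_refresh`; authentication is left out because it depends on how Airflow is configured.

```python
import json
from typing import Optional
from urllib.request import Request


def build_trigger_request(base_url: str, dag_id: str, conf: Optional[dict] = None) -> Request:
    """Build (but do not send) a POST request that would trigger one DAG run."""
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # Airflow accepts an optional "conf" object that is passed to the run.
    body = json.dumps({"conf": conf or {}}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and DAG id, used only for illustration.
req = build_trigger_request("https://airflow.example.com", "lakehouse_refresh")
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of adding credentials and passing `req` to `urllib.request.urlopen`. The "Run DAG" button and the Airflow UI remain the documented ways to start a run.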

Scheduled Refresh

Data refreshes can be scheduled flexibly, with full support for Cron expressions. This gives precise control over when and how refreshes are triggered.

Refresh Modes

Cron Schedule

This mode uses a user-defined Cron expression to trigger data refreshes at the times the expression specifies.
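To make the five Cron fields concrete (minute, hour, day of month, month, day of week), here is a minimal matcher written for illustration. It is a simplified sketch, not the parser the scheduler uses: it handles only `*`, single numbers, lists, ranges, and steps, and it does not accept month or weekday names or `7` for Sunday. The function names are hypothetical.

```python
from datetime import datetime

# Allowed value range for each of the five standard Cron fields.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, low: int, high: int) -> set:
    """Expand one Cron field ('*', '5', '1-5', '*/15', '0,30') into a set of ints."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = int(rng)
            # "5/10" means "from 5 to the top of the range, every 10".
            end = high if step else start
        values.update(range(start, end + 1, int(step) if step else 1))
    return values


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by the expression."""
    minute, hour, dom, month, dow = expr.split()
    fields = [minute, hour, dom, month, dow]
    sets = [expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)]
    cron_dow = (when.weekday() + 1) % 7  # Python: Monday=0; Cron: Sunday=0
    time_ok = when.minute in sets[0] and when.hour in sets[1] and when.month in sets[3]
    dom_ok = when.day in sets[2]
    dow_ok = cron_dow in sets[4]
    # Classic Cron rule: if both day fields are restricted, either may match.
    if dom != "*" and dow != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return time_ok and day_ok


# "0 6 * * 1-5" selects 06:00 on Monday through Friday.
print(cron_matches("0 6 * * 1-5", datetime(2024, 1, 1, 6, 0)))  # a Monday
```

For example, `*/15 * * * *` selects minutes 0, 15, 30, and 45 of every hour, while `0 6 * * 1-5` selects 06:00 on weekdays only. For edge cases, the behavior of the scheduler's own Cron parser is authoritative.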

Automatic Refresh

Automatic refresh mode triggers only the branches of the DAG that are affected when a change is detected in source data or in dataset materialization code. Its algorithms minimize unnecessary refreshes and prevent conflicts. Enable this mode to let the platform handle refreshes automatically. For more details, refer to the dedicated documentation page.
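The idea of refreshing only the affected branches can be sketched as a graph walk: start from the nodes that changed, collect everything downstream of them, and run those nodes in dependency order. This is an illustrative sketch under that assumption, not FastLake's actual algorithm; the function name and the dataset names are hypothetical.

```python
from collections import deque
from graphlib import TopologicalSorter


def branches_to_refresh(dependencies: dict, changed: set) -> list:
    """Return changed nodes plus everything downstream, in a valid run order.

    `dependencies` maps each dataset to the set of datasets it reads from.
    """
    # Invert the mapping: for each node, which nodes consume it?
    consumers = {}
    for node, upstreams in dependencies.items():
        for up in upstreams:
            consumers.setdefault(up, set()).add(node)

    # Breadth-first walk downstream from every changed node.
    affected = set(changed)
    queue = deque(changed)
    while queue:
        for child in consumers.get(queue.popleft(), ()):
            if child not in affected:
                affected.add(child)
                queue.append(child)

    # Keep only affected nodes, ordered so that upstreams run first.
    order = TopologicalSorter(dependencies).static_order()
    return [n for n in order if n in affected]


# Hypothetical Lakehouse graph: two raw sources feeding three datasets.
deps = {
    "orders_clean": {"raw_orders"},
    "customers_clean": {"raw_customers"},
    "sales_report": {"orders_clean", "customers_clean"},
    "customer_segments": {"customers_clean"},
}
print(branches_to_refresh(deps, {"raw_orders"}))
```

In this example a change in `raw_orders` selects `orders_clean` and `sales_report`, while `customers_clean` and `customer_segments` are left alone because nothing upstream of them changed.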