Automation
There are several ways to automate assets in Dagster. Dagster supports both scheduled and event-driven pipelines. In this step, you will add a schedule object to automate the assets you have created.
Similar to resources, schedules exist within the Definitions layer.
1. Scaffold a schedule definition
Cron-based schedules are common in data orchestration. They use time-based expressions to automatically trigger tasks at specified intervals, making them ideal for ETL pipelines that need to run consistently—such as hourly, daily, or monthly—to process and update data on a regular cadence. For the tutorial pipeline, you can assume that updated CSVs are uploaded at a specific time every day.
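To make the cadence concrete, here is a minimal sketch in plain Python (no Dagster involved) of when a daily-at-midnight cron expression such as 0 0 * * * would fire next. The helper name next_daily_midnights is hypothetical, and the sketch assumes naive datetimes, so it ignores time zones and daylight saving changes:

```python
from datetime import datetime, timedelta


def next_daily_midnights(now: datetime, count: int) -> list[datetime]:
    """Return the next `count` midnights strictly after `now`."""
    # Truncate to today's midnight, then step forward one day so the
    # first tick is always in the future, even if `now` is exactly midnight.
    first = now.replace(hour=0, minute=0, second=0, microsecond=0) + timedelta(days=1)
    return [first + timedelta(days=i) for i in range(count)]


if __name__ == "__main__":
    # Hypothetical "current time" chosen for illustration only.
    for tick in next_daily_midnights(datetime(2024, 1, 1, 15, 30), 3):
        print(tick.isoformat())
```

A real scheduler evaluates the full five-field cron expression; this sketch covers only the daily case used in the tutorial.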
Use the dg scaffold defs command to scaffold a new schedule object:
dg scaffold defs dagster.schedule schedules.py
Creating a component at <YOUR PATH>/dagster-tutorial/src/dagster_tutorial/defs/schedules.py.
This adds a generic schedules file to your project. The schedules.py file is now part of the dagster_tutorial module:
src
└── dagster_tutorial
    └── defs
        └── schedules.py
There is very little you need to change about the scaffolded schedule. A schedule consists of a cron_schedule and a target. By default, the cron_schedule is set to @daily and the target is set to *. You can keep the target as is, but change the cron_schedule to something more specific. The code below sets the cron expression explicitly to 0 0 * * * (midnight every day) and names the schedule by renaming the function to tutorial_schedule:
from typing import Union

import dagster as dg


@dg.schedule(cron_schedule="0 0 * * *", target="*")
def tutorial_schedule(
    context: dg.ScheduleEvaluationContext,
) -> Union[dg.RunRequest, dg.SkipReason]:
    return dg.SkipReason(
        "Skipping. Change this to return a RunRequest to launch a run."
    )
2. Enable automation
To enable automation:
- Run dg dev (if it is not already running) and navigate to the Dagster UI at http://127.0.0.1:3000.
- Navigate to Assets.
- Click Reload definitions.
- Click Automation.
- View your automation events to see if anything is ready to run.