
Autoloading existing Dagster definitions

warning

This feature is in preview and under active development. It may change significantly or be removed entirely, and it is not ready for production use.

note

This guide covers using existing Dagster definitions with a dg-compatible project. To convert an existing project to use dg, see "Converting an existing project to use dg".

In projects that heavily use dg, you would typically keep all definitions in the defs/ directory. However, if you've converted an existing project to use dg, you may have definitions located in various other modules. This guide will show you how to move these existing definitions into the defs directory in a way that will allow them to be automatically loaded.

Example project structure

Let's walk through an example of migrating your existing definitions, with a project that has the following structure:

tree
.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── jobs.py
│   ├── definitions.py
│   ├── defs
│   │   └── __init__.py
│   └── elt
│       ├── __init__.py
│       ├── assets.py
│       └── jobs.py
└── pyproject.toml

5 directories, 11 files

At the top level, we load definitions from various modules:

my_existing_project/definitions.py

import dagster_components as dg_components

import dagster as dg
import my_existing_project.defs
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)
from my_existing_project.elt import assets as elt_assets
from my_existing_project.elt.jobs import sync_tables_daily_schedule, sync_tables_job

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([elt_assets, analytics_assets]),
        jobs=[sync_tables_job, regenerate_analytics_job],
        schedules=[sync_tables_daily_schedule, regenerate_analytics_hourly_schedule],
    ),
    dg_components.load_defs(my_existing_project.defs),
)

Each of these modules contains a variety of Dagster definitions, including assets, jobs, and schedules.

Let's migrate the elt module into the defs directory.

Move definitions to defs

We'll start by moving the top-level elt module into defs/elt:

mv my_existing_project/elt my_existing_project/defs/elt

Now that our definitions are in the defs directory, we can update the root definitions.py file to no longer explicitly load the elt module's Definitions:

my_existing_project/definitions.py
import dagster_components as dg_components
import my_existing_project.defs
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)

import dagster as dg

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([analytics_assets]),
        jobs=[regenerate_analytics_job],
        schedules=[regenerate_analytics_hourly_schedule],
    ),
    dg_components.load_defs(my_existing_project.defs),
)
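
Moving a module can leave behind absolute imports that still point at the old location, for example a jobs.py that imports from my_existing_project.elt.assets. The sketch below is not part of dg. It is a small standard-library helper (the name find_stale_imports is hypothetical) that scans Python files for imports of an old module path so they can be updated to the new my_existing_project.defs.elt path:

```python
import ast
from pathlib import Path


def find_stale_imports(root: Path, old_module: str) -> list[tuple[Path, int, str]]:
    """Return (file, line number, imported name) for each import of old_module.

    Hypothetical helper, not part of dg. Only absolute imports are checked;
    relative imports keep working after a whole-directory move.
    """
    prefix = old_module + "."
    hits = []
    for path in sorted(root.rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                # Also catch `from pkg import elt`, i.e. module + "." + imported name.
                names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
            else:
                continue
            for name in names:
                if name == old_module or name.startswith(prefix):
                    hits.append((path, node.lineno, name))
                    break  # report each import statement once
    return hits
```

Calling find_stale_imports(Path("my_existing_project"), "my_existing_project.elt") after the move returns the statements that still reference the old path. An empty list suggests no absolute imports need updating.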

Our project structure now looks like this:

tree
.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── jobs.py
│   ├── definitions.py
│   └── defs
│       ├── __init__.py
│       └── elt
│           ├── __init__.py
│           ├── assets.py
│           └── jobs.py
├── pyproject.toml
└── uv.lock

5 directories, 12 files

The load_defs call in our definitions.py file automatically loads any definitions found within the defs module. This means the elt definitions are now picked up without being imported explicitly in definitions.py.
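
To build intuition for what loading definitions automatically involves, here is a conceptual sketch using only the standard library. It is not how dagster_components.load_defs is implemented. It only illustrates the general pattern of walking a package, importing each submodule, and collecting objects of interest. The function name collect_from_package and its predicate argument are hypothetical:

```python
import importlib
import pkgutil
from types import ModuleType
from typing import Any, Callable


def collect_from_package(
    package: ModuleType, predicate: Callable[[Any], bool]
) -> dict[str, Any]:
    """Import every submodule of `package` and collect attributes matching `predicate`.

    Conceptual illustration only. Keys are "module.attribute" so that
    same-named objects in different modules do not overwrite each other.
    """
    found: dict[str, Any] = {}
    modules = [package]
    # walk_packages yields every module and regular subpackage below package.__path__.
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        modules.append(importlib.import_module(info.name))
    for module in modules:
        for attr, value in vars(module).items():
            # Skip private and dunder names such as __name__ and __file__.
            if not attr.startswith("_") and predicate(value):
                found[f"{module.__name__}.{attr}"] = value
    return found
```

In this sketch, an object imported from one submodule into another is collected under both module names, so a real loader would also need to deduplicate what it finds.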

We can repeat the same process for our other modules.
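
If several modules remain, the move itself can be scripted. The following is a minimal sketch using the standard library. The helper name move_into_defs is hypothetical, and it assumes each module is a directory directly under the package root, as in the example project:

```python
import shutil
from pathlib import Path


def move_into_defs(package_root: Path, module_names: list[str]) -> list[Path]:
    """Move each named module directory from package_root into package_root/defs.

    Hypothetical helper, not part of dg. Returns the new locations.
    """
    defs_dir = package_root / "defs"
    defs_dir.mkdir(exist_ok=True)
    moved = []
    for name in module_names:
        source = package_root / name
        target = defs_dir / name
        if not source.is_dir():
            raise FileNotFoundError(f"no module directory at {source}")
        if target.exists():
            # Refuse to merge into an existing directory; resolve conflicts by hand.
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved
```

For the example project, move_into_defs(Path("my_existing_project"), ["analytics"]) would relocate the remaining module. The explicit imports of that module in definitions.py still need to be removed by hand afterwards.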

Fully migrated project structure

Once each of our definitions modules is migrated, our project is left with a standardized structure:

tree
.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── definitions.py
│   └── defs
│       ├── __init__.py
│       ├── analytics
│       │   ├── __init__.py
│       │   ├── assets.py
│       │   └── jobs.py
│       └── elt
│           ├── __init__.py
│           ├── assets.py
│           └── jobs.py
├── pyproject.toml
└── uv.lock

5 directories, 12 files

Our project root now only constructs definitions from the defs module:

my_existing_project/definitions.py
import dagster_components as dg_components
import my_existing_project.defs

defs = dg_components.load_defs(my_existing_project.defs)

We can run dg list defs to confirm that all of our definitions are being loaded correctly:

dg list defs

Assets
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓
┃ Key                ┃ Group   ┃ Deps ┃ Kinds ┃ Description ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩
│ customers_table    │ default │      │       │             │
│ my_analytics_asset │ default │      │       │             │
│ orders_table       │ default │      │       │             │
│ products_table     │ default │      │       │             │
└────────────────────┴─────────┴──────┴───────┴─────────────┘

Jobs
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Name                              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ regenerate_analytics_job_schedule │
│ sync_tables_job_schedule          │
└───────────────────────────────────┘

Schedules
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Name                              ┃ Cron schedule ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ regenerate_analytics_job_schedule │ 0 * * * *     │
│ sync_tables_job_schedule          │ 0 0 * * *     │
└───────────────────────────────────┴───────────────┘

Next steps