Converting an existing project
dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.
Suppose we have an existing Dagster project. Our project defines a Python
package with a single Dagster asset. The asset is exposed in a top-level
Definitions object in my_existing_project/definitions.py. We'll consider two
cases: one where we have been using uv with pyproject.toml, and one where we
have been using pip with setup.py.
- uv
tree
.
├── my_existing_project
│   ├── __init__.py
│   ├── assets.py
│   ├── definitions.py
│   └── py.typed
├── pyproject.toml
└── uv.lock
2 directories, 6 files
- pip
tree
.
├── my_existing_project
│   ├── __init__.py
│   ├── assets.py
│   ├── definitions.py
│   └── py.typed
└── setup.py
2 directories, 5 files
Before proceeding, we'll make sure we have an activated and up-to-date virtual
environment in the project root. Having the virtual environment located in the
project root is recommended (particularly when using uv) but not required.
- uv
If you don't have a virtual environment yet, run:
uv sync
Then activate it:
source .venv/bin/activate
- pip
If you don't have a virtual environment yet, run:
python -m venv .venv
Now activate it:
source .venv/bin/activate
And install the project package as an editable install:
pip install --editable .
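With either workflow, it can be useful to confirm from Python that the interpreter we are about to use really belongs to a virtual environment. The following is a minimal sketch using only the standard library; the helper name in_virtualenv is our own and is not part of dg or Dagster.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix points at the interpreter the environment was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If this reports that no virtual environment is active, the activation step above was most likely skipped in the current shell.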
Install dependencies
Install the dg command line tool into your project virtual environment.
- uv
uv add dagster-dg-cli
- pip
pip install dagster-dg-cli
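After installation, the dg executable should be on the PATH of the activated environment. As a quick sanity check, here is a small sketch that looks the command up with the standard library; find_executable is a hypothetical helper name, and the check only tells us where the command resolves, not which version is installed.

```python
import shutil
from typing import Optional


def find_executable(name: str = "dg") -> Optional[str]:
    """Return the full path of a command on PATH, or None if it is not found."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable()
    if location is None:
        print("dg was not found on PATH; is the virtual environment activated?")
    else:
        print("dg resolves to", location)
```

If the path printed is outside the project's .venv directory, a different installation of dg is shadowing the one we just installed.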
Update project structure
Add dg configuration
The dg command recognizes Dagster projects through the presence of TOML
configuration. This may be either a pyproject.toml file with a tool.dg section
or a dg.toml file. Let's add this configuration:
- uv
Since our project already has a pyproject.toml file, we can just add
the requisite tool.dg section to the file:
...
[tool.dg]
directory_type = "project"
[tool.dg.project]
root_module = "my_existing_project"
code_location_target_module = "my_existing_project.definitions"
- pip
Since our sample project has a setup.py and no pyproject.toml,
we'll create a dg.toml file:
directory_type = "project"
[project]
root_module = "my_existing_project"
code_location_target_module = "my_existing_project.definitions"
There are three settings:
- directory_type = "project": This is how dg identifies your package as a Dagster project. This is required.
- project.root_module = "my_existing_project": This points to the root module of your project. This is also required.
- project.code_location_target_module = "my_existing_project.definitions": This tells dg where to find the top-level Definitions object in your project. This actually defaults to [root_module].definitions, so it is not strictly necessary for us to set it here, but we are including this setting in order to be explicit. Existing projects might have the top-level Definitions object defined in a different module, in which case this setting is required.
Now that these settings are in place, you can interact with your project using
dg. If we run dg list defs, we can see the sole existing asset in our project:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                         ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key      ┃ Group   ┃ Deps ┃ Kinds ┃ Description ┃ │
│         │ ┡━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ my_asset │ default │      │       │             │ │
│         │ └──────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────┘
Add a dagster_dg_cli.registry_modules entry point
We're not quite done adding configuration. dg uses the Python entry point API
to expose custom component types and other scaffoldable objects from user
projects. Our entry point declaration will specify a submodule as the location
where our project exposes registry modules. By convention, this submodule is
named <root_module>.components. In our case, it will be
my_existing_project.components.
Let's create this submodule now:
mkdir my_existing_project/components && touch my_existing_project/components/__init__.py
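On shells where mkdir and touch are not available (for example, a default Windows prompt), the same submodule can be created from Python. This is a small sketch with a hypothetical helper name, create_submodule; like the shell command, it expects the parent package directory to exist already.

```python
from pathlib import Path


def create_submodule(package_dir: Path, name: str) -> Path:
    """Create package_dir/name with an empty __init__.py and return its path."""
    submodule_dir = package_dir / name
    # parents=False: fail loudly if the parent package directory is missing.
    submodule_dir.mkdir(parents=False, exist_ok=True)
    # An empty __init__.py marks the directory as an importable package.
    (submodule_dir / "__init__.py").touch(exist_ok=True)
    return submodule_dir
```

From the project root, create_submodule(Path("my_existing_project"), "components") produces the same layout as the command above.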
We'll need to add a dagster_dg_cli.registry_modules entry point to our project
and then reinstall the project package into our virtual environment. The
reinstallation step is crucial. Python entry points are registered at package
installation time, so if you simply add a new entry point to an existing
editable-installed package, it won't be picked up.
Entry points can be declared in either pyproject.toml or setup.py:
- uv
Since our package metadata is in pyproject.toml, we'll add the entry
point declaration there:
...
[project.entry-points]
"dagster_dg_cli.registry_modules" = { my_existing_project = "my_existing_project.components"}
...
Then we'll reinstall the package. Note that uv sync will not
reinstall our package, so we'll use uv pip install instead:
uv pip install --editable .
- pip
Our package metadata is in setup.py. While it is possible to add
entry point declarations to setup.py directly, we want to be able to
read the entry point declaration from dg, and there is no reliable
way to read setup.py (since it is arbitrary Python code). So we'll
instead add the entry point to a new setup.cfg, which can be used
alongside setup.py. Create setup.cfg with the following contents
(if your package has existing entry points declared in setup.py, you'll
want to move their definitions to setup.cfg as well):
[options.entry_points]
dagster_dg_cli.registry_modules =
    my_existing_project = my_existing_project.components
Then we'll reinstall the package:
pip install --editable .
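The reason for preferring setup.cfg over setup.py is that a declarative file can be read without executing any project code. To illustrate, here is a sketch that extracts the entry point declarations from setup.cfg text using the standard library configparser module. The function read_entry_points is our own; it is not how dg reads the file, and it assumes the indented multi-line layout shown above.

```python
import configparser

SETUP_CFG = """\
[options.entry_points]
dagster_dg_cli.registry_modules =
    my_existing_project = my_existing_project.components
"""


def read_entry_points(setup_cfg_text: str) -> dict[str, dict[str, str]]:
    """Map each entry point group to its {name: target} declarations."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(setup_cfg_text)

    groups: dict[str, dict[str, str]] = {}
    if not parser.has_section("options.entry_points"):
        return groups

    for group, raw_value in parser.items("options.entry_points"):
        entries: dict[str, str] = {}
        # Each indented continuation line holds one "name = target" pair.
        for line in raw_value.splitlines():
            line = line.strip()
            if not line:
                continue
            name, _, target = line.partition("=")
            entries[name.strip()] = target.strip()
        groups[group] = entries
    return groups


if __name__ == "__main__":
    print(read_entry_points(SETUP_CFG))
```

Note that the indentation of the second line under dagster_dg_cli.registry_modules matters: it is what marks that line as a continuation of the value rather than a new key.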
If we've done everything correctly, we should now be able to run dg list registry-modules and see the module my_existing_project.components, which we have registered as an entry point, listed in the output.
dg list registry-modules
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Module                         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster                        │
│ my_existing_project.components │
└────────────────────────────────┘
We can now scaffold a new component in our project and it will be
available to dg commands. First create the component:
dg scaffold component Foo
Creating a Dagster component type at /.../my-existing-project/my_existing_project/components/foo.py.
Scaffolded files for Dagster component type at /.../my-existing-project/my_existing_project/components/foo.py.
Then run dg list components to confirm that the new component is available:
dg list components
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Key ┃ Summary ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster.DefinitionsComponent │ An arbitrary set of Dagster definitions. │