Dagster & dbt (Component)
The dagster-dbt library provides a DbtProjectComponent which can be used to easily represent dbt models as assets in Dagster. Dagster assets understand dbt at the level of individual dbt models. This means that you can:
- Use Dagster's UI or APIs to run subsets of your dbt models, seeds, and snapshots.
- Track failures, logs, and run history for individual dbt models, seeds, and snapshots.
- Define dependencies between individual dbt models and other data assets. For example, put dbt models after the Fivetran-ingested table that they read from, or put a machine learning model after the dbt models that it's trained on.
DbtProjectComponent is a state-backed component, which compiles and caches your dbt project's manifest. For information on managing component state, see Configuring state-backed components.
Dagster supports dbt Fusion as of the 1.11.5 release and automatically detects which engine you have installed. If you're currently using dbt Core, migrate by uninstalling dbt-core and installing dbt Fusion. For more information, see the dbt docs.
This feature is still in preview pending dbt Fusion GA.
1. Prepare a Dagster project
To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:
create-dagster project my-project && cd my-project
Activate the project virtual environment:
source .venv/bin/activate
Then, add the dagster-dbt library to the project, along with the DuckDB adapter for dbt:
- uv: uv add dagster-dbt dbt-duckdb
- pip: pip install dagster-dbt dbt-duckdb
2. Set up a dbt project
- Colocate with Dagster
- External Git repository
To colocate the dbt project with your Dagster project, clone it into the project directory. For this tutorial, we'll use the jaffle shop dbt project as an example:
git clone --depth=1 https://github.com/dbt-labs/jaffle_shop.git dbt && rm -rf dbt/.git
Next, create a profiles.yml file in the dbt directory to configure the project to use DuckDB:
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: ~/tutorial.duckdb
      threads: 24
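If you prefer to script this step, the following is a minimal sketch that writes the same profile using only the Python standard library. The helper name write_profiles and its dbt_dir argument are illustrative assumptions and not part of dagster-dbt; the YAML content mirrors the profile above, and the default directory matches the dbt directory created by the clone command.

```python
from pathlib import Path
from textwrap import dedent

# Profile content copied from the tutorial; indentation is significant in YAML.
PROFILES_YML = dedent(
    """\
    jaffle_shop:
      target: dev
      outputs:
        dev:
          type: duckdb
          path: ~/tutorial.duckdb
          threads: 24
    """
)


def write_profiles(dbt_dir: str = "dbt") -> Path:
    """Write profiles.yml into the given dbt project directory (hypothetical helper)."""
    target = Path(dbt_dir) / "profiles.yml"
    # Create the directory if it does not exist yet, then write the profile.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(PROFILES_YML, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"Wrote {write_profiles()}")
```

Run the script from the root of your Dagster project so that the relative dbt path resolves to the cloned dbt project.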
If your dbt project lives in a separate Git repository, you don't need to clone it locally. For this tutorial, we'll use the Jaffle Platform example repository, which already has a profiles.yml configured.
When using an external Git repository, Dagster manages the project as part of component state. For details on how state is managed, see Configuring state-backed components.
3. Scaffold a dbt component definition
- Colocate with Dagster
- External Git repository
Now that you have a Dagster project with a dbt project, you can scaffold a dbt component definition. You'll need to provide the path to your dbt project:
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_ingest \
  --project-path "dbt"
Creating defs at /.../my-project/src/my_project/defs/dbt_ingest.
The dg scaffold defs call will generate a defs.yaml file in your project structure:
tree src/my_project
src/my_project
├── __init__.py
├── definitions.py
└── defs
    ├── __init__.py
    └── dbt_ingest
        └── defs.yaml

3 directories, 4 files
In its scaffolded form, the defs.yaml file contains the configuration for your dbt project:
type: dagster_dbt.DbtProjectComponent
attributes:
  project: '{{ context.project_root }}/dbt'
Now that you have a Dagster project, you can scaffold a dbt component definition that points to an external Git repository. You'll need to provide the Git URL and the path to the dbt project within the repository:
dg scaffold defs dagster_dbt.DbtProjectComponent dbt_ingest \
  --git-url "https://github.com/dagster-io/jaffle-platform.git" \
  --project-path "jdbt"
Creating defs at /.../my-project/src/my_project/defs/dbt_ingest.
The dg scaffold defs call will generate a defs.yaml file in your project structure:
tree src/my_project
src/my_project
├── __init__.py
├── definitions.py
└── defs
    ├── __init__.py
    └── dbt_ingest
        └── defs.yaml

3 directories, 4 files
In its scaffolded form, the defs.yaml file contains the configuration for your remote dbt project:
type: dagster_dbt.DbtProjectComponent
attributes:
  project:
    repo_url: https://github.com/dagster-io/jaffle-platform.git
    repo_relative_path: jdbt
In some cases, you may need to provide an authentication token for private Git repositories. You can do this by adding the token field to your defs.yaml file:
type: dagster_dbt.DbtProjectComponent
attributes:
  project:
    repo_url: https://some-host.com/your-org/your-dbt-project.git
    repo_relative_path: path/to/dbt
    token: '{{ env.GIT_TOKEN }}'
For GitHub-based repositories, this is typically unnecessary, because your credentials are available locally as well as in the GitHub Actions workflows that require access to the repository.
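For hosts where a token is needed, a quick pre-flight check can catch a missing variable before you load definitions. The sketch below assumes that the '{{ env.GIT_TOKEN }}' template reads the GIT_TOKEN variable from the process environment; the require_env helper is hypothetical and not part of Dagster.

```python
import os
import sys


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    try:
        # GIT_TOKEN is the variable name used in the defs.yaml example.
        require_env("GIT_TOKEN")
    except RuntimeError as err:
        sys.exit(str(err))
    print("GIT_TOKEN is set")
```

The check only confirms that the variable is present and non-empty; it does not validate the token against the Git host.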
This is sufficient to load your dbt models as assets. You can use dg list defs to see the asset representation:
dg list defs
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│ │ │ customers │ default │ stg_customers │ dbt │ This table has basic information about a │ │
│ │ │ │ │ stg_orders │ duckdb │ customer, as well as some derived facts based │ │
│ │ │ │ │ stg_payments │ │ on a custome… │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ orders │ default │ stg_orders │ dbt │ This table has basic information about orders, │ │
│ │ │ │ │ stg_payments │ duckdb │ as well as some derived facts based on │ │
│ │ │ │ │ │ │ payments │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ ###… │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ raw_customers │ default │ │ dbt │ dbt seed raw_customers │ │
│ │ │ │ │ │ duckdb │ │ │
│ │ │ │ │ │ │ #### Raw SQL: │ │
│ │ │ │ │ │ │ ```sql │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ ``` │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ raw_orders │ default │ │ dbt │ dbt seed raw_orders │ │
│ │ │ │ │ │ duckdb │ │ │
│ │ │ │ │ │ │ #### Raw SQL: │ │
│ │ │ │ │ │ │ ```sql │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ ``` │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ raw_payments │ default │ │ dbt │ dbt seed raw_payments │ │
│ │ │ │ │ │ duckdb │ │ │
│ │ │ │ │ │ │ #### Raw SQL: │ │
│ │ │ │ │ │ │ ```sql │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ ``` │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ stg_customers │ default │ raw_customers │ dbt │ dbt model stg_customers │ │
│ │ │ │ │ │ duckdb │ │ │
│ │ │ │ │ │ │ #### Raw SQL: │ │
│ │ │ │ │ │ │ ```sql │ │
│ │ │ │ │ │ │ with source as ( │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ {#- │ │
│ │ │ │ │ │ │ Normally we… │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ stg_orders │ default │ raw_orders │ dbt │ dbt model stg_orders │ │
│ │ │ │ │ │ duckdb │ │ │
│ │ │ │ │ │ │ #### Raw SQL: │ │
│ │ │ │ │ │ │ ```sql │ │
│ │ │ │ │ │ │ with source as ( │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ {#- │ │
│ │ │ │ │ │ │ Normally we wo… │ │
│ │ ├───────────────┼─────────┼───────────────┼────────┼────────────────────────────────────────────────┤ │
│ │ │ stg_payments │ default │ raw_payments │ dbt │ dbt model stg_payments │ │
│ │ │ │ │ │ duckdb │ │ │
│ │ │ │ │ │ │ #### Raw SQL: │ │
│ │ │ │ │ │ │ ```sql │ │
│ │ │ │ │ │ │ with source as ( │ │
│ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ {#- │ │
│ │ │ │ │ │ │ Normally… │ │
│ │ └───────────────┴─────────┴───────────────┴────────┴────────────────────────────────────────────────┘ │
│ Asset Checks │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳┳┓ │
│ │ ┃ Key ┃┃┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇╇┩ │