Resources
The assets you created in the previous step represent pointers to files in the cloud. This is a good first step in building out the asset graph, but it would be better if they represented tables in our database.
In this step, you will load that data into DuckDB, an analytical database, so your assets will instead be representations of DuckDB tables.
Since the same database will be used across all three assets, rather than adding the connection logic to each asset, you can use a resource to centralize the connection in a single object that can be shared across multiple Dagster objects.
Resources are Dagster objects, much like assets, but they are not executed directly. Like some other entities in the Definitions layer, they complement other objects rather than running on their own. Typically, resources are reusable objects that supply external context, such as database connections, API clients, or configuration settings. Because of this, a single resource can be shared across many different Dagster objects.
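To illustrate the idea outside of Dagster, here is a minimal plain-Python sketch of one object that holds the connection settings and is shared by several functions. It is an analogy only: it uses the standard library's sqlite3 as a stand-in for DuckDB, and `DatabaseResource`, `load_customers`, and `count_customers` are hypothetical names, not Dagster APIs.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


class DatabaseResource:
    """Hypothetical stand-in for a resource: stores config and hands out connections."""

    def __init__(self, database: str):
        self.database = database

    @contextmanager
    def get_connection(self):
        # Open a connection on demand, commit on success, and always close it.
        conn = sqlite3.connect(self.database)
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()


def load_customers(db: DatabaseResource) -> None:
    # The function does not know the database path; it only uses the shared object.
    with db.get_connection() as conn:
        conn.execute("create table if not exists customers (id integer, name text)")
        conn.execute("insert into customers values (1, 'Ada')")


def count_customers(db: DatabaseResource) -> int:
    with db.get_connection() as conn:
        return conn.execute("select count(*) from customers").fetchone()[0]


# One shared object, defined once and passed to every function that needs it.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")
shared = DatabaseResource(database=db_path)
load_customers(shared)
customer_count = count_customers(shared)
```

Changing the database location then means editing a single line, which is the same benefit the Dagster resource provides for the three assets.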
1. Define the DuckDB resource
First, install the dagster-duckdb library:
- uv: uv add dagster-duckdb
- pip: pip install dagster-duckdb
Next, scaffold a resources file:
dg scaffold defs dagster.resources resources.py
Creating a component at <YOUR PATH>/dagster-tutorial/src/dagster_tutorial/defs/resources.py.
This adds a generic resources file to your project. The resources.py file is now part of the dagster-tutorial module:
src
└── dagster_tutorial
    └── defs
        └── resources.py
Within this file, you can define a DuckDBResource that consolidates the database connection in one place, along with a resources function decorated with the @dg.definitions decorator. This function maps all resources to specific keys that can be used throughout the project:
from dagster_duckdb import DuckDBResource

import dagster as dg

database_resource = DuckDBResource(database="/tmp/jaffle_platform.duckdb")


@dg.definitions
def resources():
    return dg.Definitions(
        resources={
            "duckdb": database_resource,
        }
    )
Here, the duckdb key is set to the DuckDBResource defined above. Any Dagster object that uses this resource key will use the underlying DuckDB connection.
2. Add the resource to the assets
With the resource defined, you can update the asset code. First, set the DuckDBResource as a parameter in each asset, using the name duckdb. This matches the key that was set when defining the resource, and allows it to be used inside the asset. Then, use the get_connection method from the resource to connect to the database and execute the query to create the tables:
from dagster_duckdb import DuckDBResource

import dagster as dg


@dg.asset
def customers(duckdb: DuckDBResource):
    url = "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv"
    table_name = "customers"

    with duckdb.get_connection() as conn:
        conn.execute(
            f"""
            create or replace table {table_name} as (
                select * from read_csv_auto('{url}')
            )
            """
        )


@dg.asset
def orders(duckdb: DuckDBResource):
    url = "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_orders.csv"
    table_name = "orders"

    with duckdb.get_connection() as conn:
        conn.execute(
            f"""
            create or replace table {table_name} as (
                select * from read_csv_auto('{url}')
            )
            """
        )


@dg.asset
def payments(duckdb: DuckDBResource):
    url = "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_payments.csv"
    table_name = "payments"

    with duckdb.get_connection() as conn:
        conn.execute(
            f"""
            create or replace table {table_name} as (
                select * from read_csv_auto('{url}')
            )
            """
        )
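The three assets run the same statement and differ only in table name and URL. As an optional refactor, the query text could be built by a small helper. The sketch below is one way to do that in plain Python; `build_load_query`, `SEEDS_BASE_URL`, and `SEED_FILES` are hypothetical names introduced for illustration and are not part of Dagster or the tutorial project.

```python
# Base location of the seed files, taken from the URLs used in the assets above.
SEEDS_BASE_URL = (
    "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/"
    "refs/heads/main/seeds/"
)

# Hypothetical mapping of table name to seed file name.
SEED_FILES = {
    "customers": "raw_customers.csv",
    "orders": "raw_orders.csv",
    "payments": "raw_payments.csv",
}


def build_load_query(table_name: str, url: str) -> str:
    """Return the create-or-replace statement that loads a CSV into a table."""
    return (
        f"create or replace table {table_name} as (\n"
        f"    select * from read_csv_auto('{url}')\n"
        f")"
    )


# Build one statement per table; each asset would pass its statement to conn.execute.
queries = {
    table: build_load_query(table, SEEDS_BASE_URL + filename)
    for table, filename in SEED_FILES.items()
}
```

Inside an asset, the body would then shrink to opening the connection with `duckdb.get_connection()` and calling `conn.execute(queries["customers"])`. The table names and URLs here are fixed constants; interpolating untrusted input into SQL this way would not be safe.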
3. Check definitions
Run dg check again to confirm that the assets and resources are configured correctly. If there is a mismatch between the key set in the resource and the key required by the asset, dg check will fail.
dg check defs
All component YAML validated successfully.
All definitions loaded successfully.
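The kind of mismatch this check guards against can be pictured with a small conceptual sketch. This is not how dg check or Dagster is implemented, and it simplifies by treating every function parameter as a required resource key; `missing_resource_keys` is a hypothetical helper written only for this illustration.

```python
import inspect


def missing_resource_keys(fn, resources: dict) -> list:
    """Return the parameter names of `fn` that have no matching key in `resources`."""
    required = inspect.signature(fn).parameters
    return [name for name in required if name not in resources]


def customers(duckdb):
    # The parameter name doubles as the resource key the function expects.
    ...


matching = {"duckdb": object()}    # key matches the parameter name
misnamed = {"database": object()}  # key does not match the parameter name

ok = missing_resource_keys(customers, matching)
bad = missing_resource_keys(customers, misnamed)
```

With the matching key nothing is missing, while the misnamed key leaves `duckdb` unresolved, which is the situation in which the check would be expected to fail.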
4. View the resource
Back in the UI, your assets will not look different, but you can view the resource in the Definitions tab:
- Click Deployment, then click "dagster-tutorial" to see your deployment.
- Click Definitions.
- Navigate to the "Resources" section to view all of your resources, then select "duckdb".
- Click on "Uses" for the resource to see the three assets that depend on the resource.