Assets
Software-defined assets are the primary building blocks in Dagster. They represent the underlying entities in our pipelines, such as database tables, machine learning models, or AI processes. Together, these assets form the data platform. In this step, you will define the initial assets that represent the data you will work with throughout this tutorial.
All Dagster objects, such as assets, are added to the Definitions object that was created when you initialized your project.
1. Scaffold an assets file
When building assets, the first step is to scaffold an assets file with the dg scaffold command:
dg scaffold defs dagster.asset assets.py
Creating a component at <YOUR PATH>/dagster-tutorial/src/dagster_tutorial/defs/assets.py.
This adds a file called assets.py to the dagster-tutorial module, which will contain your asset code. Using dg to create the file ensures it is placed where Dagster can automatically discover it:
src
└── dagster_tutorial
    └── defs
        └── assets.py
2. Define the assets
Now that you have an assets file, you can define your asset code. You define an asset using the @dg.asset decorator. Any function with this decorator will be treated as an asset and included in the Dagster asset graph.
You will create one asset for each of the three source files used in this tutorial:
- raw_customers.csv
- raw_orders.csv
- raw_payments.csv
import dagster as dg


# Each asset returns the URL of the raw CSV file it represents.
@dg.asset
def customers() -> str:
    return "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv"


@dg.asset
def orders() -> str:
    return "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_orders.csv"


@dg.asset
def payments() -> str:
    return "https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_payments.csv"
For now, these assets will simply represent the underlying files. Next, we will discuss how Dagster knows to load them.
3. Check definitions
In Dagster, all defined objects (such as assets) need to be associated with a top-level Definitions object in order to be deployed. When you first created your project with uvx create-dagster project, a definitions.py file was also created:
from pathlib import Path

from dagster import definitions, load_from_defs_folder


@definitions
def defs():
    # Three .parent hops walk up from this file to the project root.
    return load_from_defs_folder(project_root=Path(__file__).parent.parent.parent)
This Definitions object loads the dagster-tutorial module and automatically discovers all assets and other Dagster objects. There is no need to explicitly reference assets as they are created. However, it is good practice to check that the Definitions object can be loaded without error as new Dagster objects are added.
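To see why three .parent hops in definitions.py reach the project root, here is a small standard-library sketch. It assumes definitions.py sits at src/dagster_tutorial/definitions.py inside a project folder named dagster-tutorial. That layout is an assumption based on the scaffold output shown earlier, so adjust the path if your project differs.

```python
from pathlib import PurePosixPath

# Assumed location of definitions.py (hypothetical layout, for illustration only).
definitions_file = PurePosixPath("dagster-tutorial/src/dagster_tutorial/definitions.py")

# Each .parent strips one trailing path component:
#   definitions.py -> dagster_tutorial -> src -> dagster-tutorial
project_root = definitions_file.parent.parent.parent

print(project_root)  # dagster-tutorial
```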
You can use the dg check defs command to ensure everything in your module loads correctly, and that your project is deployable:
dg check defs
All component YAML validated successfully.
All definitions loaded successfully.
This confirms there are no issues with any of the assets you have defined. As you develop your Dagster project, it is a good habit to run dg check to ensure everything works as expected.
4. Materialize the assets
Now that your assets are configured and you have verified that the top-level Definitions object is valid, you can view the asset catalog in the Dagster UI and reload the definitions:
- In a browser, navigate to http://127.0.0.1:3000, or restart dg dev if it has been closed.
- Navigate to Assets.
- Click Reload definitions.
You should now see three assets, one for each of the raw files (customers, orders, payments).
To materialize the assets:
- Click Assets, then click View lineage to see all assets.
- Click Materialize all.
You can also materialize assets from the command line with dg launch. To materialize all assets, use the * asset selection:
dg launch --assets "*"
To materialize specific assets, pass an asset selection specifying them:
dg launch --assets customers,orders,payments