Components
Using Components
- @dagster.component_instance
- preview
This API is currently in preview, and may have breaking changes in patch version releases. This API is not considered ready for production use.
Decorator for a function to be used to load an instance of a Component. This is used when instantiating components in python instead of via yaml.
class dagster.ComponentLoadContext
Context object that provides environment and path information during component loading.
This context is automatically created and passed to component definitions when loading a project’s defs folder. Each Python module or folder in the defs directory receives a unique context instance that provides access to project structure, paths, and utilities for dynamic component instantiation.
The context enables components to:
- Access project and module path information
- Load other modules and definitions within the project
- Resolve relative imports and module names
- Access templating and resolution capabilities
Parameters:
- path – The filesystem path of the component currently being loaded. For a file: /path/to/project/src/project/defs/my_component.py. For a directory: /path/to/project/src/project/defs/my_component/
- project_root – The root directory of the Dagster project, typically containing pyproject.toml or setup.py. Example: /path/to/project
- defs_module_path – The filesystem path to the root defs folder. Example: /path/to/project/src/project/defs
- defs_module_name – The Python module name for the root defs folder, used for import resolution. Typically follows the pattern "project_name.defs". Example: "my_project.defs"
- resolution_context – The resolution context used by the component templating system for parameter resolution and variable substitution.
- terminate_autoloading_on_keyword_files – Controls whether autoloading stops when encountering definitions.py or component.py files. Deprecated: This parameter will be removed after version 1.11.
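The relationship between path, defs_module_path, and defs_module_name can be illustrated with a small standalone sketch. It uses only pathlib and does not call Dagster; the helper name module_name_for_path is hypothetical, and the mapping shown is an assumption about how a path under the defs folder corresponds to a dotted module name, not Dagster's actual implementation.

```python
from pathlib import Path


def module_name_for_path(
    path: Path, defs_module_path: Path, defs_module_name: str
) -> str:
    """Map a file or directory under the defs folder to a dotted module name.

    Hypothetical helper for illustration only; not part of the Dagster API.
    """
    # Raises ValueError if `path` is not located under the defs folder.
    relative = path.relative_to(defs_module_path)
    # Drop a trailing ".py" so that files and packages are treated alike.
    if relative.suffix == ".py":
        relative = relative.with_suffix("")
    return ".".join((defs_module_name, *relative.parts))


defs_root = Path("/path/to/project/src/project/defs")
module_name = module_name_for_path(
    defs_root / "my_component.py", defs_root, "my_project.defs"
)
print(module_name)
```

Passing the defs folder itself yields the root module name unchanged, since the relative path then has no parts.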
Examples:
Using context in a component definition:
import dagster as dg
from pathlib import Path
@dg.definitions
def my_component_defs(context: dg.ComponentLoadContext):
    # Load a Python module relative to the current component
    shared_module = context.load_defs_relative_python_module(
        Path("../shared/utilities.py")
    )
    # Get the module name for the current component
    module_name = context.defs_relative_module_name(context.path)
    # Create assets using context information
    @dg.asset(name=f"{module_name}_processed_data")
    def processed_data():
        return shared_module.process_data()
    return dg.Definitions(assets=[processed_data])
Loading definitions from another component:
import dagster as dg
from pathlib import Path
@dg.definitions
def dependent_component(context: dg.ComponentLoadContext):
    # Load definitions from another component
    upstream_module = context.load_defs_relative_python_module(
        Path("../upstream_component")
    )
    upstream_defs = context.load_defs(upstream_module)
    @dg.asset(deps=[upstream_defs.assets])
    def my_downstream_asset(): ...
    # Use upstream assets in this component
    return dg.Definitions(
        assets=[my_downstream_asset],
        # Include upstream definitions if needed
    )
Note: This context is automatically provided by Dagster’s autoloading system and should not be instantiated manually in most cases. For testing purposes, use ComponentLoadContext.for_test() to create a test instance.
See also:
- dagster.definitions(): Decorator that receives this context
- dagster.Definitions: The object typically returned by context-using functions
- dagster.components.resolved.context.ResolutionContext: Underlying resolution context
Building Components
class dagster.Component
Abstract base class for creating Dagster components.
Components are the primary building blocks for programmatically creating Dagster definitions. They enable building multiple interrelated definitions for specific use cases, provide schema-based configuration, and include built-in scaffolding support to simplify component instantiation in projects. Components are automatically discovered by Dagster tooling and can be instantiated from YAML configuration files or Python code that conforms to the declared schema.
Key Capabilities:
- Definition Factory: Creates Dagster assets, jobs, schedules, and other definitions
- Schema-Based Configuration: Optional parameterization via YAML or Python objects
- Scaffolding Support: Custom project structure generation via dg scaffold commands
- Tool Integration: Automatic discovery by Dagster CLI and UI tools
- Testing Utilities: Built-in methods for testing component behavior
Implementing a component:
- Every component must implement the build_defs() method, which serves as a factory for creating Dagster definitions.
- Components can optionally inherit from Resolvable to add schema-based configuration capabilities, enabling parameterization through YAML files or structured Python objects.
- Components can attach a custom scaffolder with the @scaffold_with decorator.
Examples:
Simple component with hardcoded definitions:
import dagster as dg
class SimpleDataComponent(dg.Component):
    """Component that creates a toy, hardcoded data processing asset."""
    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        @dg.asset
        def raw_data():
            return [1, 2, 3, 4, 5]
        @dg.asset
        def processed_data(raw_data):
            return [x * 2 for x in raw_data]
        return dg.Definitions(assets=[raw_data, processed_data])
Configurable component with schema:
import dagster as dg
from typing import List
class DatabaseTableComponent(dg.Component, dg.Resolvable, dg.Model):
    """Component for creating assets from database tables."""
    table_name: str
    columns: List[str]
    database_url: str = "postgresql://localhost/mydb"
    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        @dg.asset(key=f"{self.table_name}_data")
        def table_asset():
            # Use self.table_name, self.columns, etc.
            # execute_query is a placeholder for your own query helper
            return execute_query(f"SELECT {', '.join(self.columns)} FROM {self.table_name}")
        return dg.Definitions(assets=[table_asset])
Using the component in a YAML file (defs.yaml):
type: my_project.components.DatabaseTableComponent
attributes:
  table_name: "users"
  columns: ["id", "name", "email"]
  database_url: "postgresql://prod-db/analytics"
Component Discovery:
Components are automatically discovered by Dagster tooling when defined in modules specified in your project’s pyproject.toml registry configuration:
[tool.dagster]
module_name = "my_project"
registry_modules = ["my_project.components"]
This enables CLI commands like:
dg list components # List all available components in the Python environment
dg scaffold defs MyComponent path/to/component # Generate component instance with scaffolding
Schema and Configuration:
To make a component configurable, inherit from both Component and Resolvable, along with a model base class. Pydantic models and dataclasses are supported largely so that pre-existing code can be used as schema without having to modify it. We recommend using dg.Model for new components, which wraps Pydantic with Dagster defaults for better developer experience.
- dg.Model: Recommended for new components (wraps Pydantic with Dagster defaults)
- pydantic.BaseModel: Direct Pydantic usage
- @dataclass: Python dataclasses with validation
Custom Scaffolding:
Components can provide custom scaffolding behavior using the @scaffold_with decorator:
import textwrap
import dagster as dg
from dagster.components import Scaffolder, ScaffoldRequest
class DatabaseComponentScaffolder(Scaffolder):
    def scaffold(self, request: ScaffoldRequest) -> None:
        # Create component directory
        component_dir = request.target_path
        component_dir.mkdir(parents=True, exist_ok=True)
        # Generate defs.yaml with template
        defs_file = component_dir / "defs.yaml"
        defs_file.write_text(
            # Dedent first, then strip, so the YAML keeps its relative indentation
            textwrap.dedent(
                f'''
                type: {request.type_name}
                attributes:
                  table_name: "example_table"
                  columns: ["id", "name"]
                  database_url: "${{DATABASE_URL}}"
                '''
            ).strip()
        )
        # Generate SQL query template
        sql_file = component_dir / "query.sql"
        sql_file.write_text("SELECT * FROM example_table;")
@dg.scaffold_with(DatabaseComponentScaffolder)
class DatabaseTableComponent(dg.Component, dg.Resolvable, dg.Model):
    table_name: str
    columns: list[str]
    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        # Component implementation
        pass
See also:
- dagster.Definitions: The object returned by build_defs()
- dagster.ComponentLoadContext: Context provided to build_defs()
- dagster.components.resolved.base.Resolvable: Base for configurable components
- dagster.Model: Recommended base class for component schemas
- dagster.scaffold_with(): Decorator for custom scaffolding
class dagster.Resolvable
Base class for making a class resolvable from yaml.
This framework is designed to allow complex nested objects to be resolved from yaml documents. This allows for a single class to be instantiated from either yaml or python without limiting the types of fields that can exist on the python class.
Key Features:
- Automatic yaml schema derivation: A pydantic model is automatically generated from the class definition using its fields or init arguments and their annotations.
- Jinja template resolution: Fields in the yaml document may be templated strings, which are rendered from the available scope and may be arbitrary python objects.
- Customizable resolution behavior: Each field can customize how it is resolved from the yaml document using a dagster.Resolver.
Resolvable subclasses must be one of the following:
- pydantic model
- @dataclass
- plain class with an annotated init
- @record
Example:
import datetime
from typing import Annotated
import dagster as dg
def resolve_timestamp(
    context: dg.ResolutionContext,
    raw_timestamp: str,
) -> datetime.datetime:
    return datetime.datetime.fromisoformat(
        context.resolve_value(raw_timestamp, as_type=str),
    )
# the yaml field will be a string, which is then parsed into a datetime object
ResolvedTimestamp = Annotated[
    datetime.datetime,
    dg.Resolver(resolve_timestamp, model_field_type=str),
]
class MyClass(dg.Resolvable, dg.Model):
    event: str
    start_timestamp: ResolvedTimestamp
    end_timestamp: ResolvedTimestamp
# python instantiation
in_python = MyClass(
    event="test",
    start_timestamp=datetime.datetime(2021, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc),
    end_timestamp=datetime.datetime(2021, 1, 2, 0, 0, 0, tzinfo=datetime.timezone.utc),
)
# yaml instantiation
in_yaml = MyClass.resolve_from_yaml(
    '''
    event: test
    start_timestamp: '{{ start_year }}-01-01T00:00:00Z'
    end_timestamp: '{{ end_timestamp }}'
    ''',
    scope={
        # string templating
        "start_year": "2021",
        # object templating
        "end_timestamp": in_python.end_timestamp,
    },
)
assert in_python == in_yaml
class dagster.ResolutionContext
The context available to Resolver functions when “resolving” from yaml into a Resolvable object. This class should not be instantiated directly.
Provides a resolve_value method that can be used to resolve templated values in a nested object before being transformed into the final Resolvable object. This is typically invoked inside a Resolver’s resolve_fn to ensure that jinja-templated values are turned into their respective python types using the available template variables.
Example:
import datetime
import dagster as dg
def resolve_timestamp(
    context: dg.ResolutionContext,
    raw_timestamp: str,
) -> datetime.datetime:
    return datetime.datetime.fromisoformat(
        context.resolve_value(raw_timestamp, as_type=str),
    )
- resolve_value
Recursively resolves templated values in a nested object. This is typically invoked inside a Resolver’s resolve_fn to resolve all nested template values in the input object.
Parameters:
- val (Any) – The value to resolve.
- as_type (Optional[type]) – If provided, the type to cast the resolved value to. Used purely for type hinting and does not impact runtime behavior.
Returns: The input value after all nested template values have been resolved.
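To make "recursively resolves templated values in a nested object" concrete, here is a minimal standalone sketch. It is not Dagster's implementation and does not use Jinja: the function resolve_nested is hypothetical, and it assumes a deliberately simplified template syntax in which only bare {{ name }} placeholders are recognized.

```python
import re
from collections.abc import Mapping
from typing import Any

# Simplified stand-in for Jinja: matches only bare placeholders like {{ name }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve_nested(val: Any, scope: Mapping) -> Any:
    """Recursively substitute placeholders in strings, mappings, lists and tuples.

    Hypothetical helper for illustration only; not part of the Dagster API.
    """
    if isinstance(val, str):
        whole = _PLACEHOLDER.fullmatch(val.strip())
        if whole:
            # A string that is exactly one placeholder yields the scope object itself.
            return scope[whole.group(1)]
        # Otherwise render each placeholder into the surrounding text.
        return _PLACEHOLDER.sub(lambda m: str(scope[m.group(1)]), val)
    if isinstance(val, Mapping):
        return {key: resolve_nested(item, scope) for key, item in val.items()}
    if isinstance(val, (list, tuple)):
        return type(val)(resolve_nested(item, scope) for item in val)
    return val  # non-string leaves pass through unchanged


raw = {
    "event": "test",
    "start": "{{ start_year }}-01-01T00:00:00Z",
    "retries": "{{ retries }}",
    "tags": ["{{ env }}", "static"],
}
resolved = resolve_nested(raw, {"start_year": "2021", "retries": 3, "env": "prod"})
print(resolved)
```

In this sketch a value that consists of a single placeholder keeps the type of the scope object (here the integer 3), which mirrors the distinction between string templating and object templating drawn in the Resolvable example.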
class dagster.Resolver
Contains information on how to resolve a value from YAML into the corresponding Resolved class field.
You can attach a resolver to a field’s type annotation to control how the value is resolved.
Example:
import datetime
from typing import Annotated
import dagster as dg
def resolve_timestamp(
    context: dg.ResolutionContext,
    raw_timestamp: str,
) -> datetime.datetime:
    return datetime.datetime.fromisoformat(
        context.resolve_value(raw_timestamp, as_type=str),
    )
class MyClass(dg.Resolvable, dg.Model):
    event: str
    # the yaml field will be a string, which is then parsed into a datetime object
    timestamp: Annotated[
        datetime.datetime,
        dg.Resolver(resolve_timestamp, model_field_type=str),
    ]
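The general mechanism of attaching a resolver to a type annotation can be sketched with the standard library alone. The following is an illustration of the pattern, not of Dagster's internals: FieldResolver and resolve_from_dict are hypothetical stand-ins, and the sketch assumes the raw input is an already-parsed dictionary rather than a YAML document.

```python
import datetime
from dataclasses import dataclass
from typing import Annotated, Any, Callable, get_type_hints


@dataclass(frozen=True)
class FieldResolver:
    """Hypothetical stand-in for a resolver attached via Annotated metadata."""
    fn: Callable[[Any], Any]


def parse_timestamp(raw: str) -> datetime.datetime:
    return datetime.datetime.fromisoformat(raw)


@dataclass
class Event:
    name: str
    # The raw value is a string; the attached resolver converts it to a datetime.
    timestamp: Annotated[datetime.datetime, FieldResolver(parse_timestamp)]


def resolve_from_dict(cls: type, raw: dict) -> Any:
    """Build `cls` from raw values, applying any resolver found on each field."""
    hints = get_type_hints(cls, include_extras=True)
    kwargs = {}
    for field_name, hint in hints.items():
        value = raw[field_name]
        # Annotated[...] types expose their extra arguments as __metadata__.
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, FieldResolver):
                value = meta.fn(value)
        kwargs[field_name] = value
    return cls(**kwargs)


event = resolve_from_dict(
    Event, {"name": "test", "timestamp": "2021-01-01T00:00:00+00:00"}
)
print(event)
```

Fields without attached metadata, such as name, are passed through unchanged.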
class dagster.Model
Pydantic BaseModel configured with recommended default settings for use with the Resolved framework.
Extra fields are disallowed when instantiating this model to help catch errors earlier.
Example:
import dagster as dg
class MyModel(dg.Resolvable, dg.Model):
    name: str
    age: int
# raises exception
MyModel(name="John", age=30, other="field")
Core Models
These Annotated TypeAliases can be used when defining custom Components for common Dagster types.
- dagster.ResolvedAssetKey: Annotated[AssetKey, ...]
Allows resolving to an AssetKey via a YAML-friendly schema.
- dagster.ResolvedAssetSpec: Annotated[AssetSpec, ...]
Allows resolving to an AssetSpec via a YAML-friendly schema.
- dagster.AssetAttributesModel
A Pydantic model of all the attributes of an AssetSpec that can be set before the definition is created.
- dagster.ResolvedAssetCheckSpec: Annotated[AssetCheckSpec, ...]
Allows resolving to an AssetCheckSpec via a YAML-friendly schema.
Built-in Components
class dagster.DefsFolderComponent
A component that represents a directory containing multiple Dagster definition modules.
DefsFolderComponent serves as a container for organizing and managing multiple subcomponents within a folder structure. It automatically discovers and loads components from subdirectories and files, enabling hierarchical organization of Dagster definitions. This component also supports post-processing capabilities to modify metadata and properties of definitions created by its child components.
Key Features:
- Post-Processing: Allows modification of child component definitions via configuration
- Automatic Discovery: Recursively finds and loads components from subdirectories
- Hierarchical Organization: Enables nested folder structures for complex projects
The component automatically scans its directory for:
- YAML component definitions (defs.yaml files)
- Python modules containing Dagster definitions
- Nested subdirectories containing more components
Here is how a DefsFolderComponent is used in a project by the framework, along with other framework-defined classes.
my_project/
└── defs/
    ├── analytics/              # DefsFolderComponent
    │   ├── defs.yaml           # Post-processing configuration
    │   ├── user_metrics/       # User-defined component
    │   │   └── defs.yaml
    │   └── sales_reports/      # User-defined component
    │       └── defs.yaml
    └── data_ingestion/         # DefsFolderComponent
        ├── api_sources/        # DefsFolderComponent
        │   └── some_defs.py    # DagsterDefsComponent
        └── file_sources/       # DefsFolderComponent
            └── files.py        # DagsterDefsComponent
Parameters:
- path – The filesystem path to the directory containing child components.
- children – A mapping of child paths to their corresponding Component instances. This is typically populated automatically during component discovery.
DefsFolderComponent supports post-processing through its defs.yaml configuration, allowing you to modify definitions created by child components using target selectors.
Examples:
Using post-processing in a folder’s defs.yaml:
# analytics/defs.yaml
type: dagster.components.DefsFolderComponent
post_processing:
  assets:
    - target: "*" # add a top level tag to all assets in the folder
      attributes:
        tags:
          top_level_tag: "true"
    - target: "tag:defs_tag=true" # add a tag to all assets in the folder with the tag "defs_tag"
      attributes:
        tags:
          new_tag: "true"
Please see documentation on post processing and the selection syntax for more examples.
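The effect of the two rules in the YAML above can be mimicked with a toy model in plain Python. This is only a sketch of the idea: assets are represented as dictionaries, the helpers matches and post_process are hypothetical, and only the two selector forms used in the example ("*" and "tag:key=value") are handled, whereas Dagster's real selection syntax is richer.

```python
from typing import Any


def matches(target: str, asset: dict[str, Any]) -> bool:
    """Toy selector supporting only "*" and "tag:key=value" (illustration only)."""
    if target == "*":
        return True
    if target.startswith("tag:"):
        key, _, value = target[len("tag:"):].partition("=")
        return asset["tags"].get(key) == value
    raise ValueError(f"unsupported selector: {target!r}")


def post_process(assets: list[dict[str, Any]], rules: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Apply each rule in order, merging its tags into every matching asset."""
    for rule in rules:
        for asset in assets:
            if matches(rule["target"], asset):
                asset["tags"].update(rule["attributes"]["tags"])
    return assets


assets = [
    {"key": "users", "tags": {"defs_tag": "true"}},
    {"key": "orders", "tags": {}},
]
rules = [
    {"target": "*", "attributes": {"tags": {"top_level_tag": "true"}}},
    {"target": "tag:defs_tag=true", "attributes": {"tags": {"new_tag": "true"}}},
]
post_process(assets, rules)
print(assets)
```

Here every asset receives top_level_tag, while only the asset already tagged defs_tag=true also receives new_tag.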
Component Discovery:
The component automatically discovers children using these patterns:
- YAML Components: Subdirectories with defs.yaml files
- Python Modules: Any .py files containing Dagster definitions
- Nested Folders: Subdirectories that contain any of the above
Files and directories matching these patterns are ignored:
- __pycache__ directories
- Hidden directories (starting with .)
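The discovery and ignore patterns listed above can be approximated with a short pathlib sketch. It is a simplified illustration rather than Dagster's loader: the function discover_children and its labels are hypothetical, and it treats every .py file as a Python module without checking whether the file actually contains Dagster definitions.

```python
import tempfile
from pathlib import Path


def discover_children(folder: Path) -> dict[str, str]:
    """Classify the direct children of a folder (hypothetical, illustration only)."""
    children: dict[str, str] = {}
    for entry in sorted(folder.iterdir()):
        if entry.is_dir():
            # Ignored patterns: __pycache__ and hidden directories.
            if entry.name == "__pycache__" or entry.name.startswith("."):
                continue
            if (entry / "defs.yaml").exists():
                children[entry.name] = "yaml component"
            elif discover_children(entry):
                # A folder counts only if something discoverable is nested inside it.
                children[entry.name] = "nested folder"
        elif entry.suffix == ".py":
            children[entry.name] = "python module"
    return children


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "user_metrics").mkdir()
    (root / "user_metrics" / "defs.yaml").write_text("type: example\n")
    (root / "api_sources").mkdir()
    (root / "api_sources" / "some_defs.py").write_text("")
    (root / "__pycache__").mkdir()
    (root / ".hidden").mkdir()
    (root / ".hidden" / "defs.yaml").write_text("")
    (root / "files.py").write_text("")
    found = discover_children(root)

print(found)
```

The .hidden directory is skipped even though it contains a defs.yaml file, because the ignore rules are applied before any of the discovery patterns.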
Note: DefsFolderComponent instances are typically created automatically by Dagster’s component loading system. Manual instantiation is rarely needed unless building custom loading logic or testing scenarios.
When used with post-processing, the folder’s defs.yaml should only contain post-processing configuration, not component type definitions.