Post-processing components
It is often useful to make modifications to the definitions generated by a component without needing to modify the component logic. Dagster provides a generic mechanism for this called post-processing.
Post-processing is available on all components. To add post-processing to a
component instance, add a post_processing field to its defs.yaml.
Currently, post-processing is only supported for assets, not for other kinds of definitions.
Setup
Let's look at a simple example using the DefsFolderComponent. DefsFolderComponent
simply loads all definitions from a specified folder.
Starting from a blank project, let's scaffold a DefsFolderComponent called
my_assets:
dg scaffold defs DefsFolderComponent my_assets
Creating defs at /.../my-project/src/my_project/defs/my_assets.
We now have a directory my_project/defs/my_assets with a single file,
defs.yaml:
type: dagster.DefsFolderComponent
attributes: {}
Let's add some assets. We'll create two files
my_project/defs/my_assets/foo.py and my_project/defs/my_assets/bar.py, each
containing a single asset:
foo.py:

import dagster as dg


@dg.asset
def foo():
    return None

bar.py:

import dagster as dg


@dg.asset
def bar():
    return None
Let's run dg list defs to see our assets:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ bar │ default │ │ │ │ │
│ │ ├─────┼─────────┼──────┼───────┼─────────────┤ │
│ │ │ foo │ default │ │ │ │ │
│ │ └─────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────┘
Example 1: Adding kind tags to assets
Now suppose we want to add a compute kind to every asset defined in this folder.
We could do this by manually adding the kind on each asset declaration or by using
a factory. However, component post-processing provides a simpler solution. We
modify our defs.yaml to add a post_processing field that specifies the
kind:
type: dagster.DefsFolderComponent
attributes: {}
post_processing:
  assets:
    - attributes:
        kinds:
          - "some_kind"
Let's break down the structure of the value we set for post_processing. The
top-level key, assets, is currently the only supported key. assets holds a
list of asset post-processors. Each post-processor transforms a set of asset
attributes and applies to a subset of all of the assets generated by the
component. In this case, we have a single post-processor with no defined
subset, which means the specified transformation is applied to all assets.
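To make this structure concrete, here is a minimal conceptual sketch in plain Python of how a list of post-processors could be applied to a set of assets. It does not use or reproduce Dagster's implementation: AssetInfo, PostProcessor, applies_to, and post_process are hypothetical names invented for illustration, and the sketch assumes that a post-processor overwrites the attributes it names, which may differ from how Dagster combines values.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class AssetInfo:
    """Simplified stand-in for an asset's attributes (not a Dagster class)."""
    key: str
    group: str = "default"
    kinds: frozenset = frozenset()


@dataclass(frozen=True)
class PostProcessor:
    """Models one entry of the `assets` list: attributes plus an optional subset."""
    attributes: dict
    # Hypothetical predicate standing in for the subset; None means "all assets".
    applies_to: Optional[Callable[[AssetInfo], bool]] = None


def post_process(assets, processors):
    """Apply each post-processor, in order, to the assets it covers."""
    result = list(assets)
    for proc in processors:
        result = [
            # Assumption: named attributes are overwritten, not merged.
            replace(a, **proc.attributes)
            if proc.applies_to is None or proc.applies_to(a)
            else a
            for a in result
        ]
    return result


# Mirrors the defs.yaml above: one post-processor, no subset, one attribute.
assets = [AssetInfo("foo"), AssetInfo("bar")]
processors = [PostProcessor(attributes={"kinds": frozenset({"some_kind"})})]
processed = post_process(assets, processors)

for a in processed:
    print(a.key, sorted(a.kinds))
```

Because the single post-processor defines no subset, both foo and bar end up with the kind some_kind in this model, while their keys and groups are left unchanged.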
Let's run dg list defs again to see the result:
dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃