Dagster & Sling with components

The dagster-sling library provides a SlingReplicationCollectionComponent which can be used to easily represent a collection of Sling replications as assets in Dagster.

1. Prepare a Dagster project

To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:

create-dagster project my-project && cd my-project/src

Activate the project virtual environment:

source ../.venv/bin/activate

Finally, add the dagster-sling library to the project. We will also add duckdb to use as a destination for our Sling replication.

uv add dagster-sling duckdb

2. Scaffold a Sling component

Now that you have a Dagster project, you can scaffold a Sling component:

dg scaffold defs dagster_sling.SlingReplicationCollectionComponent sling_ingest
Creating defs at /.../my-project/src/my_project/defs/sling_ingest.

The scaffold call generates a defs.yaml file and an unpopulated Sling replication.yaml file:

tree my_project/defs
my_project/defs
├── __init__.py
└── sling_ingest
    ├── defs.yaml
    └── replication.yaml

2 directories, 3 files

In its scaffolded form, the defs.yaml file contains the configuration for your Sling workspace:

my_project/defs/sling_ingest/defs.yaml
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  replications:
    - path: replication.yaml

The generated replication.yaml file is a template that still needs to be configured:

my_project/defs/sling_ingest/replication.yaml
source: {}
streams: {}
target: {}

3. Configure Sling replications

In the defs.yaml file, you can directly declare the Sling connections that your replications use. This example declares a connection to DuckDB:

my_project/defs/sling_ingest/defs.yaml
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  connections:
    DUCKDB:
      type: duckdb
      instance: /tmp/my_project.duckdb
  replications:
    - path: ./replication.yaml

For this example replication, we will ingest a set of CSV files to DuckDB. You can use curl to download some sample data:

curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_customers.csv &&
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_orders.csv &&
curl -O https://raw.githubusercontent.com/dbt-labs/jaffle-shop-classic/refs/heads/main/seeds/raw_payments.csv

Next, you can configure Sling replications for each CSV file in replication.yaml:

my_project/defs/sling_ingest/replication.yaml
source: LOCAL
target: DUCKDB

defaults:
  mode: full-refresh
  object: "{stream_table}"

streams:
  file://raw_customers.csv:
    object: "main.raw_customers"
  file://raw_orders.csv:
    object: "main.raw_orders"
  file://raw_payments.csv:
    object: "main.raw_payments"
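
When the list of CSV files grows, writing one stream entry per file by hand becomes repetitive. The following is a minimal sketch, not part of dagster-sling or Sling, that builds the same structure as the replication.yaml above from a list of file names. The function name `build_replication` and the `schema` parameter are hypothetical; the keys and values mirror the file shown above.

```python
from pathlib import PurePosixPath


def build_replication(csv_files, schema="main"):
    """Return a dict shaped like the replication.yaml above.

    `build_replication` and `schema` are illustrative names, not Sling APIs.
    """
    streams = {}
    for name in csv_files:
        # "raw_customers.csv" -> "raw_customers"
        table = PurePosixPath(name).stem
        streams[f"file://{name}"] = {"object": f"{schema}.{table}"}
    return {
        "source": "LOCAL",
        "target": "DUCKDB",
        "defaults": {"mode": "full-refresh", "object": "{stream_table}"},
        "streams": streams,
    }


replication = build_replication(
    ["raw_customers.csv", "raw_orders.csv", "raw_payments.csv"]
)
for stream, config in replication["streams"].items():
    print(stream, "->", config["object"])
```

The resulting dictionary can be written out with any YAML library, for example PyYAML's `yaml.safe_dump`, if you prefer to generate the file rather than edit it by hand.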

Our newly configured Sling component will produce an asset for each replicated file:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                            ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│         │ ┃ Key                       ┃ Group   ┃ Deps                   ┃ Kinds ┃ Description ┃ │
│         │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│         │ │ file_raw_customers/csv    │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ file_raw_orders/csv       │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ file_raw_payments/csv     │ default │                        │       │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_customers │ default │ file_raw_customers/csv │ sling │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_orders    │ default │ file_raw_orders/csv    │ sling │             │ │
│         │ ├───────────────────────────┼─────────┼────────────────────────┼───────┼─────────────┤ │
│         │ │ target/main/raw_payments  │ default │ file_raw_payments/csv  │ sling │             │ │
│         │ └───────────────────────────┴─────────┴────────────────────────┴───────┴─────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────┘
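
The listing shows two assets per stream: an upstream asset named after the source file and a target asset named after the destination object. The sketch below reproduces the naming pattern visible in this output so you can predict keys for your own streams. It is inferred from the listing only and is not the dagster-sling implementation; `upstream_key` and `target_key` are hypothetical helper names, and the real translator may treat other inputs differently.

```python
import re


def upstream_key(stream_name: str) -> str:
    """Mimic the upstream keys in the listing above (an assumption)."""
    # Collapse runs of characters other than letters, digits, "_" and "."
    # into a single underscore: "file://raw_orders.csv" -> "file_raw_orders.csv"
    cleaned = re.sub(r"[^A-Za-z0-9_.]+", "_", stream_name)
    # Dots become key path separators: "file_raw_orders.csv" -> "file_raw_orders/csv"
    return "/".join(cleaned.split("."))


def target_key(object_name: str, prefix: str = "target") -> str:
    """Mimic the target keys in the listing above (an assumption)."""
    # "main.raw_orders" -> "target/main/raw_orders"
    return "/".join([prefix, *object_name.split(".")])


streams = {
    "file://raw_customers.csv": "main.raw_customers",
    "file://raw_orders.csv": "main.raw_orders",
    "file://raw_payments.csv": "main.raw_payments",
}
for stream, destination in streams.items():
    print(f"{upstream_key(stream)} -> {target_key(destination)}")
```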

4. Customize Sling assets

Properties of the assets emitted by each replication can be customized in the defs.yaml file using the translation key:

my_project/defs/sling_ingest/defs.yaml
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  connections:
    DUCKDB:
      type: duckdb
      instance: /tmp/my_project.duckdb
  replications:
    - path: ./replication.yaml
      translation:
        group_name: sling_data
        description: "Loads data from Sling replication {{ stream_definition.name }}"

Listing the definitions again shows the new group and description on the target assets:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions                                                                                                ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets  │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│         │ ┃ Key                       ┃ Group      ┃ Deps                   ┃ Kinds ┃ Description                  ┃ │
│         │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│         │ │ file_raw_customers/csv    │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ file_raw_orders/csv       │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ file_raw_payments/csv     │ default    │                        │       │                              │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_customers │ sling_data │ file_raw_customers/csv │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_customers.csv     │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_orders    │ sling_data │ file_raw_orders/csv    │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_orders.csv        │ │
│         │ ├───────────────────────────┼────────────┼────────────────────────┼───────┼──────────────────────────────┤ │
│         │ │ target/main/raw_payments  │ sling_data │ file_raw_payments/csv  │ sling │ Loads data from Sling        │ │
│         │ │                           │            │                        │       │ replication                  │ │
│         │ │                           │            │                        │       │ file://raw_payments.csv      │ │
│         │ └───────────────────────────┴────────────┴────────────────────────┴───────┴──────────────────────────────┘ │
└─────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
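
The description column shows the `{{ stream_definition.name }}` placeholder replaced with each stream's name. To illustrate the substitution, here is a small standard-library sketch of that kind of template expansion. It is not how Dagster resolves templates internally; the `render` helper and the stand-in `stream` object are hypothetical, and only the template string and the `name` attribute come from the configuration above.

```python
import re
from types import SimpleNamespace

TEMPLATE = "Loads data from Sling replication {{ stream_definition.name }}"


def render(template: str, **scope) -> str:
    """Expand {{ dotted.path }} placeholders against the given scope.

    Hypothetical helper for illustration; Dagster uses its own resolver.
    """

    def resolve(match: re.Match) -> str:
        root, *attrs = match.group(1).split(".")
        value = scope[root]
        for attr in attrs:
            value = getattr(value, attr)
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", resolve, template)


# Stand-in for the stream object that the component supplies (an assumption).
stream = SimpleNamespace(name="file://raw_customers.csv")
description = render(TEMPLATE, stream_definition=stream)
print(description)
```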