Ingest data with Sling
Sling is a data movement tool that replicates tables between databases, data warehouses, and file systems. In this step, you'll configure Sling to copy tables from Postgres to DuckDB and register each table as a Dagster asset.
Step 1: Scaffold the Sling component
Use dg scaffold to create the component folder:
dg scaffold defs dagster_sling.SlingReplicationCollectionComponent sling_ingest
This creates a sling_ingest folder with two files:
src/project_elt_pipeline/defs/
└── sling_ingest/
    ├── defs.yaml
    └── replication.yaml
defs.yaml already has the necessary configuration and is ready to use:
type: dagster_sling.SlingReplicationCollectionComponent
attributes:
  replications:
    - path: replication.yaml
replication.yaml is where you define what to replicate and where to send it. You'll fill it in during the next step.
Step 2: Configure the replication
Step 2.1: Update replication.yaml
Replace the contents of defs/sling_ingest/replication.yaml with your Postgres-to-DuckDB config:
source: POSTGRES_SOURCE
target: DUCKDB_TARGET
defaults:
  mode: full-refresh
  object: "{stream_schema}_{stream_table}"
streams:
  public.users:
    mode: incremental
    primary_key: id
    update_key: updated_at
    object: users
  public.orders:
    mode: incremental
    primary_key: id
    update_key: updated_at
    object: orders
  public.products:
    mode: full-refresh
    object: products
env:
  POSTGRES_SOURCE: |
    type: postgres
    host: ${POSTGRES_HOST}
    port: ${POSTGRES_PORT:-5432}
    database: ${POSTGRES_DATABASE}
    user: ${POSTGRES_USER}
    password: ${POSTGRES_PASSWORD}
    sslmode: ${POSTGRES_SSLMODE:-prefer}
  DUCKDB_TARGET: |
    type: duckdb
    instance: ${DEST_DUCKDB_PATH:-./data.duckdb}
A few things to note:
- POSTGRES_SOURCE and DUCKDB_TARGET are the connection names used in source: and target: at the top. Sling reads their definitions from the env: block.
- mode: incremental tracks changes since the last run using update_key, while full-refresh replaces the table on every run.
- The object: field sets the destination table name.
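The difference between the two modes can be sketched in plain Python. This is a conceptual model only, not Sling's implementation: dicts stand in for tables, and the function names full_refresh and incremental are illustrative. It assumes update_key values sort in chronological order (here, ISO date strings).

```python
# Conceptual model only: plain dicts stand in for database tables.
# This is not how Sling is implemented.

def full_refresh(source_rows, primary_key="id"):
    """Rebuild the target from scratch using the current source rows."""
    return {row[primary_key]: dict(row) for row in source_rows}


def incremental(target, source_rows, primary_key="id", update_key="updated_at"):
    """Upsert only rows whose update_key is newer than what the target holds."""
    # Highest update_key already loaded; None on the very first run.
    watermark = max((row[update_key] for row in target.values()), default=None)
    merged = dict(target)
    for row in source_rows:
        if watermark is None or row[update_key] > watermark:
            merged[row[primary_key]] = dict(row)  # insert, or overwrite by key
    return merged


source = [
    {"id": 1, "name": "Ada", "updated_at": "2024-01-01"},
    {"id": 2, "name": "Grace", "updated_at": "2024-01-02"},
]
target = incremental({}, source)  # first run loads both rows

# After the first run, row 2 changes and row 3 arrives in the source.
source[1] = {"id": 2, "name": "Grace H.", "updated_at": "2024-01-05"}
source.append({"id": 3, "name": "Edsger", "updated_at": "2024-01-06"})
target = incremental(target, source)  # second run touches only rows 2 and 3
```

In this model, the incremental run skips row 1 because its updated_at is not newer than the stored watermark, which is why incremental streams need both a primary_key and an update_key.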
Step 2.2: Set connection details
Set the connection details in a .env file at your project root:
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DATABASE=mydb
POSTGRES_USER=myuser
POSTGRES_PASSWORD=mypassword
DEST_DUCKDB_PATH=./data.duckdb
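Before moving on, it can help to confirm that every placeholder without a default has a value. The sketch below is a hypothetical pre-flight check, not part of Sling or Dagster: parse_dotenv and missing_variables are illustrative helpers, the .env parser handles only simple KEY=VALUE lines, and it assumes ${VAR:-default} carries its usual shell meaning (use the default when VAR is unset).

```python
import re

# Matches ${NAME} and ${NAME:-default}; group 2 is None when no default is given.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def missing_variables(config_text, env):
    """Return placeholder names that have neither a default nor a value in env."""
    missing = []
    for match in PLACEHOLDER.finditer(config_text):
        name, default = match.group(1), match.group(2)
        if default is None and name not in env and name not in missing:
            missing.append(name)
    return missing


# Placeholders copied from the env: block of replication.yaml.
CONNECTIONS = """
host: ${POSTGRES_HOST}
port: ${POSTGRES_PORT:-5432}
database: ${POSTGRES_DATABASE}
user: ${POSTGRES_USER}
password: ${POSTGRES_PASSWORD}
sslmode: ${POSTGRES_SSLMODE:-prefer}
instance: ${DEST_DUCKDB_PATH:-./data.duckdb}
"""

# A deliberately incomplete .env, to show what the check reports.
DOTENV = """
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DATABASE=mydb
POSTGRES_USER=myuser
"""

print(missing_variables(CONNECTIONS, parse_dotenv(DOTENV)))
# prints ['POSTGRES_PASSWORD']
```

POSTGRES_SSLMODE and DEST_DUCKDB_PATH are not reported because their placeholders carry defaults.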
Step 3: Verify the component
Run dg check defs to confirm the component loads without errors:
dg check defs
Step 4: View assets in Dagster
Reload definitions in the Dagster UI (Deployment > Reload definitions). You should see three new assets grouped under sling_ingest: users, orders, and products. Each asset corresponds to one stream in replication.yaml.
With a running Postgres instance and your .env file in place, click Materialize all to run the replication.
How assets are generated
SlingReplicationCollectionComponent reads replication.yaml and creates one Dagster asset per stream, so you don't need to write any @asset functions. Add a stream to replication.yaml and a new asset appears in the graph. Remove a stream and its asset disappears.
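The one-entry-per-stream idea, together with how stream settings layer over defaults:, can be sketched with a plain dict that mirrors replication.yaml. This is an illustration only: resolve_streams is a hypothetical helper, public.events is a hypothetical extra stream with no overrides, and the real component relies on Sling's own parsing and on dagster-sling to decide asset keys.

```python
# Plain-dict mirror of the defaults: and streams: sections of replication.yaml.
# "public.events" is a hypothetical stream added to show the defaults in action.
REPLICATION = {
    "defaults": {"mode": "full-refresh", "object": "{stream_schema}_{stream_table}"},
    "streams": {
        "public.users": {"mode": "incremental", "object": "users"},
        "public.orders": {"mode": "incremental", "object": "orders"},
        "public.products": {"mode": "full-refresh", "object": "products"},
        "public.events": None,
    },
}


def resolve_streams(config):
    """Return one entry per stream, with stream settings layered over defaults."""
    resolved = {}
    for name, overrides in config["streams"].items():
        settings = {**config.get("defaults", {}), **(overrides or {})}
        schema, _, table = name.partition(".")
        # Fill in the template placeholders used by the object: field.
        settings["object"] = settings["object"].format(
            stream_schema=schema, stream_table=table
        )
        resolved[name] = settings
    return resolved


for stream, settings in resolve_streams(REPLICATION).items():
    print(stream, "->", settings["object"], f"({settings['mode']})")
```

The three streams from this guide keep their explicit object names, while the hypothetical public.events stream falls back to the defaults and resolves to public_events in full-refresh mode.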
Next steps
- Continue this example by adding dlt ingestion