Dagster ETL pipeline

In this tutorial, you'll build a full ETL pipeline with Dagster that:

  • Ingests data into DuckDB
  • Transforms data into reports with dbt
  • Runs scheduled reports automatically
  • Generates one-time reports on demand
  • Visualizes the data with Evidence

Prerequisites

To follow the steps in this guide, you'll need:

  • Python 3.9+ and uv installed. For more information, see the Installation guide.
  • Familiarity with Python and SQL.
  • A basic understanding of data pipelines and the extract, transform, and load (ETL) process.
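Before scaffolding the project, it can help to confirm the first prerequisite from Python. The following is a minimal sketch and not part of the official tutorial: the helper name `check_prerequisites` is hypothetical, and it only checks that the running interpreter is at least Python 3.9 and that a `uv` executable can be found on `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_version=(3, 9)):
    """Return a dict describing whether each prerequisite appears satisfied.

    Hypothetical helper for illustration; not provided by Dagster or uv.
    """
    return {
        # Compare only (major, minor) against the minimum required version.
        "python": sys.version_info[:2] >= min_version,
        # shutil.which returns None when no `uv` executable is found on PATH.
        "uv": shutil.which("uv") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

If either entry is reported as missing, see the Installation guide referenced above before continuing.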

Step 1: Set up your Dagster environment

  1. Open your terminal and scaffold a new Dagster project:

    uvx -U create-dagster project etl-tutorial
  2. Respond y to the prompt to run uv sync after scaffolding.

  3. Change to the etl-tutorial directory:

    cd etl-tutorial
  4. Activate the virtual environment (the command shown is for macOS and Linux shells):

    source .venv/bin/activate
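To confirm that the activation in the last step took effect, you can ask the Python interpreter whether it is running inside a virtual environment. This is a small sketch that relies only on the standard library; the function name `in_virtualenv` is an assumption made for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a virtual environment."""
    # In a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

When the environment is active, the printed prefix should point at the `.venv` directory inside `etl-tutorial`.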

Step 2: Launch the Dagster webserver

To make sure Dagster and its dependencies were installed correctly, navigate to the project root directory and start the Dagster webserver:

dg dev
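Once `dg dev` is running, a quick way to check that the webserver is answering is to request its address from a second terminal. The sketch below assumes the server listens on `http://127.0.0.1:3000`; that address is an assumption here, so use whatever URL `dg dev` prints in your terminal. The helper name `webserver_is_up` is hypothetical.

```python
import urllib.error
import urllib.request


def webserver_is_up(url="http://127.0.0.1:3000", timeout=2.0):
    """Return True if an HTTP server answers at `url` within `timeout` seconds.

    The default URL is an assumption; pass the address printed by `dg dev`.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded with an error status, so it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    state = "reachable" if webserver_is_up() else "not reachable"
    print(f"Dagster webserver is {state}")
```

A result of "not reachable" usually means `dg dev` is not running yet or is listening on a different address.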

Next steps