Organizing your Dagster project

There are many ways to structure your Dagster project, and it can be difficult to know where to start. In this guide, we will walk you through our recommendations for organizing your Dagster project.

Initial project structure

When you first create a project using the create-dagster CLI, it looks like the following:

tree my-project

my-project
├── pyproject.toml
├── README.md
├── src
│   └── my_project
│       ├── __init__.py
│       ├── definitions.py
│       └── defs
│           └── __init__.py
├── tests
│   └── __init__.py
└── uv.lock

5 directories, 7 files

If your project is not managed with uv, the layout is the same but without uv.lock:

tree my-project

my-project
├── pyproject.toml
├── README.md
├── src
│   └── my_project
│       ├── __init__.py
│       ├── definitions.py
│       └── defs
│           └── __init__.py
└── tests
    └── __init__.py

5 directories, 6 files

Tip: To use tree, install it with brew install tree (Mac), or follow the installation instructions (Windows and Linux).

This is a reasonable structure when you are first getting started. However, as you begin to introduce more assets, jobs, resources, sensors, and utility code, you may find that your Python files are growing too large to manage.

Reorganizing your project

Deciding how to organize your project is often influenced by how you and your team members operate. This guide will outline two possible project structures: organized by technology, and organized by concept.

Organized by technology

Data engineers often have a strong understanding of the underlying technologies that are used in their data pipelines. Because of that, it's often helpful to organize your project by technology. This enables engineers to easily navigate the codebase and locate files pertaining to the specific technology.

Within the technology modules, submodules can be created to further organize your code.

tree my-project

my-project
├── pyproject.toml
├── README.md
├── src
│   └── my_project
│       ├── __init__.py
│       ├── definitions.py
│       └── defs
│           ├── __init__.py
│           ├── dbt
│           │   ├── assets.py
│           │   └── resources.py
│           └── dlt
│               ├── assets.py
│               ├── pipelines
│               │   ├── github.py
│               │   └── hubspot.py
│               └── resources.py
├── tests
│   └── __init__.py
└── uv.lock

8 directories, 13 files

If your project is not managed with uv, the layout is the same but without uv.lock:

tree my-project

my-project
├── pyproject.toml
├── README.md
├── src
│   └── my_project
│       ├── __init__.py
│       ├── definitions.py
│       └── defs
│           ├── __init__.py
│           ├── dbt
│           │   ├── assets.py
│           │   └── resources.py
│           └── dlt
│               ├── assets.py
│               ├── pipelines
│               │   ├── github.py
│               │   └── hubspot.py
│               └── resources.py
└── tests
    └── __init__.py

8 directories, 12 files
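The technology-based layout above can be scaffolded with a few lines of standard-library Python. This is only a minimal sketch: the scaffold function and the TECHNOLOGY_LAYOUT list are hypothetical helpers written for illustration, not part of Dagster or the create-dagster CLI, and the file names are copied from the tree listing above.

```python
from pathlib import Path

# Files to create, relative to src/my_project/defs.
# The names mirror the example tree; adjust them to your own technologies.
TECHNOLOGY_LAYOUT = [
    "dbt/assets.py",
    "dbt/resources.py",
    "dlt/assets.py",
    "dlt/resources.py",
    "dlt/pipelines/github.py",
    "dlt/pipelines/hubspot.py",
]


def scaffold(defs_dir: Path, relative_files: list[str]) -> list[Path]:
    """Create each missing file (and its parent directories) under defs_dir.

    Existing files are left untouched. Returns the paths that were created.
    """
    created = []
    for relative in relative_files:
        target = defs_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.touch()
            created.append(target)
    return created
```

Calling scaffold(Path("src/my_project/defs"), TECHNOLOGY_LAYOUT) from the root of my-project would create the empty modules shown in the tree. Because existing files are skipped, running it a second time changes nothing.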

Organized by concept

You can also organize your project by data processing concept, such as data transformation, ingestion, or processing. This provides additional context to engineers who may not be as familiar with the underlying technologies.

tree my-project

my-project
├── pyproject.toml
├── README.md
├── src
│   └── my_project
│       ├── __init__.py
│       ├── definitions.py
│       └── defs
│           ├── __init__.py
│           ├── ingestion
│           │   └── dlt
│           │       ├── assets.py
│           │       └── resources.py
│           └── transformation
│               ├── adhoc
│               │   ├── assets.py
│               │   └── resources.py
│               └── dbt
│                   ├── assets.py
│                   ├── partitions.py
│                   └── resources.py
├── tests
│   └── __init__.py
└── uv.lock

10 directories, 14 files

If your project is not managed with uv, the layout is the same but without uv.lock:

tree my-project

my-project
├── pyproject.toml
├── README.md
├── src
│   └── my_project
│       ├── __init__.py
│       ├── definitions.py
│       └── defs
│           ├── __init__.py
│           ├── ingestion
│           │   └── dlt
│           │       ├── assets.py
│           │       └── resources.py
│           └── transformation
│               ├── adhoc
│               │   ├── assets.py
│               │   └── resources.py
│               └── dbt
│                   ├── assets.py
│                   ├── partitions.py
│                   └── resources.py
└── tests
    └── __init__.py

10 directories, 13 files
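If you start with a technology-based layout and later switch to a concept-based one, it can help to plan the file moves before touching the repository. The sketch below only computes the old and new relative paths and moves nothing. The plan_moves function and the CONCEPT_BY_TECHNOLOGY mapping are hypothetical names chosen for this example; the mapping of dlt to ingestion and dbt to transformation follows the tree above.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from a technology folder to the concept folder
# that should contain it in the reorganized layout.
CONCEPT_BY_TECHNOLOGY = {"dlt": "ingestion", "dbt": "transformation"}


def plan_moves(relative_files, concept_by_technology):
    """Return (old, new) path pairs, nesting each technology folder
    under its concept folder. Paths are relative to the defs directory."""
    moves = []
    for relative in relative_files:
        path = PurePosixPath(relative)
        technology = path.parts[0]
        concept = concept_by_technology.get(technology)
        if concept is None:
            continue  # no mapping: leave the file where it is
        moves.append((str(path), str(PurePosixPath(concept) / path)))
    return moves
```

For example, plan_moves(["dbt/assets.py"], CONCEPT_BY_TECHNOLOGY) pairs dbt/assets.py with transformation/dbt/assets.py. Reviewing the plan first makes it easier to update imports in the same change as the move.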

External projects

As your data platform evolves, you can integrate other data tools, such as dbt, Sling, or Jupyter notebooks.

We recommend storing these projects outside your Dagster project, as demonstrated in the dbt_project example below.

.
├── dbt_project/
│   ├── config/
│   │   └── profiles.yml
│   ├── dbt_project.yml
│   ├── macros/
│   │   ├── aggregate_actions.sql
│   │   └── generate_schema_name.sql
│   ├── models/
│   │   ├── activity_analytics/
│   │   │   ├── activity_daily_stats.sql
│   │   │   ├── comment_daily_stats.sql
│   │   │   └── story_daily_stats.sql
│   │   ├── schema.yml
│   │   └── sources.yml
│   └── tests/
│       └── assert_true.sql
└── example-dagster-project/
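When the external project lives next to the Dagster project, code inside the Dagster project needs a way to locate it. One possible approach, sketched here with only the standard library, is to walk up from the current file until a directory with the expected name appears. The find_sibling_project helper is a hypothetical name used for illustration and is not a Dagster API; the default directory name dbt_project comes from the example above.

```python
from pathlib import Path


def find_sibling_project(start: Path, name: str = "dbt_project") -> Path:
    """Search start and each of its ancestors for a child directory called name.

    Returns the first match, or raises FileNotFoundError if none exists.
    """
    for parent in [start, *start.parents]:
        candidate = parent / name
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"no {name!r} directory found at or above {start}")
```

From a module inside the Dagster project, this could be called as find_sibling_project(Path(__file__).resolve().parent). Searching upward avoids hard-coding how many directory levels separate the module from the repository root, so the lookup keeps working if the module is moved deeper into defs.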

Using a workspace to manage multiple projects

This guide outlines how to structure a single Dagster project. Most people will only need one project. However, Dagster also allows you to create a workspace with multiple projects.

A helpful pattern uses a workspace with multiple projects to separate conflicting dependencies, where each project has its own package requirements and deployment specs. For more information, see Creating workspaces to manage multiple projects.
