Machine learning with PyTorch

In this example, you'll build a complete CNN-based digit classifier and learn how to:

  • Build production-ready ML pipelines using Dagster's asset-based architecture
  • Train and deploy CNN models with automated quality gates and rollback capabilities
  • Implement configurable training workflows that adapt across development and production environments
  • Create scalable inference services supporting both batch and real-time prediction scenarios

Prerequisites

To follow the steps in this guide, you'll need:

  • Basic Python knowledge
  • Python 3.9+ installed on your system. Refer to the Installation guide for setup instructions.
  • Basic familiarity with machine learning concepts (neural networks, training/validation splits)
  • Understanding of PyTorch fundamentals (tensors, models, training loops)

Step 1: Set up your Dagster environment

First, set up a new Dagster project with the ML dependencies.

  1. Clone the Dagster repo and navigate to the project:

    cd examples/docs_projects/project_ml

  2. Install the required dependencies with uv:

    uv sync

  3. Activate the virtual environment:

    source .venv/bin/activate
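
Before moving on, it can help to confirm that the activated environment meets the prerequisites. The following is a minimal sketch, not part of the project itself: the helper name check_environment is hypothetical, and the import names dagster and torch are assumptions about what the project's dependencies expose.

```python
import importlib.util
import sys

# Assumed import names; the project's actual dependency list may differ.
REQUIRED_MODULES = ("dagster", "torch")


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 9)):
    """Return a dict mapping each check to True (passed) or False (failed)."""
    # The guide asks for Python 3.9 or newer.
    results = {"python": sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        results[name] = importlib.util.find_spec(name) is not None
    return results


if __name__ == "__main__":
    for check, ok in check_environment().items():
        print(f"{check}: {'ok' if ok else 'MISSING'}")
```

If any entry reports MISSING, re-running uv sync inside the project directory is a reasonable first thing to try.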

Step 2: Launch the Dagster webserver

To make sure Dagster and its dependencies were installed correctly, navigate to the project root directory and start the Dagster webserver:

    dg dev
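
Once the command is running, the webserver should be reachable from a browser. As a rough sketch, the check below probes the server from a second terminal. It assumes the default address http://localhost:3000; the terminal output of the command shows the actual URL if a different port was chosen. The helper name webserver_is_up is hypothetical.

```python
import urllib.error
import urllib.request


def webserver_is_up(url="http://localhost:3000", timeout=2.0):
    """Return True if an HTTP server responds successfully at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status:
        # treat all of these as "not running".
        return False


if __name__ == "__main__":
    print("webserver reachable:", webserver_is_up())
```

A False result usually means the server has not finished starting or is listening on a different port.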

Next steps