dagster

Data pipeline orchestrator

An orchestration platform for data pipelines and assets, providing a declarative programming model and integrated lineage and observability.

An orchestration platform for the development, production, and observation of data assets.

GitHub

12k stars
123 watching
1k forks
Language: Python
Last commit: 1 day ago
Linked from 10 awesome lists

Topics: analytics, dagster, data-engineering, data-integration, data-orchestrator, data-pipelines, data-science, etl, metadata, mlops, orchestration, python, scheduler, workflow, workflow-automation


Related projects:

Repository | Description | Stars
moby/datakit | A tool to orchestrate applications using a version-controlled dataflow | 1,082
databand-ai/dbnd | A framework for building and tracking data pipelines to simplify data engineering workflows | 251
pipefunc/pipefunc | Automates and simplifies the creation of function pipelines for efficient execution of scientific workflows. | 215
it4innovations/hyperloom | A platform for defining and executing scientific pipelines in distributed environments using C++ and Python. | 16
streamsets/datacollector-oss | A continuous big data ingestion platform that enables easy creation of data pipelines for various data sources and destinations. | 90
apache/airflow | A platform to programmatically author, schedule and monitor complex workflows | 37,265
huawei/containerops | An orchestration platform for automating DevOps workflows by combining tools and services into a single, GUI-based solution | 338
dagworks-inc/hamilton | Helps define and manage data transformations with a modular, self-documenting, and portable framework for directed acyclic graphs (DAGs) of data transformations. | 1,876
synacker/daggy | A utility and developer library for data streams catching and aggregation | 153
danielgerlag/conductor | A distributed workflow management system that coordinates services and scripts into complex workflows. | 532
galaxyproject/galaxy | An integrated framework for data-intensive scientific analysis and workflow management | 1,410
dataman-cloud/swan | A Mesos scheduler that enables deployment and management of long-running applications with high availability and scalability. | 409
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 183
couler-proj/couler | Provides a unified interface for constructing and managing workflows across different workflow engines. | 915
apache/streampipes | A toolbox for industrial data analytics and stream processing | 607