dagster
Data pipeline orchestrator
An orchestration platform for data pipelines and assets, providing a declarative programming model and integrated lineage and observability.
An orchestration platform for the development, production, and observation of data assets.
12k stars
123 watching
1k forks
Language: Python
Last commit: 1 day ago
Linked from 10 awesome lists
analytics, dagster, data-engineering, data-integration, data-orchestrator, data-pipelines, data-science, etl, metadata, mlops, orchestration, python, scheduler, workflow, workflow-automation
Backlinks from these awesome lists:
- ethicalml/awesome-production-machine-learning
- runacapital/awesome-oss-alternatives
- oxnr/awesome-bigdata
- igorbarinov/awesome-data-engineering
- pditommaso/awesome-pipeline
- kelvins/awesome-mlops
- gunnarmorling/awesome-opensource-data-engineering
- vihar/awesome-oss-saas
- kelvins/awesome-dataops
- simomay/find-oss
Related projects:
Repository | Description | Stars |
---|---|---|
moby/datakit | A tool to orchestrate applications using a version-controlled dataflow | 1,082 |
databand-ai/dbnd | A framework for building and tracking data pipelines to simplify data engineering workflows | 251 |
pipefunc/pipefunc | Automates and simplifies the creation of function pipelines for efficient execution of scientific workflows. | 215 |
it4innovations/hyperloom | A platform for defining and executing scientific pipelines in distributed environments using C++ and Python. | 16 |
streamsets/datacollector-oss | A continuous big data ingestion platform that enables easy creation of data pipelines for various data sources and destinations. | 90 |
apache/airflow | A platform to programmatically author, schedule and monitor complex workflows | 37,265 |
huawei/containerops | An orchestration platform for automating DevOps workflows by combining tools and services into a single, GUI-based solution | 338 |
dagworks-inc/hamilton | Helps define and manage data transformations with a modular, self-documenting, and portable framework for directed acyclic graphs (DAGs) of data transformations. | 1,876 |
synacker/daggy | A utility and developer library for data streams catching and aggregation | 153 |
danielgerlag/conductor | A distributed workflow management system that coordinates services and scripts into complex workflows. | 532 |
galaxyproject/galaxy | An integrated framework for data-intensive scientific analysis and workflow management | 1,410 |
dataman-cloud/swan | A Mesos scheduler that enables deployment and management of long-running applications with high availability and scalability. | 409 |
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 183 |
couler-proj/couler | Provides a unified interface for constructing and managing workflows across different workflow engines. | 915 |
apache/streampipes | A toolbox for industrial data analytics and stream processing | 607 |