hamilton

Dataflow framework

A modular, self-documenting, and portable framework for defining and managing directed acyclic graphs (DAGs) of data transformations.

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. It runs and scales everywhere Python does.
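
The idea of a dataflow built from small, testable functions can be sketched with the Python standard library alone. The block below is not Hamilton's API. It is a minimal illustration that assumes a convention in which each function name is a node of the DAG and each parameter name refers to an upstream node or an external input. The function names, the `resolve` helper, and the sample data are all hypothetical.

```python
import inspect

# Hypothetical transformations: the function name is the node name,
# and each parameter names an upstream node or an external input.
def spend_per_signup(spend, signups):
    return [s / n for s, n in zip(spend, signups)]

def avg_spend_per_signup(spend_per_signup):
    return sum(spend_per_signup) / len(spend_per_signup)

def resolve(name, functions, inputs, cache=None):
    """Compute node `name` by recursively resolving its dependencies.

    Illustrative helper only: it does no cycle detection and raises
    KeyError for a name that is neither an input nor a known function.
    """
    cache = {} if cache is None else cache
    if name in inputs:
        return inputs[name]
    if name not in cache:
        fn = functions[name]
        deps = inspect.signature(fn).parameters
        cache[name] = fn(**{d: resolve(d, functions, inputs, cache) for d in deps})
    return cache[name]

functions = {f.__name__: f for f in (spend_per_signup, avg_spend_per_signup)}
inputs = {"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 3]}
result = resolve("avg_spend_per_signup", functions, inputs)
```

Because the dependencies are read from the function signatures, the lineage of any output can be traced from the code itself, and each function can be unit-tested in isolation with plain arguments.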

GitHub

2k stars
17 watching
127 forks
Language: Jupyter Notebook
Last commit: about 1 month ago
Linked from 4 awesome lists

dag, data-analysis, data-engineering, data-science, dataframe, etl, etl-framework, etl-pipeline, feature-engineering, hacktoberfest, lineage, llmops, machine-learning, mlops, orchestration, pandas, python, rag, software-engineering

Related projects:

Repository | Description | Stars
nipype/pydra | A lightweight Python dataflow engine for building and executing directed acyclic graphs (DAGs) in a scalable manner. | 123
pipefunc/pipefunc | Automates and simplifies the creation of function pipelines for efficient execution of scientific workflows. | 230
dagworks-inc/burr | A framework for building applications with state management and decision-making capabilities using LLMs and graphs. | 1,368
rhosocial/go-dag | A framework for managing and executing workflows described by directed acyclic graphs. | 23
chunelfeng/cgraph | A cross-platform framework for building and executing directed acyclic graphs (DAGs) in C++. | 1,815
man-group/mdf | A toolkit for expressing programs as directed acyclic graphs and wiring together computations over time-series data. | 169
symphony09/ograph | A framework for building data pipelines with concurrent execution and dependency management. | 33
graphprotocol/graph-client | A library and toolset for building fast, performant GraphQL-based decentralized applications. | 177
eclipse-zenoh-flow/zenoh-flow | A framework for declarative data flow programming and edge computing. | 92
daostack/arc | A platform providing a modular, upgradeable infrastructure for decentralized autonomous organizations (DAOs) on the Ethereum blockchain. | 47
yadage/adage | A package to dynamically build and manage directed acyclic graphs (DAGs) of tasks that can be executed in parallel or sequentially. | 56
erikbrinkman/d3-dag | A library that provides a data structure and algorithms for visualizing directed acyclic graphs. | 1,460
dagster-io/dagster | An orchestration platform for data pipelines and assets, providing a declarative programming model and integrated lineage and observability. | 12,055
jexia/semaphore | A tool to manage and expose complex data flows through multiple protocols. | 94
google/digitalbuildings | Provides tools and an ontology for representing and managing structured information about buildings and equipment. | 375