SmartPipeline

Data pipeline framework

A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency

A framework for rapid development of robust data pipelines following a simple design pattern

GitHub

23 stars
2 watching
3 forks
Language: Python
Last commit: 9 months ago
Linked from 1 awesome list

Tags: data-analysis, data-analytics, data-mining, data-pipelines, data-processing, data-science, dataops, design-patterns, etl, machine-learning, mlops, pipeline, pipeline-framework, pipelines, reproducibility, task-queue, workflow

Related projects:

Repository | Description | Stars
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,043
vectaport/flowgraph | A software framework for building scalable, asynchronous data pipelines with explicit back-pressure management and logging capabilities | 60
pdpipe/pdpipe | A tool for creating and managing data pipelines with pandas DataFrames | 716
log2timeline/dftimewolf | A framework for orchestrating data collection, processing, and export | 296
ypares/porcupine | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments | 89
databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments | 901
druths/xp | A tool for creating flexible and self-documenting data science pipelines | 56
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 183
mara/mara-pipelines | A lightweight ETL framework providing a simple way to define and execute data transformation pipelines using declarative Python code | 2,082
johnsonc/lambdo | A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines | 1
galaxyproject/galaxy | An integrated framework for data-intensive scientific analysis and workflow management | 1,410
m3dev/gokart | A framework that solves common problems in machine learning pipeline development and provides an environment for reproducibility and team collaboration | 318
paysure/orinoco | A functional composable pipeline framework for Python that separates business logic from implementation | 11
symphony09/ograph | A framework for building data pipelines with concurrent execution and dependency management | 32
calebwin/pipelines | A language and runtime for crafting massively parallel data pipelines | 374