pdpipe

Data pipeline builder

A tool for creating and managing data pipelines with pandas DataFrames

Easy pipelines for pandas DataFrames.

GitHub

716 stars
18 watching
45 forks
Language: Jupyter Notebook
Last commit: 21 days ago
Linked from 1 awesome list

data, data-science, dataframe, dataframes, pandas, pandas-dataframe, pipeline

Related projects:

Repository | Description | Stars
ypares/porcupine | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments | 89
giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency | 23
pwwang/pipen | A Python-based workflow automation framework that enables easy creation of data processing pipelines | 103
olirice/flupy | A library that provides a fluent interface for processing data pipelines in Python without holding large amounts of memory | 193
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 183
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,043
e2niee/pandapipes | A tool for simulating multi-energy grids by calculating pipe flows in various energy systems | 149
druths/xp | A tool for creating flexible and self-documenting data science pipelines | 56
darky/rocket-pipes | A TypeScript library that enables the creation of modular, composable, and reusable data processing pipelines | 25
headline-design/pipeline-ui | A suite of reusable React components to simplify the development of decentralized Algorand applications | 30
log2timeline/dftimewolf | A framework for orchestrating data collection, processing, and export | 296
moby/datakit | A tool to orchestrate applications using a version-controlled dataflow | 1,082
renkun-ken/piper | Functions and methods to chain operations in R, enhancing readability and maintainability of data pipelines | 169
dr-leo/pandasdmx | Tools to access and manipulate SDMX-compliant data in various formats | 127
prodmodel/prodmodel | A tool for managing data science pipelines by automating build, testing, and deployment processes while ensuring correctness and performance | 59