flupy

Data pipeline processor

A library that provides a fluent interface for processing data pipelines in Python without holding large amounts of data in memory.

Fluent data pipelines for Python and your shell
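
To make the "fluent interface without holding large amounts of memory" idea concrete, here is a minimal sketch of the general pattern using only the standard library. The `Pipeline` class and its method names are hypothetical and written for illustration; they are not flupy's actual API. Each method wraps the underlying iterator in a lazy generator and returns a new object, so calls can be chained and items are only computed when `collect()` pulls them.

```python
from itertools import count, islice
from typing import Any, Callable, Iterable, Iterator, List


class Pipeline:
    """Hypothetical fluent wrapper around an iterable (not flupy's API)."""

    def __init__(self, source: Iterable[Any]) -> None:
        self._it: Iterator[Any] = iter(source)

    def map(self, func: Callable[[Any], Any]) -> "Pipeline":
        # Generator expression: func runs only when an item is requested.
        return Pipeline(func(x) for x in self._it)

    def filter(self, pred: Callable[[Any], bool]) -> "Pipeline":
        # Lazily keep only the items for which pred is true.
        return Pipeline(x for x in self._it if pred(x))

    def take(self, n: int) -> "Pipeline":
        # islice stops after n items, so an infinite source is safe.
        return Pipeline(islice(self._it, n))

    def collect(self) -> List[Any]:
        # The only step that materializes results in memory.
        return list(self._it)


# count() is an infinite stream; only as many items as needed are produced.
result = (
    Pipeline(count())
    .map(lambda x: x * x)
    .filter(lambda x: x % 2 == 0)
    .take(3)
    .collect()
)
print(result)  # [0, 4, 16]
```

Because every stage is a generator, memory use depends on what `collect()` finally gathers rather than on the size of the source.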

193 stars
8 watching
15 forks
Language: Python
Last commit: 2 months ago
Topics: collections, data-pipeline, fluent, python

Related projects:

Repository | Description | Stars
nazar256/parapipe | A non-blocking buffered pipeline library that allows concurrent processing of data while maintaining output order without locks or mutexes. | 31
pdpipe/pdpipe | A tool for creating and managing data pipelines with pandas DataFrames. | 716
silascutler/malpipe | An ingestion and processing framework for malware and indicator data from various feeds. | 103
thephpleague/pipeline | Provides a flexible pipeline pattern implementation to compose sequential stages and process payloads in a composable manner. | 960
julienpalard/pipe | A Python library providing a simple and flexible way to process sequences of data using infix notation. | 1,954
apache/datafusion-python | A Python library that provides a data processing and querying framework using the Apache Arrow in-memory query engine. | 375
ypares/porcupine | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments. | 89
databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments. | 901
h2oai/datatable | A Python package for manipulating 2-dimensional tabular data structures with an emphasis on speed and big data support. | 1,817
deltares/pyflwdir | A Python package for fast and efficient hydrological and topographic data processing. | 75
substantic/rain | A framework for processing large-scale task-based pipelines in a distributed manner. | 748
raine/ramda-cli | A tool for composing functions into data-processing pipelines to produce desired output. | 573
giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency. | 23
druths/xp | A tool for creating flexible and self-documenting data science pipelines. | 56
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines. | 2,043