mario

Data pipeline library

A library for defining complex data pipelines in a functional, typesafe, and efficient way using a declarative syntax.

Functional, Typesafe, Declarative Data Pipelines

139 stars
90 watching
16 forks
Language: Scala
Last commit: almost 7 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| joboccara/pipes | A header-only C++14 library for building expressive data pipelines using a chainable interface. | 803 |
| darky/rocket-pipes | A TypeScript library that enables the creation of modular, composable, and reusable data processing pipelines | 25 |
| giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency | 23 |
| galaxyproject/galaxy | An integrated framework for data-intensive scientific analysis and workflow management | 1,410 |
| kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 183 |
| optics-dev/monocle | A Scala library providing a functional programming style interface to manipulate and compose data structures using lenses and other combinatorial techniques. | 1,654 |
| nessos/streams | A lightweight library for building efficient data pipelines using functional programming concepts | 383 |
| log2timeline/dftimewolf | A framework for orchestrating data collection, processing, and export | 296 |
| scalalandio/chimney | A library for boilerplate-free data transformations using type-safe mapping and automatic conversion. | 1,174 |
| whitaker-io/machine | A library for creating data workflows that can be simple or complex, with features like recursion and memoization. | 158 |
| renkun-ken/piper | Provides functions and methods to chain operations in R, enhancing readability and maintainability of data pipelines. | 169 |
| huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,043 |
| silascutler/malpipe | An ingestion and processing framework for malware and indicator data from various feeds. | 103 |
| databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments. | 901 |
| druths/xp | A tool for creating flexible and self-documenting data science pipelines | 56 |