mario
Data pipeline library
A library for defining complex data pipelines declaratively, in a functional, type-safe, and efficient way.
Functional, Typesafe, Declarative Data Pipelines
139 stars
90 watching
16 forks
Language: Scala
Last commit: almost 7 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
joboccara/pipes | A header-only C++14 library for building expressive data pipelines using a chainable interface. | 803 |
darky/rocket-pipes | A TypeScript library that enables the creation of modular, composable, and reusable data processing pipelines. | 25 |
giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency. | 23 |
galaxyproject/galaxy | An integrated framework for data-intensive scientific analysis and workflow management. | 1,410 |
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines. | 183 |
optics-dev/monocle | A Scala library providing a functional programming style interface to manipulate and compose data structures using lenses and other combinatorial techniques. | 1,654 |
nessos/streams | A lightweight library for building efficient data pipelines using functional programming concepts. | 383 |
log2timeline/dftimewolf | A framework for orchestrating data collection, processing, and export. | 296 |
scalalandio/chimney | A library for boilerplate-free data transformations using type-safe mapping and automatic conversion. | 1,174 |
whitaker-io/machine | A library for creating data workflows that can be simple or complex, with features like recursion and memoization. | 158 |
renkun-ken/piper | Provides functions and methods to chain operations in R, enhancing readability and maintainability of data pipelines. | 169 |
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines. | 2,043 |
silascutler/malpipe | An ingestion and processing framework for malware and indicator data from various feeds. | 103 |
databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments. | 901 |
druths/xp | A tool for creating flexible and self-documenting data science pipelines. | 56 |