capillaries
Data processor
A distributed batch data processing framework that enables scalable and reliable data transformation, filtering, and aggregation.
62 stars
0 watching
2 forks
Language: Go
Last commit: 4 months ago
Linked from 1 awesome list
batch-processing, cassandra, dag, distributed-computing, distributed-systems, go, golang, rabbitmq, relational-algebra, workflow-engine, workflows
Related projects:
Repository | Description | Stars |
---|---|---|
apache/datafusion-python | A Python library that provides a data processing and querying framework using the Apache Arrow in-memory query engine. | 385 |
cmassiot/upipe | A framework for organizing and processing multimedia data in a modular and flexible way. | 1 |
quixio/quix-streams | A Python framework for real-time data processing on Apache Kafka streams. | 1,246 |
whitaker-io/machine | A library for creating data workflows that can be simple or complex, with features like recursion and memoization. | 159 |
tsherwen/ac_tools | A package of tools and functions for processing and analyzing atmospheric model output and observational data. | 14 |
kapolos/pramda | A PHP implementation of functional programming concepts to simplify data processing and analysis. | 245 |
johnsonc/lambdo | A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines. | 1 |
vertica/distributedr | A high-performance platform for large-scale R data processing and analytics. | 163 |
snoyberg/conduit | A framework for handling and transforming streaming data in a consistent and efficient way. | 903 |
databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments. | 901 |
castagna/jena-grande | A collection of utilities and examples for processing RDF data using various big-data technologies. | 24 |
h2oai/datatable | A Python package for manipulating 2-dimensional tabular data structures with an emphasis on speed and big data support. | 1,821 |
wallaroolabs/wally | A distributed stream processing framework for real-time data reactions. | 1,477 |
cube2222/jql | A JSON query processor with a custom syntax that simplifies complex queries by breaking them down into step-by-step operations. | 895 |
mehd-io/pypi-duck-flow | A data engineering project that extracts insights from Python projects using DuckDB and MotherDuck. | 173 |