capillaries
Data processor
A distributed batch data processing framework that enables scalable and reliable data transformation, filtering, and aggregation.
Distributed batch data processing framework
61 stars
0 watching
2 forks
Language: Go
last commit: about 2 months ago
Linked from 1 awesome list
batch-processing, cassandra, dag, distributed-computing, distributed-systems, go, golang, rabbitmq, relational-algebra, workflow-engine, workflows
Related projects:
Repository | Description | Stars |
---|---|---|
apache/datafusion-python | A Python library that provides a data processing and querying framework using the Apache Arrow in-memory query engine. | 375 |
cmassiot/upipe | A dataflow framework designed to process multimedia data buffers in a flexible and modular way. | 1 |
quixio/quix-streams | A Python framework for real-time data processing on Apache Kafka streams. | 1,190 |
whitaker-io/machine | A library for creating data workflows that can be simple or complex, with features like recursion and memoization. | 158 |
tsherwen/ac_tools | A package of tools and functions for processing and analyzing atmospheric model output and observational data. | 13 |
kapolos/pramda | A PHP implementation of functional programming concepts to simplify data processing and analysis. | 245 |
johnsonc/lambdo | A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines. | 1 |
vertica/distributedr | A high-performance platform for large-scale R data processing and analytics. | 163 |
snoyberg/conduit | A framework for handling and transforming streaming data in a consistent and efficient way. | 903 |
databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments. | 901 |
castagna/jena-grande | A collection of utilities and examples for processing RDF data using various big-data technologies. | 24 |
h2oai/datatable | A Python package for manipulating 2-dimensional tabular data structures with an emphasis on speed and big data support. | 1,817 |
wallaroolabs/wally | A distributed stream processing framework for real-time data reactions. | 1,480 |
cube2222/jql | A JSON query processor with a custom syntax that simplifies complex queries by breaking them down into step-by-step operations. | 896 |
mehd-io/pypi-duck-flow | A project to build data pipelines and visualizations for analyzing Python package download data from PyPI. | 148 |