capillaries

Data processor

A distributed batch data processing framework that enables scalable and reliable data transformation, filtering, and aggregation.

GitHub

62 stars

0 watching

2 forks

Language: Go

Last commit: almost 2 years ago

Linked from 1 awesome list

Tags: batch-processing, cassandra, dag, distributed-computing, distributed-systems, go, golang, rabbitmq, relational-algebra, workflow-engine, workflows

Backlinks from these awesome lists:

avelino/awesome-go

Related projects:

Repository	Description	Stars
apache/datafusion-python	A Python library that provides a data processing and querying framework using the Apache Arrow in-memory query engine.	385
cmassiot/upipe	A framework for organizing and processing multimedia data in a modular and flexible way	1
quixio/quix-streams	A Python framework for real-time data processing on Apache Kafka streams	1,246
whitaker-io/machine	A library for creating data workflows that can be simple or complex, with features like recursion and memoization.	159
tsherwen/ac_tools	A package of tools and functions for processing and analyzing atmospheric model output and observational data.	14
kapolos/pramda	A PHP implementation of functional programming concepts to simplify data processing and analysis.	245
johnsonc/lambdo	A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines	1
vertica/distributedr	A high-performance platform for large-scale R data processing and analytics	163
snoyberg/conduit	A framework for handling and transforming streaming data in a consistent and efficient way	903
databiosphere/toil	A workflow management system designed to efficiently run pipelines in various environments.	901
castagna/jena-grande	A collection of utilities and examples for processing RDF data using various big-data technologies.	24
h2oai/datatable	A Python package for manipulating 2-dimensional tabular data structures with an emphasis on speed and big data support.	1,821
wallaroolabs/wally	A distributed stream processing framework for real-time data reactions	1,477
cube2222/jql	A JSON query processor with a custom syntax that simplifies complex queries by breaking them down into step-by-step operations.	895
mehd-io/pypi-duck-flow	A data engineering project that extracts insights from Python projects using DuckDB and MotherDuck.	173