databus

Data pipeline system

A distributed system to capture changes from primary data stores and route them through complex data pipelines.

Source-agnostic distributed change data capture system

GitHub

4k stars

381 watching

736 forks

Language: Java

Last commit: almost 3 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

manuzhang/awesome-streaming

Related projects:

Repository	Description	Stars
datahub-project/datahub	A platform for managing and discovering data across an organization's data stack	10,046
debezium/debezium	A platform that captures and streams data changes from relational databases into Kafka topics	10,836
linkedin/brooklin	A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale	931
linkedin/ambry	A distributed object store designed to efficiently store and serve large media objects in web applications	1,749
mit-pdos/noria	A streaming data-flow system designed to act as a fast storage backend for read-heavy web applications by caching relational query results	5,008
databendlabs/databend	A high-performance, scalable data warehouse built on Rust, offering blazing-fast query execution and real-time analytics capabilities	7,978
ha/doozerd	A distributed data store with real-time updates and high availability features	3,269
netdata/netdata	A high-performance observability platform designed to simplify modern infrastructure monitoring and provide real-time insights into systems and applications	72,607
linkedin/cruise-control	Automates dynamic workload rebalance and self-healing of large-scale Apache Kafka clusters	2,777
voldemort/voldemort	A distributed key-value storage system with data replication, partitioning, and versioning	2,642
lindb/lindb	A high-performance, distributed time series database with horizontal scalability and high availability	3,010
cube-js/cube	A platform for building data applications with efficient data modeling, access control, and performance optimizations	18,068
arch/autohistory	Automatically records and tracks changes to data in databases using Microsoft.EntityFrameworkCore	785
pudo/dataset	A Python library simplifying data handling in SQL databases with features like implicit table creation and bulk loading	4,785
linkedinattic/datafu	A collection of libraries for working with large-scale data in Hadoop, providing incremental processing capabilities and user-defined functions	583