databus
Data pipeline system
A distributed system to capture changes from primary data stores and route them through complex data pipelines.
Source-agnostic distributed change data capture system
4k stars
380 watching
735 forks
Language: Java
Last commit: about 1 year ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
datahub-project/datahub | A platform for managing and discovering data across an organization's data stack | 9,995 |
debezium/debezium | A platform that captures and streams data changes from relational databases into Kafka topics | 10,773 |
linkedin/brooklin | A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale | 924 |
linkedin/ambry | A distributed object store designed to efficiently store and serve small to large immutable objects with high availability and horizontal scalability. | 1,748 |
mit-pdos/noria | A streaming data-flow system designed to act as a fast storage backend for read-heavy web applications by caching relational query results. | 5,008 |
databendlabs/databend | An open-source cloud data warehouse written in Rust with a focus on high-performance analytics and scalable storage | 7,912 |
ha/doozerd | A distributed data store with real-time updates and high availability features. | 3,268 |
netdata/netdata | An observability platform designed to monitor and analyze systems in real-time with automated anomaly detection and root cause analysis. | 72,321 |
linkedin/cruise-control | Automates dynamic workload rebalance and self-healing of large-scale Apache Kafka clusters | 2,769 |
voldemort/voldemort | A distributed key-value storage system with data replication, partitioning, and versioning. | 2,640 |
lindb/lindb | A high-performance, distributed time series database with horizontal scalability and high availability | 3,010 |
cube-js/cube | A platform for building data applications with efficient data modeling, access control, and performance optimizations. | 18,026 |
arch/autohistory | Automatically records and tracks changes to data in databases using Microsoft.EntityFrameworkCore | 785 |
pudo/dataset | A Python library simplifying data handling in SQL databases with features like implicit table creation and bulk loading. | 4,783 |
linkedinattic/datafu | A collection of libraries for working with large-scale data in Hadoop, providing incremental processing capabilities and user-defined functions. | 583 |