suro
Data pipeline service
A distributed data pipeline service for collecting, aggregating, and dispatching large volumes of application events.
Netflix's distributed data pipeline
794 stars
513 watching
171 forks
Language: Java
Last commit: over 1 year ago
Linked from 5 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
netflix/turbine | A Java-based system for aggregating and streaming real-time data from various sources | 835 |
netflix/servo | Provides a simple interface to expose and publish Java application metrics using JMX | 1,417 |
apache/streampipes | A toolbox for industrial data analytics and stream processing | 605 |
netflix/staash | A tool to abstract storage details and automate common data access patterns for developers working with relational technologies | 209 |
netflix/genie | An orchestration service that simplifies running big data queries by automating the configuration and execution of complex jobs | 1,716 |
linkedin/brooklin | A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale | 920 |
netflix/hollow | A high-performance in-memory dataset dissemination toolset for scalable read-only access to data from a single producer | 1,206 |
datasalt/pangool | A Java framework that simplifies Hadoop's MapReduce API to build efficient data processing pipelines | 57 |
apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 479 |
samapriya/planet-gee-pipeline-cli | A command-line tool for automating data processing and uploads from Planet's API to Google Earth Engine | 42 |
netflix/evcache | A distributed in-memory caching solution designed to store frequently used data for short-term use cases | 2,058 |
netflix-skunkworks/cloudaux | Provides a unified interface to various cloud providers | 76 |
raystack/firehose | Delivers real-time streaming data to various destinations | 325 |
gazette/core | Enables teams to build platforms mixing SQL, batch, and real-time streaming processing paradigms | 718 |
streamsets/datacollector-oss | A continuous big data ingestion platform that enables easy creation of data pipelines for various data sources and destinations | 90 |